101
Fraccaro P, Arguello Casteleiro M, Ainsworth J, Buchan I. Adoption of clinical decision support in multimorbidity: a systematic review. JMIR Med Inform 2015; 3:e4. [PMID: 25785897 PMCID: PMC4318680 DOI: 10.2196/medinform.3503] [Citation(s) in RCA: 58] [Impact Index Per Article: 5.8] [Received: 05/01/2014] [Revised: 09/26/2014] [Accepted: 11/08/2014] [Indexed: 11/18/2022] Open
Abstract
Background: Patients with multiple conditions have complex needs and are increasing in number as populations age. This multimorbidity is one of the greatest challenges facing health care. Having more than 1 condition generates (1) interactions between pathologies, (2) duplication of tests, (3) difficulties in adhering to often conflicting clinical practice guidelines, (4) obstacles in the continuity of care, (5) confusing self-management information, and (6) medication errors. In this context, clinical decision support (CDS) systems need to be able to handle realistic complexity and minimize iatrogenic risks.
Objective: The aim of this review was to identify to what extent CDS is adopted in multimorbidity.
Methods: This review followed PRISMA guidance and adopted a multidisciplinary approach. Scopus and PubMed searches were performed by combining terms from 3 different thesauri containing synonyms for (1) multimorbidity and comorbidity, (2) polypharmacy, and (3) CDS. The relevant articles were identified by examining the titles and abstracts. The full text of selected/relevant articles was analyzed in-depth. For articles appropriate for this review, data were collected on clinical tasks, diseases, decision maker, methods, data input context, user interface considerations, and evaluation of effectiveness.
Results: A total of 50 articles were selected for the full in-depth analysis and 20 studies were included in the final review. Medication (n=10) and clinical guidance (n=8) were the predominant clinical tasks. Four studies focused on merging concurrent clinical practice guidelines. A total of 17 articles reported their CDS systems were knowledge-based. Most articles reviewed considered patients’ clinical records (n=19), clinical practice guidelines (n=12), and clinicians’ knowledge (n=10) as contextual input data. The most frequent diseases mentioned were cardiovascular (n=9) and diabetes mellitus (n=5). In all, 12 articles mentioned generalist doctor(s) as the decision maker(s). For articles reviewed, there were no studies referring to the active involvement of the patient in the decision-making process or to patient self-management. None of the articles reviewed adopted mobile technologies. There were no rigorous evaluations of usability or effectiveness of the CDS systems reported.
Conclusions: This review shows that multimorbidity is underinvestigated in the informatics of supporting clinical decisions. CDS interventions that systematize clinical practice guidelines without considering the interactions of different conditions and care processes may lead to unhelpful or harmful clinical actions. To improve patient safety in multimorbidity, there is a need for more evidence about how both conditions and care processes interact. The data needed to build this evidence base exist in many electronic health record systems and are underused.
Affiliation(s)
- Paolo Fraccaro
- NIHR Greater Manchester Primary Care Patient Safety Translational Research Centre, Institute of Population Health, The University of Manchester, Manchester, United Kingdom.
102
Li Y, Yu H. A robust data-driven approach for gene ontology annotation. Database: The Journal of Biological Databases and Curation 2014; 2014:bau113. [PMID: 25425037 PMCID: PMC4243380 DOI: 10.1093/database/bau113] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 12/01/2022]
Abstract
Gene ontology (GO) and GO annotation are important resources for biological information management and knowledge discovery, but the speed of manual annotation has become a major bottleneck of database curation. The BioCreative IV GO annotation task aims to evaluate the performance of systems that automatically assign GO terms to genes based on the narrative sentences in biomedical literature. This article presents our work in this task as well as the experimental results after the competition. For the evidence sentence extraction subtask, we built a binary classifier to identify evidence sentences using reference distance estimator (RDE), a recently proposed semi-supervised learning method that learns new features from around 10 million unlabeled sentences, achieving an F1 of 19.3% in exact match and 32.5% in relaxed match. In the post-submission experiment, we obtained 22.1% and 35.7% F1 performance by incorporating bigram features in RDE learning. In both development and test sets, the RDE-based method achieved over 20% relative improvement on F1 and AUC performance against classical supervised learning methods, e.g. support vector machine and logistic regression. For the GO term prediction subtask, we developed an information retrieval-based method to retrieve the GO term most relevant to each evidence sentence using a ranking function that combined cosine similarity and the frequency of GO terms in documents, and a filtering method based on high-level GO classes. The best performance of our submitted runs was 7.8% F1 and 22.2% hierarchy F1. We found that the incorporation of frequency information and hierarchy filtering substantially improved the performance. In the post-submission evaluation, we obtained a 10.6% F1 using a simpler setting. Overall, the experimental analysis showed our approaches were robust in both tasks.
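The abstract above mentions a ranking function that combines cosine similarity with the frequency of GO terms in documents, but it does not give the formula. The following is a minimal sketch of one way such a combination could look, not the authors' method: the multiplicative weighting `1 + log(1 + frequency)`, the whitespace tokenizer, and the names `cosine`, `rank_go_terms`, `go_names`, and `doc_freq` are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_go_terms(sentence, go_names, doc_freq):
    """Rank GO terms for one evidence sentence, best match first.

    go_names: dict mapping GO id -> term name.
    doc_freq: dict mapping GO id -> how often the term occurs in documents
              (hypothetical counts supplied by the caller).
    The score is cosine similarity scaled by a log-damped frequency weight;
    this particular combination is an assumption, not taken from the paper.
    """
    sent_vec = Counter(sentence.lower().split())
    scored = []
    for go_id, name in go_names.items():
        sim = cosine(sent_vec, Counter(name.lower().split()))
        weight = 1.0 + math.log1p(doc_freq.get(go_id, 0))
        scored.append((go_id, sim * weight))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Called with a sentence, a dictionary of GO term names, and per-term counts, `rank_go_terms` returns `(GO id, score)` pairs in descending score order; a term sharing no words with the sentence scores zero regardless of its frequency.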
Affiliation(s)
- Yanpeng Li
- Department of Quantitative Health Sciences, University of Massachusetts Medical School, Worcester, MA, USA, Department of Computer Science, University of Massachusetts, Amherst, MA, USA and VA Central Western Massachusetts, Worcester, MA, USA
- Hong Yu
- Department of Quantitative Health Sciences, University of Massachusetts Medical School, Worcester, MA, USA, Department of Computer Science, University of Massachusetts, Amherst, MA, USA and VA Central Western Massachusetts, Worcester, MA, USA
103
Kurtz C, Depeursinge A, Napel S, Beaulieu CF, Rubin DL. On combining image-based and ontological semantic dissimilarities for medical image retrieval applications. Med Image Anal 2014; 18:1082-100. [PMID: 25036769 PMCID: PMC4173098 DOI: 10.1016/j.media.2014.06.009] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 12/05/2013] [Revised: 06/18/2014] [Accepted: 06/23/2014] [Indexed: 10/25/2022]
Abstract
Computer-assisted image retrieval applications can assist radiologists by identifying similar images in archives as a means to providing decision support. In the classical case, images are described using low-level features extracted from their contents, and an appropriate distance is used to find the best matches in the feature space. However, using low-level image features to fully capture the visual appearance of diseases is challenging and the semantic gap between these features and the high-level visual concepts in radiology may impair the system performance. To deal with this issue, the use of semantic terms to provide high-level descriptions of radiological image contents has recently been advocated. Nevertheless, most of the existing semantic image retrieval strategies are limited by two factors: they require manual annotation of the images using semantic terms and they ignore the intrinsic visual and semantic relationships between these annotations during the comparison of the images. Based on these considerations, we propose an image retrieval framework based on semantic features that relies on two main strategies: (1) automatic "soft" prediction of ontological terms that describe the image contents from multi-scale Riesz wavelets and (2) retrieval of similar images by evaluating the similarity between their annotations using a new term dissimilarity measure, which takes into account both image-based and ontological term relations. The combination of these strategies provides a means of accurately retrieving similar images in databases based on image annotations and can be considered as a potential solution to the semantic gap problem. We validated this approach in the context of the retrieval of liver lesions from computed tomographic (CT) images and annotated with semantic terms of the RadLex ontology. 
The relevance of the retrieval results was assessed using two protocols: evaluation relative to a dissimilarity reference standard defined for pairs of images on a 25-image dataset, and evaluation relative to the diagnoses of the retrieved images on a 72-image dataset. A normalized discounted cumulative gain (NDCG) score of more than 0.92 was obtained with the first protocol, while AUC scores of more than 0.77 were obtained with the second protocol. This automatic approach could provide real-time decision support to radiologists by showing them similar images with associated diagnoses and, where available, responses to therapies.
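For readers unfamiliar with the NDCG score reported above, the sketch below shows the common textbook definition: graded relevance discounted by log2 of the rank position, normalized by the score of the ideal ordering. The abstract does not say which NDCG variant or which relevance grading the authors used, so the function names and the exact discount are assumptions.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of a ranked list of relevance grades."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances):
    """DCG divided by the DCG of the ideal (descending) ordering.

    Returns 0.0 when no item has positive relevance.
    """
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0
```

A retrieval list whose items are already in descending order of relevance scores 1.0; any other ordering of the same items scores lower.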
Affiliation(s)
- Camille Kurtz
- Department of Radiology, School of Medicine, Stanford University, USA; LIPADE Laboratory (EA 2517), University Paris Descartes, France.
- Sandy Napel
- Department of Radiology, School of Medicine, Stanford University, USA.
- Daniel L Rubin
- Department of Radiology, School of Medicine, Stanford University, USA.
104
Bohland JW, Myers EM, Kim E. An informatics approach to integrating genetic and neurological data in speech and language neuroscience. Neuroinformatics 2014; 12:39-62. [PMID: 23949335 DOI: 10.1007/s12021-013-9201-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 10/26/2022]
Abstract
A number of heritable disorders impair the normal development of speech and language processes and occur in large numbers within the general population. While candidate genes and loci have been identified, the gap between genotype and phenotype is vast, limiting current understanding of the biology of normal and disordered processes. This gap exists not only in our scientific knowledge, but also in our research communities, where genetics researchers and speech, language, and cognitive scientists tend to operate independently. Here we describe a web-based, domain-specific, curated database that represents information about genotype-phenotype relations specific to speech and language disorders, as well as neuroimaging results demonstrating focal brain differences in relevant patients versus controls. Bringing these two distinct data types into a common database ( http://neurospeech.org/sldb ) is a first step toward bringing molecular level information into cognitive and computational theories of speech and language function. One bridge between these data types is provided by densely sampled profiles of gene expression in the brain, such as those provided by the Allen Brain Atlases. Here we present results from exploratory analyses of human brain gene expression profiles for genes implicated in speech and language disorders, which are annotated in our database. We then discuss how such datasets can be useful in the development of computational models that bridge levels of analysis, necessary to provide a mechanistic understanding of heritable language disorders. We further describe our general approach to information integration, discuss important caveats and considerations, and offer a specific but speculative example based on genes implicated in stuttering and basal ganglia function in speech motor control.
Affiliation(s)
- Jason W Bohland
- Departments of Health Sciences and Speech, Language, and Hearing Sciences, Boston University, 635 Commonwealth Ave, Room 403, Boston, MA, 02215, USA
105
Fajardo-Ortiz D, Duran L, Moreno L, Ochoa H, Castaño VM. Mapping knowledge translation and innovation processes in Cancer Drug Development: the case of liposomal doxorubicin. J Transl Med 2014; 12:227. [PMID: 25182125 PMCID: PMC4161884 DOI: 10.1186/s12967-014-0227-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 05/13/2014] [Accepted: 08/07/2014] [Indexed: 11/18/2022] Open
Abstract
We explored how the knowledge translation and innovation processes are structured when they result in innovations, as in the case of liposomal doxorubicin research. To map the processes, a literature network analysis was performed with Cytoscape and a semantic analysis was performed with GOPubmed, which is based on the controlled vocabularies MeSH (Medical Subject Headings) and GO (Gene Ontology). We found clusters related to different stages of the technological development (invention, innovation and imitation) and the knowledge translation process (preclinical, translational and clinical research), and we were able to map the historic emergence of Doxil as a paradigmatic nanodrug. This research could be a powerful methodological tool for decision-making and innovation management in drug delivery research.
Affiliation(s)
- Victor M Castaño
- Centro de Fisica Aplicada y Tecnologia Avanzada, Universidad Nacional Autonoma de Mexico, Queretaro, Mexico.
106
Tsafnat G, Glasziou P, Choong MK, Dunn A, Galgani F, Coiera E. Systematic review automation technologies. Syst Rev 2014; 3:74. [PMID: 25005128 PMCID: PMC4100748 DOI: 10.1186/2046-4053-3-74] [Citation(s) in RCA: 203] [Impact Index Per Article: 18.5] [Received: 03/12/2014] [Accepted: 06/26/2014] [Indexed: 02/08/2023] Open
Abstract
Systematic reviews, a cornerstone of evidence-based medicine, are not produced quickly enough to support clinical practice. The cost of production, availability of the requisite expertise and timeliness are often quoted as major contributors to the delay. This detailed survey of the state of the art of information systems designed to support or automate individual tasks in the systematic review, and in particular systematic reviews of randomized controlled clinical trials, reveals trends that see the convergence of several parallel research projects. We surveyed literature describing informatics systems that support or automate the processes of systematic review or each of the tasks of the systematic review. Several projects focus on automating, simplifying and/or streamlining specific tasks of the systematic review. Some tasks are already fully automated while others are still largely manual. In this review, we describe each task and the effect that its automation would have on the entire systematic review process, summarize the existing information system support for each task, and highlight where further research is needed for realizing automation for the task. Integration of the systems that automate systematic review tasks may lead to a revised systematic review workflow. We envisage the optimized workflow will lead to a system in which each systematic review is described as a computer program that automatically retrieves relevant trials, appraises them, extracts and synthesizes data, evaluates the risk of bias, performs meta-analysis calculations, and produces a report in real time.
Affiliation(s)
- Guy Tsafnat
- Centre for Health Informatics, Australian Institute of Health Innovation, University of New South Wales, Sydney, Australia
- Paul Glasziou
- Centre for Research on Evidence Based Practice, Bond University, Gold Coast, Australia
- Miew Keen Choong
- Centre for Health Informatics, Australian Institute of Health Innovation, University of New South Wales, Sydney, Australia
- Adam Dunn
- Centre for Health Informatics, Australian Institute of Health Innovation, University of New South Wales, Sydney, Australia
- Filippo Galgani
- Centre for Health Informatics, Australian Institute of Health Innovation, University of New South Wales, Sydney, Australia
- Enrico Coiera
- Centre for Health Informatics, Australian Institute of Health Innovation, University of New South Wales, Sydney, Australia
107
Human symptoms-disease network. Nat Commun 2014; 5:4212. [PMID: 24967666 DOI: 10.1038/ncomms5212] [Citation(s) in RCA: 350] [Impact Index Per Article: 31.8] [Received: 11/07/2013] [Accepted: 05/27/2014] [Indexed: 12/19/2022] Open
Abstract
In the post-genomic era, the elucidation of the relationship between the molecular origins of diseases and their resulting phenotypes is a crucial task for medical research. Here, we use a large-scale biomedical literature database to construct a symptom-based human disease network and investigate the connection between clinical manifestations of diseases and their underlying molecular interactions. We find that the symptom-based similarity of two diseases correlates strongly with the number of shared genetic associations and the extent to which their associated proteins interact. Moreover, the diversity of the clinical manifestations of a disease can be related to the connectivity patterns of the underlying protein interaction network. The comprehensive, high-quality map of disease-symptom relations can further be used as a resource helping to address important questions in the field of systems medicine, for example, the identification of unexpected associations between diseases, disease etiology research or drug design.
108
Cheng L, Li J, Ju P, Peng J, Wang Y. SemFunSim: a new method for measuring disease similarity by integrating semantic and gene functional association. PLoS One 2014; 9:e99415. [PMID: 24932637 PMCID: PMC4059643 DOI: 10.1371/journal.pone.0099415] [Citation(s) in RCA: 83] [Impact Index Per Article: 7.5] [Received: 12/16/2013] [Accepted: 05/14/2014] [Indexed: 01/20/2023] Open
Abstract
Background: Measuring similarity between diseases plays an important role in disease-related molecular function research. Functional associations between disease-related genes and semantic associations between diseases are often used to identify pairs of similar diseases from different perspectives. Currently, it is still a challenge to exploit both of them to calculate disease similarity. Therefore, a new method (SemFunSim) that integrates semantic and functional association is proposed to address the issue.
Methods: SemFunSim is designed as follows. First, FunSim (Functional Similarity) is proposed to calculate disease similarity using disease-related gene sets in a weighted network of human gene function. Next, SemSim (Semantic Similarity) is devised to calculate disease similarity using the relationship between two diseases from Disease Ontology. Finally, FunSim and SemSim are integrated to measure disease similarity.
Results: The high average AUC (area under the receiver operating characteristic curve) (96.37%) shows that SemFunSim achieves a high true positive rate and a low false positive rate. 79 of the top 100 pairs of similar diseases identified by SemFunSim are annotated in the Comparative Toxicogenomics Database (CTD) as being targeted by the same therapeutic compounds, while other methods we compared could identify 35 or fewer such pairs among the top 100. Moreover, when using our method on diseases without annotated compounds in CTD, we could confirm many of our predicted candidate compounds from literature. This indicates that SemFunSim is an effective method for drug repositioning.
Affiliation(s)
- Liang Cheng
- Center for Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Jie Li
- Center for Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Peng Ju
- School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, Singapore
- Jiajie Peng
- Center for Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Yadong Wang
- Center for Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
109
Gavel Y, Andersson PO. Multilingual query expansion in the SveMed+ bibliographic database: A case study. J Inf Sci 2014. [DOI: 10.1177/0165551514524685] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/17/2022]
Abstract
SveMed+ is a bibliographic database covering Scandinavian medical journals. It is produced by the University Library of Karolinska Institutet in Sweden. The bibliographic references are indexed with terms from the Medical Subject Headings (MeSH) thesaurus. The MeSH has been translated into several languages, including Swedish, making it suitable as the basis for multilingual tools in the medical field. The data structure of SveMed+ closely mimics that of PubMed/MEDLINE. Users of PubMed/MEDLINE and similar databases typically expect retrieval features that are not readily available off-the-shelf. The SveMed+ interface is based on a free text search engine (Solr) and a relational database management system (Microsoft SQL Server) containing the bibliographic database and a multilingual thesaurus database. The thesaurus database contains medical terms in three different languages and information about relationships between the terms. A combined approach involving the Solr free text index, the bibliographic database and the thesaurus database allowed the implementation of functionality such as automatic multilingual query expansion, faceting and hierarchical explode searches. The present paper describes how this was done in practice.
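The multilingual query expansion and hierarchical explode search described above can be illustrated with a small in-memory sketch. This is not the SveMed+ implementation, which the abstract says combines Solr with a Microsoft SQL Server thesaurus database in three languages; here a plain dictionary stands in for the thesaurus, only two languages are shown, and the concept ids (`C1`, `C2`), sample labels, and function names are hypothetical.

```python
# Toy thesaurus: concept id -> labels per language and narrower concepts.
# Ids and sample records are illustrative, not real thesaurus data.
THESAURUS = {
    "C1": {"labels": {"en": "Neoplasms", "sv": "Tumörer"}, "children": ["C2"]},
    "C2": {"labels": {"en": "Breast Neoplasms", "sv": "Brösttumörer"}, "children": []},
}


def find_concept(term):
    """Return the id of the concept with a label matching term in any language."""
    needle = term.casefold()
    for concept_id, concept in THESAURUS.items():
        if any(label.casefold() == needle for label in concept["labels"].values()):
            return concept_id
    return None


def explode(concept_id):
    """Return the concept id followed by the ids of all its descendants."""
    found = [concept_id]
    for child in THESAURUS[concept_id]["children"]:
        found.extend(explode(child))
    return found


def expand_query(term, explode_tree=True):
    """Expand a search term to every language label of the matched concept,
    optionally including narrower concepts. Unknown terms pass through as-is."""
    concept_id = find_concept(term)
    if concept_id is None:
        return [term]
    ids = explode(concept_id) if explode_tree else [concept_id]
    labels = {label for i in ids for label in THESAURUS[i]["labels"].values()}
    return sorted(labels)


def to_or_query(labels):
    """Join the expanded labels into a quoted OR expression for a text index."""
    return " OR ".join(f'"{label}"' for label in labels)
```

With this toy data, a search for the Swedish label of the broader concept expands to the English and Swedish labels of both the concept and its narrower term, which is the behavior the abstract calls automatic multilingual query expansion with explode.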
Affiliation(s)
- Ylva Gavel
- Karolinska Institutet University Library, Stockholm, Sweden
110
Field N, Cohen T, Struelens MJ, Palm D, Cookson B, Glynn JR, Gallo V, Ramsay M, Sonnenberg P, MacCannell D, Charlett A, Egger M, Green J, Vineis P, Abubakar I. Strengthening the Reporting of Molecular Epidemiology for Infectious Diseases (STROME-ID): an extension of the STROBE statement. The Lancet Infectious Diseases 2014; 14:341-52. [DOI: 10.1016/s1473-3099(13)70324-4] [Citation(s) in RCA: 156] [Impact Index Per Article: 14.2] [Indexed: 02/06/2023]
111
Tsafnat G, Jasch D, Misra A, Choong MK, Lin FPY, Coiera E. Gene-disease association with literature based enrichment. J Biomed Inform 2014; 49:221-6. [PMID: 24681202 DOI: 10.1016/j.jbi.2014.03.007] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 08/08/2013] [Revised: 02/09/2014] [Accepted: 03/02/2014] [Indexed: 10/25/2022]
Abstract
Motivation: Gene set enrichment analysis (GSEA) annotates gene microarray data with functional information from the biomedical literature to improve gene-disease association prediction. We hypothesize that supplementing GSEA with comprehensive gene function catalogs built automatically using information extracted from the scientific literature will significantly enhance GSEA prediction quality.
Methods: Gold standard gene sets for breast cancer (BrCa) and colorectal cancer (CRC) were derived from the literature. Two gene function catalogs (CMeSH and CUMLS) were automatically generated by (1) using Entrez Gene to associate all recorded human genes with PubMed article IDs and (2) associating the genes mentioned in each PubMed article with the article's MeSH terms (in CMeSH) and extracted UMLS concepts (in CUMLS). Microarray data from the Gene Expression Omnibus for BrCa and CRC was then annotated using CMeSH and CUMLS and, for comparison, also with several pre-existing catalogs (C2, C4 and C5 from the Molecular Signatures Database). Ranking was done using a standard GSEA implementation (GSEA-p). Gene function predictions for enriched array data were evaluated against the gold standard by measuring area under the receiver operating characteristic curve (AUC).
Results: Comparison of ranking using the literature enrichment catalogs, the pre-existing catalogs as well as five randomly generated catalogs shows the literature-derived enrichment catalogs are more effective. The AUC for BrCa using the unenriched gene expression dataset was 0.43, increasing to 0.89 after gene set enrichment with CUMLS. The AUC for CRC using the unenriched gene expression dataset was 0.54, increasing to 0.9 after enrichment with CMeSH. C2 increased AUC (BrCa 0.76, CRC 0.71) but C4 and C5 performed poorly (between 0.35 and 0.5). The randomly generated catalogs also performed poorly, equivalent to random guessing.
Discussion: Gene set enrichment significantly improved prediction of gene-disease association. Selection of enrichment catalog had a substantial effect on prediction accuracy. The literature-based catalogs performed better than the MSigDB catalogs, possibly because they are more recent. Catalogs generated automatically from the literature can be kept up to date.
Conclusion: Prediction of gene-disease association is a fundamental task in biomedical research. GSEA provides a promising method when using literature-based enrichment catalogs.
Availability: The literature-based catalogs generated and used in this study are available from http://www2.chi.unsw.edu.au/literature-enrichment.
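Several entries in this list, including the one above, evaluate a ranking against a gold standard by the area under the ROC curve (AUC). As a reminder of what that number measures, the sketch below computes AUC through its rank interpretation: the probability that a randomly chosen gold-standard gene is scored higher than a randomly chosen non-gold-standard gene, with ties counted as one half. The function name and input layout are assumptions, and this is not the evaluation code used in the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted scores, higher meaning more likely positive.
    labels: truthy for gold-standard positives, falsy for negatives.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    # Count positive-negative pairs ranked correctly; ties count as half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Under this definition, 0.5 corresponds to random guessing, which matches how the abstract describes the randomly generated catalogs, and values below 0.5 indicate a ranking that tends to place negatives above positives.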
Affiliation(s)
- Guy Tsafnat
- Centre for Health Informatics, University of New South Wales, Sydney, Australia.
- Dennis Jasch
- Centre for Health Informatics, University of New South Wales, Sydney, Australia
- Agam Misra
- Centre for Health Informatics, University of New South Wales, Sydney, Australia
- Miew Keen Choong
- Centre for Health Informatics, University of New South Wales, Sydney, Australia
- Frank P-Y Lin
- Centre for Health Informatics, University of New South Wales, Sydney, Australia
- Enrico Coiera
- Centre for Health Informatics, University of New South Wales, Sydney, Australia
112
Li L, Zhang P, Zheng T, Zhang H, Jiang Z, Huang D. Integrating semantic information into multiple kernels for protein-protein interaction extraction from biomedical literatures. PLoS One 2014; 9:e91898. [PMID: 24622773 PMCID: PMC3951470 DOI: 10.1371/journal.pone.0091898] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 09/06/2013] [Accepted: 01/24/2014] [Indexed: 11/19/2022] Open
Abstract
Protein-Protein Interaction (PPI) extraction is an important task in biomedical information extraction. Presently, many machine learning methods for PPI extraction have achieved promising results. However, the performance is still not satisfactory. One reason is that semantic resources have largely been ignored. In this paper, we propose a multiple-kernel learning-based approach to extract PPIs, combining the feature-based kernel, tree kernel and semantic kernel. In particular, we extend the shortest path-enclosed tree kernel (SPT) with a dynamic extension strategy to capture richer syntactic information. Our semantic kernel calculates the protein-protein pair similarity and the context similarity based on two semantic resources: WordNet and Medical Subject Headings (MeSH). We evaluate our method with Support Vector Machine (SVM) and achieve an F-score of 69.40% and an AUC of 92.00%, which shows that our method outperforms most of the state-of-the-art systems by integrating semantic information.
Affiliation(s)
- Lishuang Li
- School of Computer Science and Technology, Dalian University of Technology, Dalian, China
- Panpan Zhang
- School of Computer Science and Technology, Dalian University of Technology, Dalian, China
- Tianfu Zheng
- Faculty of Chemical, Environmental and Biological Science and Technology, Dalian University of Technology, Dalian, China
- Hongying Zhang
- Department of Pathology, Dalian Medical University, Dalian, China
- Zhenchao Jiang
- School of Computer Science and Technology, Dalian University of Technology, Dalian, China
- Degen Huang
- School of Computer Science and Technology, Dalian University of Technology, Dalian, China
113
Quantitative imaging biomarker ontology (QIBO) for knowledge representation of biomedical imaging biomarkers. J Digit Imaging 2014; 26:630-41. [PMID: 23589184 DOI: 10.1007/s10278-013-9599-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 12/12/2022] Open
Abstract
A widening array of novel imaging biomarkers is being developed using ever more powerful clinical and preclinical imaging modalities. These biomarkers have demonstrated effectiveness in quantifying biological processes as they occur in vivo and in the early prediction of therapeutic outcomes. However, quantitative imaging biomarker data and knowledge are not standardized, representing a critical barrier to accumulating medical knowledge based on quantitative imaging data. We use an ontology to represent, integrate, and harmonize heterogeneous knowledge across the domain of imaging biomarkers. This advances the goal of developing applications to (1) improve precision and recall of storage and retrieval of quantitative imaging-related data using standardized terminology; (2) streamline the discovery and development of novel imaging biomarkers by normalizing knowledge across heterogeneous resources; (3) effectively annotate imaging experiments thus aiding comprehension, re-use, and reproducibility; and (4) provide validation frameworks through rigorous specification as a basis for testable hypotheses and compliance tests. We have developed the Quantitative Imaging Biomarker Ontology (QIBO), which currently consists of 488 terms spanning the following upper classes: experimental subject, biological intervention, imaging agent, imaging instrument, image post-processing algorithm, biological target, indicated biology, and biomarker application. We have demonstrated that QIBO can be used to annotate imaging experiments with standardized terms in the ontology and to generate hypotheses for novel imaging biomarker-disease associations. Our results established the utility of QIBO in enabling integrated analysis of quantitative imaging data.
|
114
|
Turinsky AL, Razick S, Turner B, Donaldson IM, Wodak SJ. Navigating the global protein-protein interaction landscape using iRefWeb. Methods Mol Biol 2014; 1091:315-31. [PMID: 24203342 DOI: 10.1007/978-1-62703-691-7_22] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Indexed: 12/24/2022]
Abstract
iRefWeb is a bioinformatics resource that offers access to a large collection of data on protein-protein interactions in over a thousand organisms. This collection is consolidated from 14 major public databases that curate the scientific literature. The collection is enhanced with a range of versatile data filters and search options that categorize various types of protein-protein interactions and protein complexes. Users of iRefWeb are able to retrieve all curated interactions for a given organism or those involving a given protein (or a list of proteins), narrow down their search results based on different supporting evidence, and assess the reliability of these interactions using various criteria. They may also examine all data and annotations related to any publication that described the interaction-detection experiments. iRefWeb is freely available to the research community worldwide at http://wodaklab.org/iRefWeb.
Affiliation(s)
- Andrei L Turinsky
- Molecular Structure and Function program, Hospital for Sick Children, Toronto, ON, Canada
|
115
|
Budovec JJ, Lam CA, Kahn CE. Informatics in Radiology: Radiology Gamuts Ontology: Differential Diagnosis for the Semantic Web. Radiographics 2014; 34:254-64. [DOI: 10.1148/rg.341135036] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Indexed: 11/11/2022]
Affiliation(s)
- Joseph J Budovec
- From the Department of Radiology, Medical College of Wisconsin, 9200 W Wisconsin Ave, Milwaukee, WI 53226
|
116
|
Pombo N, Araújo P, Viana J. Knowledge discovery in clinical decision support systems for pain management: a systematic review. Artif Intell Med 2013; 60:1-11. [PMID: 24370382 DOI: 10.1016/j.artmed.2013.11.005] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Received: 03/11/2013] [Revised: 11/18/2013] [Accepted: 11/29/2013] [Indexed: 11/18/2022]
Abstract
OBJECTIVE The occurrence of pain accounts for billions of dollars in annual medical expenditures; loss of quality of life and decreased worker productivity contribute to indirect costs. As pain is highly subjective, clinical decision support systems (CDSSs) can be critical for improving the accuracy of pain assessment and offering better support for clinical decision-making. This review is focused on computer technologies for pain management that allow CDSSs to obtain knowledge from the clinical data produced by either patients or health care professionals. METHODS AND MATERIALS A comprehensive literature search was conducted in several electronic databases to identify relevant articles focused on computerised systems that constituted CDSSs and include data or results related to pain symptoms from patients with acute or chronic pain, published between 1992 and 2011 in the English language. In total, thirty-nine studies were analysed; thirty-two were selected from 1245 citations, and seven were obtained from reference tracking. RESULTS The results highlighted the following clusters of computer technologies: rule-based algorithms, artificial neural networks, nonstandard set theory, and statistical learning algorithms. In addition, several methodologies were found for content processing such as terminologies, questionnaires, and scores. The median accuracy ranged from 53% to 87.5%. CONCLUSIONS Computer technologies that have been applied in CDSSs are important but not determinant in improving the systems' accuracy and the clinical practice, as evidenced by the moderate correlation among the studies. However, these systems play an important role in the design of computerised systems oriented to a patient's symptoms as is required for pain management. Several limitations related to CDSSs were observed: the lack of integration with mobile devices, the reduced use of web-based interfaces, and scarce capabilities for data to be inserted by patients.
Affiliation(s)
- Nuno Pombo
- Department of Informatics, University of Beira Interior, Rua Marquês de Ávila e Bolama, 6201-001 Covilhã, Portugal.
- Pedro Araújo
- Instituto de Telecomunicações and Department of Informatics, University of Beira Interior, Rua Marquês de Ávila e Bolama, 6201-001 Covilhã, Portugal
- Joaquim Viana
- Faculty of Health Sciences, University of Beira Interior, Av. Infante D. Henrique, 6200-506 Covilhã, Portugal
|
117
|
de la Iglesia D, Cachau RE, García-Remesal M, Maojo V. Nanoinformatics knowledge infrastructures: bringing efficient information management to nanomedical research. Computational Science & Discovery 2013; 6:014011. [PMID: 24932210 PMCID: PMC4053539 DOI: 10.1088/1749-4699/6/1/014011] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Indexed: 01/02/2023]
Abstract
Nanotechnology represents an area of particular promise and significant opportunity across multiple scientific disciplines. Ongoing nanotechnology research ranges from the characterization of nanoparticles and nanomaterials to the analysis and processing of experimental data seeking correlations between nanoparticles and their functionalities and side effects. Due to their special properties, nanoparticles are suitable for cellular-level diagnostics and therapy, offering numerous applications in medicine, e.g. development of biomedical devices, tissue repair, drug delivery systems and biosensors. In nanomedicine, recent studies are producing large amounts of structural and property data, highlighting the role for computational approaches in information management. While in vitro and in vivo assays are expensive, the cost of computing is falling. Furthermore, improvements in the accuracy of computational methods (e.g. data mining, knowledge discovery, modeling and simulation) have enabled effective tools to automate the extraction, management and storage of these vast data volumes. Since this information is widely distributed, one major issue is how to locate and access data where it resides (which also poses data-sharing limitations). The novel discipline of nanoinformatics addresses the information challenges related to nanotechnology research. In this paper, we summarize the needs and challenges in the field and present an overview of extant initiatives and efforts.
Affiliation(s)
- D de la Iglesia
- Biomedical Informatics Group, Dept. Inteligencia Artificial, Facultad de Informatica, Universidad Politecnica de Madrid, 28660, Boadilla del Monte, Madrid, Spain
- R E Cachau
- Advanced Biomedical Computing Center, National Cancer Institute, SAIC-Frederick Inc., Frederick National Laboratory for Cancer Research, Frederick, MD 21702, USA
- M García-Remesal
- Biomedical Informatics Group, Dept. Inteligencia Artificial, Facultad de Informatica, Universidad Politecnica de Madrid, 28660, Boadilla del Monte, Madrid, Spain
- V Maojo
- Biomedical Informatics Group, Dept. Inteligencia Artificial, Facultad de Informatica, Universidad Politecnica de Madrid, 28660, Boadilla del Monte, Madrid, Spain
|
118
|
Cheng L, Wang G, Li J, Zhang T, Xu P, Wang Y. SIDD: a semantically integrated database towards a global view of human disease. PLoS One 2013; 8:e75504. [PMID: 24146757 PMCID: PMC3795748 DOI: 10.1371/journal.pone.0075504] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.3] [Received: 04/21/2013] [Accepted: 08/15/2013] [Indexed: 01/08/2023] Open
Abstract
Background A number of databases have been developed to collect disease-related molecular, phenotypic and environmental features (DR-MPEs), such as genes, non-coding RNAs, genetic variations, drugs, phenotypes and environmental factors. However, each of the current databases focuses on only one or two DR-MPEs. There is an urgent need for an integrated database that can establish semantic associations among disease-related databases and link them to provide a global view of human disease at the biological level. This database, once developed, will help researchers query various DR-MPEs by disease and investigate disease mechanisms from different types of data. Methodology To establish an integrated disease-associated database, disease vocabularies used in different databases are mapped to Disease Ontology (DO) through semantic match. 4,284 and 4,186 disease terms from Medical Subject Headings (MeSH) and Online Mendelian Inheritance in Man (OMIM) respectively are mapped to DO. Then, the relationships between DR-MPEs and diseases are extracted and merged from different source databases to reduce data redundancy. Conclusions A semantically integrated disease-associated database (SIDD) is developed, which integrates 18 disease-associated databases, for researchers to browse multiple types of DR-MPEs in a single view. A web interface allows easy navigation for querying information through browsing a disease ontology tree or searching a disease term. Furthermore, a network visualization tool using the Cytoscape Web plugin has been implemented in SIDD. It makes SIDD easier to use when viewing the relationships between diseases and DR-MPEs. The current version of SIDD (Jul 2013) documents 4,465,131 entries relating to 139,365 DR-MPEs, and to 3,824 human diseases. The database can be freely accessed from: http://mlg.hit.edu.cn/SIDD.
Affiliation(s)
- Liang Cheng
- Center for Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Guohua Wang
- Center for Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Jie Li
- Center for Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Tianjiao Zhang
- Center for Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Peigang Xu
- Center for Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Yadong Wang
- Center for Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
|
119
|
Rende D, Baysal N, Kirdar B. Complex disease interventions from a network model for type 2 diabetes. PLoS One 2013; 8:e65854. [PMID: 23776558 PMCID: PMC3679160 DOI: 10.1371/journal.pone.0065854] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 09/12/2012] [Accepted: 05/02/2013] [Indexed: 12/20/2022] Open
Abstract
There is accumulating evidence that the proteins encoded by the genes associated with a common disorder interact with each other, participate in similar pathways and share GO terms. It has been anticipated that the functional modules in a disease related functional linkage network are informative to reveal significant metabolic processes and disease's associations with other complex disorders. In the current study, Type 2 diabetes associated functional linkage network (T2DFN) containing 2770 proteins and 15041 linkages was constructed. The functional modules in this network were scored and evaluated in terms of shared pathways, co-localization, co-expression and associations with similar diseases. The assembly of top scoring overlapping members in the functional modules revealed that, along with the well known biological pathways, circadian rhythm, diverse actions of nuclear receptors in steroid and retinoic acid metabolisms have significant occurrence in the pathophysiology of the disease. The disease's association with other metabolic and neuromuscular disorders was established through shared proteins. Nuclear receptor NRIP1 has a pivotal role in lipid and carbohydrate metabolism, indicating the need to investigate subsequent effects of NRIP1 on Type 2 diabetes. Our study also revealed that CREB binding protein (CREBBP) and cardiotrophin-1 (CTF1) have suggestive roles in linking Type 2 diabetes and neuromuscular diseases.
Affiliation(s)
- Deniz Rende
- Department of Materials Science and Engineering, Rensselaer Polytechnic Institute, Troy, New York, United States of America.
|
120
|
Zhu J, Qin Y, Liu T, Wang J, Zheng X. Prioritization of candidate disease genes by topological similarity between disease and protein diffusion profiles. BMC Bioinformatics 2013; 14 Suppl 5:S5. [PMID: 23734762 PMCID: PMC3622672 DOI: 10.1186/1471-2105-14-s5-s5] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Identification of gene-phenotype relationships is a fundamental challenge in human health. Based on the observation that genes causing the same or similar phenotypes tend to correlate with each other in the protein-protein interaction network, many network-based approaches have been proposed based on different underlying models. A recent comparative study showed that diffusion-based methods achieve state-of-the-art predictive performance. RESULTS In this paper, a new diffusion-based method was proposed to prioritize candidate disease genes. The diffusion profile of a disease was defined as the stationary distribution of candidate genes given a random walk with restart where similarities between phenotypes are incorporated. Then, candidate disease genes are prioritized by comparing their diffusion profiles with that of the disease. Finally, the effectiveness of our method was demonstrated through leave-one-out cross-validation against control genes from artificial linkage intervals and randomly chosen genes. A comparative study showed that our method achieves improved performance compared to some classical diffusion-based methods. To further illustrate our method, we used our algorithm to predict new causal genes of 16 multifactorial diseases including Prostate cancer and Alzheimer's disease, and the top predictions were in good agreement with literature reports. CONCLUSIONS Our study indicates that integration of multiple information sources, especially the phenotype similarity profile data, and the introduction of a global similarity measure between disease and gene diffusion profiles are helpful for prioritizing candidate disease genes. AVAILABILITY Programs and data are available upon request.
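The diffusion-profile idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy network, the restart probability of 0.3, the use of cosine similarity as the profile comparison, and the omission of phenotype similarities are all assumptions made for illustration, and every function name here is hypothetical.

```python
import math

def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-12, max_iter=10_000):
    """Stationary visiting probabilities of a walker that returns to the
    seed distribution with probability `restart` at every step."""
    n = len(adj)
    # Column-normalise: trans[i][j] is the probability of stepping from j to i.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    trans = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
             for i in range(n)]
    total = sum(seeds.values())
    p0 = [seeds.get(i, 0.0) / total for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1 - restart) * sum(trans[i][j] * p[j] for j in range(n))
               + restart * p0[i] for i in range(n)]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p

def cosine(u, v):
    """Cosine similarity between two equally long profiles."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Hypothetical toy interaction network: a path 0 - 1 - 2 - 3 - 4.
adj = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]

# Disease profile: walk seeded at the (assumed) known disease gene 0.
disease_profile = random_walk_with_restart(adj, {0: 1.0})

# Each candidate gets its own profile, then is scored against the disease profile.
candidates = [1, 4]
scores = {g: cosine(random_walk_with_restart(adj, {g: 1.0}), disease_profile)
          for g in candidates}
ranking = sorted(candidates, key=scores.get, reverse=True)
```

In this toy case the candidate adjacent to the seed gene ranks above the one at the far end of the path, which is the behaviour a profile-similarity ranking is meant to capture.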
Affiliation(s)
- Jie Zhu
- Department of Mathematics, Shanghai Normal University, Shanghai, China
|
121
|
Gurulingappa H, Mudi A, Toldo L, Hofmann-Apitius M, Bhate J. Challenges in mining the literature for chemical information. RSC Adv 2013. [DOI: 10.1039/c3ra40787j] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Indexed: 11/21/2022] Open
|
122
|
Li Y, Li J. Disease gene identification by random walk on multigraphs merging heterogeneous genomic and phenotype data. BMC Genomics 2012; 13 Suppl 7:S27. [PMID: 23282070 PMCID: PMC3521411 DOI: 10.1186/1471-2164-13-s7-s27] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.5] [Indexed: 01/26/2023] Open
Abstract
BACKGROUND High-throughput experiments have resulted in many genomic datasets and hundreds of candidate disease genes. To discover the real disease genes from a set of candidate genes, computational methods have been proposed that work on various types of genomic data sources. As a single source of genomic data is prone to bias, incompleteness and noise, integration of different genomic data sources is needed to accomplish reliable disease gene identification. RESULTS In contrast to the commonly adopted data integration approach, which integrates separate lists of candidate genes derived from each single data source, we merge various genomic networks into a multigraph, which can hold multiple edges between a pair of nodes. This novel approach provides a data platform with strong noise tolerance to prioritize the disease genes. A new idea of random walk is then developed to work on multigraphs using a modified step to calculate the transition matrix. Our method is further enhanced to deal with heterogeneous data types by allowing cross-walk between phenotype and gene networks. Compared on benchmark datasets, our method is shown to be more accurate than the state-of-the-art methods in disease gene identification. We also conducted a case study to identify disease genes for Insulin-Dependent Diabetes Mellitus. Some of the newly identified disease genes are supported by recently published literature. CONCLUSIONS The proposed RWRM (Random Walk with Restart on Multigraphs) model and CHN (Complex Heterogeneous Network) model are effective in data integration for candidate gene prioritization.
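As a rough illustration of merging networks into a multigraph, the sketch below counts parallel edges contributed by different sources and derives step probabilities from those counts. The abstract does not spell out the paper's "modified step", so weighting each move by edge multiplicity is an assumption, and the gene labels, edge lists and function names are hypothetical.

```python
from collections import Counter

def build_multigraph(*edge_lists):
    """Merge undirected edge lists; an edge reported by k sources gets multiplicity k."""
    multiplicity = Counter()
    for edges in edge_lists:
        for u, v in edges:
            multiplicity[tuple(sorted((u, v)))] += 1
    return multiplicity

def transition_probabilities(multiplicity):
    """P(u -> v) proportional to the number of parallel edges between u and v."""
    degree = Counter()
    for (u, v), m in multiplicity.items():
        degree[u] += m
        degree[v] += m
    probs = {}
    for (u, v), m in multiplicity.items():
        probs[(u, v)] = m / degree[u]
        probs[(v, u)] = m / degree[v]
    return probs

# Two hypothetical source networks over genes A, B and C.
ppi_edges = [("A", "B"), ("B", "C")]
coexpression_edges = [("A", "B"), ("A", "C")]

multigraph = build_multigraph(ppi_edges, coexpression_edges)
probs = transition_probabilities(multigraph)
```

Here the A-B link is supported by both sources, so a walker leaving A moves to B with probability 2/3 and to C with probability 1/3. Keeping the parallel edges, rather than collapsing them to one, is what lets agreement between sources raise a link's influence.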
Affiliation(s)
- Yongjin Li
- Center for Systems Biology, University of Texas at Dallas, USA.
|
123
|
Abstract
Clustering textual contents is an important step in mining useful information on the web or other text-based resources. The common task in text clustering is to handle text in a multi-dimensional space, and to partition documents into groups, where each group contains documents that are similar to each other. However, this strategy lacks a comprehensive view for humans in general since it cannot explain the main subject of each cluster. Utilizing semantic information can solve this problem, but it needs a well-defined ontology or pre-labeled gold standard set. In this paper, we present a thematic clustering algorithm for text documents. Given text, subject terms are extracted and used for clustering documents in a probabilistic framework. An EM approach is used to ensure documents are assigned to correct subjects, hence it converges to a locally optimal solution. The proposed method is distinctive because its results are sufficiently explanatory for human understanding as well as efficient for clustering performance. The experimental results show that the proposed method provides a competitive performance compared to other state-of-the-art approaches. We also show that the extracted themes from the MEDLINE® dataset represent the subjects of clusters reasonably well.
Affiliation(s)
- Sun Kim
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
|
124
|
González-Alcaide G, Castelló-Cogollos L, Castellano-Gómez M, Agullo-Calatayud V, Aleixandre-Benavent R, Alvarez FJ, Valderrama-Zurián JC. Scientific publications and research groups on alcohol consumption and related problems worldwide: authorship analysis of papers indexed in PubMed and Scopus databases (2005 to 2009). Alcohol Clin Exp Res 2012; 37 Suppl 1:E381-93. [PMID: 22974198 DOI: 10.1111/j.1530-0277.2012.01934.x] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 09/06/2011] [Accepted: 06/13/2012] [Indexed: 11/29/2022]
Abstract
BACKGROUND The research of alcohol consumption-related problems is a multidisciplinary field. The aim of this study is to analyze the worldwide scientific production in the area of alcohol-drinking and alcohol-related problems from 2005 to 2009. METHODS A MEDLINE and Scopus search on alcohol (alcohol-drinking and alcohol-related problems) published from 2005 to 2009 was carried out. Using bibliometric indicators, the distribution of the publications was determined within the journals that publish said articles, specialty of the journal (broad subject terms), article type, language of the publication, and country where the journal is published. Also, authorship characteristics were assessed (collaboration index and number of authors who have published more than 9 documents). The existing research groups were also determined. RESULTS About 24,100 documents on alcohol, published in 3,862 journals, and authored by 69,640 authors were retrieved from MEDLINE and Scopus between the years 2005 and 2009. The collaboration index of the articles was 4.83 ± 3.7. The number of consolidated research groups in the field was identified as 383, with 1,933 authors. Documents on alcohol were published mainly in journals covering the field of "Substance-Related Disorders," 23.18%, followed by "Medicine," 8.7%, "Psychiatry," 6.17%, and "Gastroenterology," 5.25%. CONCLUSIONS Research on alcohol is a consolidated field, with an average of 4,820 documents published each year between 2005 and 2009 in MEDLINE and Scopus. Alcohol-related publications have a marked multidisciplinary nature. Collaboration was common among alcohol researchers. There is an underrepresentation of alcohol-related publications in languages other than English and from developing countries, in MEDLINE and Scopus databases.
|
125
|
Doncheva NT, Kacprowski T, Albrecht M. Recent approaches to the prioritization of candidate disease genes. Wiley Interdisciplinary Reviews: Systems Biology and Medicine 2012; 4:429-42. [PMID: 22689539 DOI: 10.1002/wsbm.1177] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.5] [Indexed: 11/10/2022]
Abstract
Many efforts are still devoted to the discovery of genes involved with specific phenotypes, in particular, diseases. High-throughput techniques are thus applied frequently to detect dozens or even hundreds of candidate genes. However, the experimental validation of many candidates is often an expensive and time-consuming task. Therefore, a great variety of computational approaches has been developed to support the identification of the most promising candidates for follow-up studies. The biomedical knowledge already available about the disease of interest and related genes is commonly exploited to find new gene-disease associations and to prioritize candidates. In this review, we highlight recent methodological advances in this research field of candidate gene prioritization. We focus on approaches that use network information and integrate heterogeneous data sources. Furthermore, we discuss current benchmarking procedures for evaluating and comparing different prioritization methods.
|
126
|
Teixeira RKC, Gonçalves TB, Yamaki VN, Botelho NM, Brito MVH. Evaluation of the key words used in articles of the Acta Cirurgica Brasileira from 1997 to 2012. Acta Cir Bras 2012; 27:350-4. [DOI: 10.1590/s0102-86502012000500012] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 12/20/2011] [Accepted: 03/19/2012] [Indexed: 11/22/2022] Open
Abstract
PURPOSE: To evaluate the key words used in Acta Cirurgica Brasileira from 1997 to 2012. METHODS: All the key words of all articles published in regular issues between 1997 and 2012 were analyzed, ensuring that these key words were in the MeSH database (Medical Subjects Headings) and the most used subject headings and most wrong repeated key words were ranked. RESULTS: 4230 key words used in 990 articles were analyzed. Only 579 key words (13.68%) were not in the MeSH database, considering that there was a statistically significant decrease over the years (p<0.001). The three most used key words were Rats, Dogs and Wound healing. Among the wrong ones, the key words were Adhesions, Experimental surgery and Anatomosis. CONCLUSION: There was a gradual improvement in the amount of key words used that belonged to the MeSH database, and there were 618 articles (62.42%) with all key words correct.
|
127
|
Development of a benchmark corpus to support the automatic extraction of drug-related adverse effects from medical case reports. J Biomed Inform 2012; 45:885-92. [PMID: 22554702 DOI: 10.1016/j.jbi.2012.04.008] [Citation(s) in RCA: 112] [Impact Index Per Article: 8.6] [Received: 03/15/2011] [Revised: 03/08/2012] [Accepted: 04/11/2012] [Indexed: 02/06/2023]
Abstract
A significant amount of information about drug-related safety issues such as adverse effects are published in medical case reports that can only be explored by human readers due to their unstructured nature. The work presented here aims at generating a systematically annotated corpus that can support the development and validation of methods for the automatic extraction of drug-related adverse effects from medical case reports. The documents are systematically double annotated in various rounds to ensure consistent annotations. The annotated documents are finally harmonized to generate representative consensus annotations. In order to demonstrate an example use case scenario, the corpus was employed to train and validate models for the classification of informative against the non-informative sentences. A Maximum Entropy classifier trained with simple features and evaluated by 10-fold cross-validation resulted in the F₁ score of 0.70 indicating a potential useful application of the corpus.
|
128
|
Abstract
Integrative Biology (IB) uses experimental or computational quantitative technologies to characterize biological systems at the molecular, cellular, tissue and population levels. IB typically involves the integration of data, knowledge and capabilities across disciplinary boundaries in order to solve complex problems. We identify a series of bioinformatics problems posed by interdisciplinary integration: (i) data integration that interconnects structured data across related biomedical domains; (ii) ontology integration that brings jargon, terminologies and taxonomies from various disciplines into a unified network of ontologies; (iii) knowledge integration that integrates disparate knowledge elements from multiple sources; (iv) service integration that builds applications out of services provided by different vendors. We argue that IB can benefit significantly from the integration solutions enabled by Semantic Web (SW) technologies. The SW enables scientists to share content beyond the boundaries of applications and websites, resulting in a web of data that is meaningful and understandable to any computer. In this review, we provide insight into how SW technologies can be used to build open, standardized and interoperable solutions for interdisciplinary integration on a global basis. We present a rich set of case studies in systems biology, integrative neuroscience, bio-pharmaceutics and translational medicine, to highlight the technical features and benefits of SW applications in IB.
Affiliation(s)
- Huajun Chen
- College of Computer Science, Zhejiang University, Hangzhou, 310027, P.R. China.
|
129
|
Søgaard M, Andersen JP, Schønheyder HC. Searching PubMed for studies on bacteremia, bloodstream infection, septicemia, or whatever the best term is: a note of caution. Am J Infect Control 2012; 40:237-40. [PMID: 21775021 DOI: 10.1016/j.ajic.2011.03.011] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 02/14/2011] [Accepted: 03/08/2011] [Indexed: 01/23/2023]
Abstract
BACKGROUND There is inconsistency in the terminology used to describe bacteremia. To demonstrate the impact on information retrieval, we compared the yield of articles from PubMed MEDLINE using the terms "bacteremia," "bloodstream infection," and "septicemia." METHODS We searched for articles published between 1966 and 2009, and depicted the relationships among queries graphically. To examine the content of the retrieved articles, we extracted all Medical Subject Headings (MeSH) terms and compared topic similarity using a cosine measure. RESULTS The recovered articles differed greatly by term, and only 53 articles were captured by all terms. Of the articles retrieved by the "bacteremia" query, 21,438 (84.1%) were not captured when searching for "bloodstream infection" or "septicemia." Likewise, only 2,243 of the 11,796 articles recovered by free-text query for "bloodstream infection" were retrieved by the "bacteremia" query (19%). Entering "bloodstream infection" as a phrase, 46.1% of the records overlapped with the "bacteremia" query. Similarity measures ranged from 0.52 to 0.78 and were lowest for "bloodstream infection" as a phrase compared with "septicemia." CONCLUSION Inconsistent terminology has a major impact on the yield of queries. Agreement on terminology should be sought and promoted by scientific journals. An immediate solution is to add "bloodstream infection" as entry term for bacteremia in the MeSH vocabulary.
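The cosine measure used to compare the MeSH content of two result sets can be sketched as follows. The term counts are invented for illustration and are not data from the study; how the authors weighted terms is not stated in the abstract, so raw frequency counts are an assumption.

```python
import math
from collections import Counter

def cosine_similarity(a, b):
    """Cosine of the angle between two term-frequency vectors stored as Counters."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical MeSH term frequencies for the articles returned by two queries.
bacteremia_terms = Counter({"Bacteremia": 3, "Humans": 3, "Anti-Bacterial Agents": 1})
septicemia_terms = Counter({"Sepsis": 2, "Humans": 2, "Bacteremia": 1})

similarity = cosine_similarity(bacteremia_terms, septicemia_terms)
```

A value of 1 would mean the two queries retrieve articles with identically distributed MeSH terms, and 0 would mean no shared terms, so the reported range of 0.52 to 0.78 indicates substantial but incomplete topical overlap.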
Affiliation(s)
- Mette Søgaard
- Department of Clinical Microbiology, Aalborg Hospital, Aarhus University Hospital, Denmark.
|
130
|
|
131
|
Abstract
BACKGROUND Identifying protein-protein interactions (PPIs) from literature is an important step in mining the function of individual proteins as well as their biological network. Since it is known that PPIs have distinctive patterns in text, machine learning approaches have been successfully applied to mine these patterns. However, the complex nature of PPI description makes the extraction process difficult. RESULTS Our approach utilizes both word and syntactic features to effectively capture PPI patterns from biomedical literature. The proposed method automatically identifies gene names by a Priority Model, then extracts grammar relations using a dependency parser. A large margin classifier with Huber loss function learns from the extracted features, and unknown articles are predicted using this data-driven model. For the BioCreative III ACT evaluation, our official runs were ranked in top positions by obtaining maximum 89.15% accuracy, 61.42% F1 score, 0.55306 MCC score, and 67.98% AUC iP/R score. CONCLUSIONS Even though problems still remain, utilizing syntactic information for article-level filtering helps improve PPI ranking performance. The proposed system is a revision of previously developed algorithms in our group for the ACT evaluation. Our approach is valuable in showing how to use grammatical relations for PPI article filtering, in particular, with a limited training corpus. While current performance is far from satisfactory as an annotation tool, it is already useful for a PPI article search engine since users are mainly focused on highly-ranked results.
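For readers unfamiliar with the reported metrics, the sketch below computes accuracy, F1 and the Matthews correlation coefficient (MCC) from a binary confusion matrix using their standard definitions. The counts are hypothetical and unrelated to the BioCreative III results, and the AUC iP/R score is not covered.

```python
import math

def classification_metrics(tp, fp, tn, fn):
    """Accuracy, F1 and MCC for a binary classifier, from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, f1, mcc

# Hypothetical counts: 40 true positives, 10 false positives,
# 45 true negatives, 5 false negatives.
accuracy, f1, mcc = classification_metrics(tp=40, fp=10, tn=45, fn=5)
```

Accuracy can look high when negative articles dominate the test set, which is one reason evaluations of this kind report F1 and MCC alongside it.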
Collapse
Affiliation(s)
- Sun Kim
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- W John Wilbur
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
|
132
|
Hu H, Correll M, Kvecher L, Osmond M, Clark J, Bekhash A, Schwab G, Gao D, Gao J, Kubatin V, Shriver CD, Hooke JA, Maxwell LG, Kovatich AJ, Sheldon JG, Liebman MN, Mural RJ. DW4TR: A Data Warehouse for Translational Research. J Biomed Inform 2011; 44:1004-19. [PMID: 21872681 DOI: 10.1016/j.jbi.2011.08.003] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/29/2010] [Revised: 07/05/2011] [Accepted: 08/04/2011] [Indexed: 10/17/2022]
Abstract
The linkage between the clinical and laboratory research domains is a key issue in translational research. Integration of clinicopathologic data alone is a major task given the number of data elements involved. For a translational research environment, it is critical to make these data usable at the point-of-need. Individual systems have been developed to meet the needs of particular projects though the need for a generalizable system has been recognized. Increased use of Electronic Medical Record data in translational research will demand generalizing the system for integrating clinical data to support the study of a broad range of human diseases. To ultimately satisfy these needs, we have developed a system to support multiple translational research projects. This system, the Data Warehouse for Translational Research (DW4TR), is based on a light-weight, patient-centric modularly-structured clinical data model and a specimen-centric molecular data model. The temporal relationships of the data are also part of the model. The data are accessed through an interface composed of an Aggregated Biomedical-Information Browser (ABB) and an Individual Subject Information Viewer (ISIV) which target general users. The system was developed to support a breast cancer translational research program and has been extended to support a gynecological disease program. Further extensions of the DW4TR are underway. We believe that the DW4TR will play an important role in translational research across multiple disease types.
Affiliation(s)
- Hai Hu
- Windber Research Institute, Windber, PA 15963, USA.
|
133
|
Doderer MS, Burkhardt C, Robbins KA. SIDECACHE: Information access, management and dissemination framework for web services. BMC Res Notes 2011; 4:182. [PMID: 21672219 PMCID: PMC3132714 DOI: 10.1186/1756-0500-4-182] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/07/2011] [Accepted: 06/14/2011] [Indexed: 11/16/2022] Open
Abstract
Background Many bioinformatics algorithms and data sets are deployed using web services so that the results can be explored via the Internet and easily integrated into other tools and services. These services often include data from other sites that is accessed either dynamically or through file downloads. Developers of these services face several problems because of the dynamic nature of the information from the upstream services. Many publicly available repositories of bioinformatics data frequently update their information. When such an update occurs, the developers of the downstream service may also need to update. For file downloads, this process is typically performed manually, followed by a web service restart. Requests for information obtained by dynamic access to upstream sources are sometimes subject to rate restrictions. Findings SideCache provides a framework for deploying web services that integrate information extracted from other databases and from web sources that are periodically updated. This situation occurs frequently in biotechnology, where new information is being continuously generated and the latest information is important. SideCache provides several types of services including proxy access and rate control, local caching, and automatic web service updating. Conclusions We have used the SideCache framework to automate the deployment and updating of a number of bioinformatics web services and tools that extract information from remote primary sources such as NCBI, NCIBI, and Ensembl. The SideCache framework has also been used to share research results through a SideCache-derived web service.
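The combination of local caching and rate control described above can be illustrated with a small sketch. This is not SideCache's implementation or API; the class name `CachingProxy`, its parameters, and the stand-in fetch function are all hypothetical, chosen only to show how a time-to-live cache and a minimum interval between upstream requests fit together.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class CachingProxy:
    """Hypothetical proxy: caches upstream answers and spaces out upstream calls."""

    def __init__(
        self,
        fetch: Callable[[str], str],
        ttl_seconds: float,
        min_interval_seconds: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._min_interval = min_interval_seconds
        self._clock = clock
        self._sleep = sleep
        self._cache: Dict[str, Tuple[float, str]] = {}
        self._last_call: Optional[float] = None

    def get(self, key: str) -> str:
        now = self._clock()
        hit = self._cache.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # fresh cached copy, no upstream request
        # Rate control: wait until the minimum interval since the last call has passed.
        if self._last_call is not None:
            wait = self._min_interval - (now - self._last_call)
            if wait > 0:
                self._sleep(wait)
        value = self._fetch(key)
        self._last_call = self._clock()
        self._cache[key] = (self._last_call, value)
        return value


# Usage with a stand-in for an upstream service (no network access involved).
proxy = CachingProxy(lambda k: k.upper(), ttl_seconds=60.0, min_interval_seconds=0.0)
print(proxy.get("gene"), proxy.get("gene"))  # second call is served from the cache
```

Passing the clock and sleep functions as parameters is a design choice made here so the behaviour can be checked without real delays.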
Affiliation(s)
- Mark S Doderer
- Greehey Children's Cancer Research Institute, The University of Texas Health Science Center at San Antonio, San Antonio, TX 78229, USA.
|
134
|
von der Lieth CW, Freire AA, Blank D, Campbell MP, Ceroni A, Damerell DR, Dell A, Dwek RA, Ernst B, Fogh R, Frank M, Geyer H, Geyer R, Harrison MJ, Henrick K, Herget S, Hull WE, Ionides J, Joshi HJ, Kamerling JP, Leeflang BR, Lütteke T, Lundborg M, Maass K, Merry A, Ranzinger R, Rosen J, Royle L, Rudd PM, Schloissnig S, Stenutz R, Vranken WF, Widmalm G, Haslam SM. EUROCarbDB: An open-access platform for glycoinformatics. Glycobiology 2011; 21:493-502. [PMID: 21106561 PMCID: PMC3055595 DOI: 10.1093/glycob/cwq188] [Citation(s) in RCA: 104] [Impact Index Per Article: 7.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/19/2010] [Revised: 11/03/2010] [Accepted: 11/03/2010] [Indexed: 01/03/2023] Open
Abstract
The EUROCarbDB project is a design study for a technical framework, which provides sophisticated, freely accessible, open-source informatics tools and databases to support glycobiology and glycomic research. EUROCarbDB is a relational database containing glycan structures, their biological context and, when available, primary and interpreted analytical data from high-performance liquid chromatography, mass spectrometry and nuclear magnetic resonance experiments. Database content can be accessed via a web-based user interface. The database is complemented by a suite of glycoinformatics tools, specifically designed to assist the elucidation and submission of glycan structure and experimental data when used in conjunction with contemporary carbohydrate research workflows. All software tools and source code are licensed under the terms of the Lesser General Public License, and publicly contributed structures and data are freely accessible. The public test version of the web interface to the EUROCarbDB can be found at http://www.ebi.ac.uk/eurocarb.
Affiliation(s)
- Ana Ardá Freire
- Bijvoet-Center for Biomolecular Research, University of Utrecht, Utrecht, The Netherlands
- Dennis Blank
- Institute of Biochemistry, Faculty of Medicine, Justus Liebig University, Giessen, Germany
- Matthew P Campbell
- Dublin-Oxford Glycobiology Laboratory, National Institute for Bioprocessing Research and Training (NIBRT), Conway Institute, University College Dublin, Dublin, Ireland
- Department of Biochemistry, Oxford Glycobiology Institute, University of Oxford, UK
- Alessio Ceroni
- Division of Molecular Biosciences, Faculty of Natural Sciences, Biochemistry Building, Imperial College London, South Kensington Campus, London SW7 2AZ, UK
- David R Damerell
- Division of Molecular Biosciences, Faculty of Natural Sciences, Biochemistry Building, Imperial College London, South Kensington Campus, London SW7 2AZ, UK
- Anne Dell
- Division of Molecular Biosciences, Faculty of Natural Sciences, Biochemistry Building, Imperial College London, South Kensington Campus, London SW7 2AZ, UK
- Raymond A Dwek
- Department of Biochemistry, Oxford Glycobiology Institute, University of Oxford, UK
- Beat Ernst
- Department of Pharmaceutical Science, University of Basel, Basel, Switzerland
- Rasmus Fogh
- European Bioinformatics Institute, Hinxton, UK
- Martin Frank
- Core Facility, Molecular Structure Analysis, German Cancer Research Center, Heidelberg, Germany
- Hildegard Geyer
- Institute of Biochemistry, Faculty of Medicine, Justus Liebig University, Giessen, Germany
- Rudolf Geyer
- Institute of Biochemistry, Faculty of Medicine, Justus Liebig University, Giessen, Germany
- Kim Henrick
- European Bioinformatics Institute, Hinxton, UK
- Stefan Herget
- Core Facility, Molecular Structure Analysis, German Cancer Research Center, Heidelberg, Germany
- William E Hull
- Core Facility, Molecular Structure Analysis, German Cancer Research Center, Heidelberg, Germany
- Hiren J Joshi
- Core Facility, Molecular Structure Analysis, German Cancer Research Center, Heidelberg, Germany
- European Bioinformatics Institute, Hinxton, UK
- Johannis P Kamerling
- Bijvoet-Center for Biomolecular Research, University of Utrecht, Utrecht, The Netherlands
- Bas R Leeflang
- Bijvoet-Center for Biomolecular Research, University of Utrecht, Utrecht, The Netherlands
- Thomas Lütteke
- Bijvoet-Center for Biomolecular Research, University of Utrecht, Utrecht, The Netherlands
- Kai Maass
- Institute of Biochemistry, Faculty of Medicine, Justus Liebig University, Giessen, Germany
- René Ranzinger
- Core Facility, Molecular Structure Analysis, German Cancer Research Center, Heidelberg, Germany
- Jimmy Rosen
- Bijvoet-Center for Biomolecular Research, University of Utrecht, Utrecht, The Netherlands
- Louise Royle
- Dublin-Oxford Glycobiology Laboratory, National Institute for Bioprocessing Research and Training (NIBRT), Conway Institute, University College Dublin, Dublin, Ireland
- Department of Biochemistry, Oxford Glycobiology Institute, University of Oxford, UK
- Pauline M Rudd
- Dublin-Oxford Glycobiology Laboratory, National Institute for Bioprocessing Research and Training (NIBRT), Conway Institute, University College Dublin, Dublin, Ireland
- Department of Biochemistry, Oxford Glycobiology Institute, University of Oxford, UK
- Siegfried Schloissnig
- Core Facility, Molecular Structure Analysis, German Cancer Research Center, Heidelberg, Germany
- Roland Stenutz
- Organic Chemistry, Stockholm University, Stockholm, Sweden
- Göran Widmalm
- Organic Chemistry, Stockholm University, Stockholm, Sweden
- Stuart M Haslam
- Division of Molecular Biosciences, Faculty of Natural Sciences, Biochemistry Building, Imperial College London, South Kensington Campus, London SW7 2AZ, UK
|
135
|
Schlicker A, Lengauer T, Albrecht M. Improving disease gene prioritization using the semantic similarity of Gene Ontology terms. Bioinformatics 2010; 26:i561-7. [PMID: 20823322 PMCID: PMC2935448 DOI: 10.1093/bioinformatics/btq384] [Citation(s) in RCA: 74] [Impact Index Per Article: 4.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/15/2022]
Abstract
MOTIVATION Many hereditary human diseases are polygenic, resulting from sequence alterations in multiple genes. Genomic linkage and association studies are commonly performed for identifying disease-related genes. Such studies often yield lists of up to several hundred candidate genes, which have to be prioritized and validated further. Recent studies discovered that genes involved in phenotypically similar diseases are often functionally related on the molecular level. RESULTS Here, we introduce MedSim, a novel approach for ranking candidate genes for a particular disease based on functional comparisons involving the Gene Ontology. MedSim uses functional annotations of known disease genes for assessing the similarity of diseases as well as the disease relevance of candidate genes. We benchmarked our approach with genes known to be involved in 99 diseases taken from the OMIM database. Using artificial quantitative trait loci, MedSim achieved excellent performance with an area under the ROC curve of up to 0.90 and a sensitivity of over 70% at 90% specificity when classifying gene products according to their disease relatedness. This performance is comparable or even superior to related methods in the field, albeit using less and thus more easily accessible information. AVAILABILITY MedSim is offered as part of our FunSimMat web service (http://www.funsimmat.de).
Affiliation(s)
- Andreas Schlicker
- Max Planck Institute for Informatics, Department of Computational Biology and Applied Algorithmics, Saarbrücken, Germany
|
136
|
Jelercic S, Lingard H, Spiegel W, Pichlhöfer O, Maier M. Assessment of publication output in the field of general practice and family medicine and by general practitioners and general practice institutions. Fam Pract 2010; 27:582-9. [PMID: 20554654 DOI: 10.1093/fampra/cmq032] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 02/05/2023] Open
Abstract
PURPOSE The discipline of family medicine (FM) lacks a comprehensive methodology, which can be applied as a standard for assessing overall research output in both the field of FM and by general practitioners (GPs)/general practice institutions. It was the aim of this study to develop a sensitive search strategy for assessing publication output in the field of FM independent of the author's profession or affiliation and by GPs/general practice institutions independent of their field of scientific interest. METHODS Literature searches limited to the year 2005 were conducted in PubMed and ISI Web of Sciences (ISI WoS). In PubMed, all relevant MeSH terms were used. Search terms possibly contained in the author's affiliations have been collected. In ISI WoS, the same entry terms including their abbreviations and plural forms were applied. The final queries were validated by manual review and matching results with selected FM journals. RESULTS A comprehensive list of combined search terms could be defined. For the field of general practice/FM more publications could be retrieved in PubMed. Almost twice as many publications by GPs/general practice institutions could be retrieved in ISI WoS, where--in contrast to PubMed--the affiliation is documented for all authors. CONCLUSIONS To quantitatively assess publication output in the field of FM, PubMed was identified as the preferable database. To assess publication output by GPs/general practice institutions, the ISI WoS is recommended as the preferable database. Apparently, the ISI WoS is more suitable to compare the research productivity of different countries, authors or institutions.
|
137
|
Handcock J, Deutsch EW, Boyle J. mspecLINE: bridging knowledge of human disease with the proteome. BMC Med Genomics 2010; 3:7. [PMID: 20219133 PMCID: PMC2845087 DOI: 10.1186/1755-8794-3-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/23/2009] [Accepted: 03/10/2010] [Indexed: 02/07/2023] Open
Abstract
Background Public proteomics databases such as PeptideAtlas contain peptides and proteins identified in mass spectrometry experiments. However, these databases lack information about human disease for researchers studying disease-related proteins. We have developed mspecLINE, a tool that combines knowledge about human disease in MEDLINE with empirical data about the detectable human proteome in PeptideAtlas. mspecLINE associates diseases with proteins by calculating the semantic distance between annotated terms from a controlled biomedical vocabulary. We used an established semantic distance measure that is based on the co-occurrence of disease and protein terms in the MEDLINE bibliographic database. Results The mspecLINE web application allows researchers to explore relationships between human diseases and parts of the proteome that are detectable using a mass spectrometer. Given a disease, the tool will display proteins and peptides from PeptideAtlas that may be associated with the disease. It will also display relevant literature from MEDLINE. Furthermore, mspecLINE allows researchers to select proteotypic peptides for specific protein targets in a mass spectrometry assay. Conclusions Although mspecLINE applies an information retrieval technique to the MEDLINE database, it is distinct from previous MEDLINE query tools in that it combines the knowledge expressed in scientific literature with empirical proteomics data. The tool provides valuable information about candidate protein targets to researchers studying human disease and is freely available on a public web server.
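A co-occurrence-based semantic distance of the kind described above can be sketched as follows. The abstract does not name the measure, so this example uses the Normalized Google Distance formula as a stand-in assumption; the function name and all counts are hypothetical and do not come from MEDLINE or PeptideAtlas.

```python
import math


def normalized_cooccurrence_distance(fx: int, fy: int, fxy: int, n: int) -> float:
    """Distance between two terms from document counts.

    fx, fy: number of documents annotated with term x and with term y
    fxy:    number of documents annotated with both terms
    n:      total number of documents in the collection
    Uses the Normalized Google Distance formula (an assumed stand-in measure).
    """
    if min(fx, fy) <= 0 or max(fx, fy) > n or min(fx, fy) >= n:
        raise ValueError("counts must satisfy 0 < fx, fy <= n and min(fx, fy) < n")
    if fxy <= 0:
        return math.inf  # the terms never co-occur
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))


# Hypothetical counts: a disease term, a protein term, and their co-occurrences.
print(round(normalized_cooccurrence_distance(1000, 100, 50, 1_000_000), 4))
```

Under this formula, terms that always appear together have distance 0, and the distance grows as co-occurrence becomes rarer relative to how often each term appears on its own.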
Affiliation(s)
- Jeremy Handcock
- Institute for Systems Biology, 1441 N 34th St, Seattle, WA 98103, USA
|
138
|
Parry M, Watt-Watson J. Peer Support Intervention Trials for Individuals with Heart Disease: A Systematic Review. Eur J Cardiovasc Nurs 2010; 9:57-67. [PMID: 19926339 DOI: 10.1016/j.ejcnurse.2009.10.002] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 05/09/2009] [Revised: 10/16/2009] [Accepted: 10/24/2009] [Indexed: 11/16/2022]
Affiliation(s)
- Monica Parry
- Lawrence S. Bloomberg Faculty of Nursing, University of Toronto, Canada
- Judy Watt-Watson
- Lawrence S. Bloomberg Faculty of Nursing, University of Toronto, Canada
|
139
|
Yu Y, Tu K, Zheng S, Li Y, Ding G, Ping J, Hao P, Li Y. GEOGLE: context mining tool for the correlation between gene expression and the phenotypic distinction. BMC Bioinformatics 2009; 10:264. [PMID: 19703314 PMCID: PMC2745391 DOI: 10.1186/1471-2105-10-264] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/13/2009] [Accepted: 08/25/2009] [Indexed: 12/05/2022] Open
Abstract
Background In the post-genomic era, the development of high-throughput gene expression detection technology provides huge amounts of experimental data, which challenges the traditional pipelines for data processing and analysis in scientific research. Results In our work, we integrated gene expression information from Gene Expression Omnibus (GEO), biomedical ontology from Medical Subject Headings (MeSH) and signaling pathway knowledge from sigPathway entries to develop a context mining tool for gene expression analysis, GEOGLE. GEOGLE offers a rapid and convenient way to search for relevant experimental datasets, pathways and biological terms according to multiple types of queries, including biomedical vocabularies, GDS IDs, gene IDs, pathway names and signature lists. Moreover, GEOGLE summarizes the signature genes from a subset of GDSes and estimates the correlation between gene expression and the phenotypic distinction with an integrated p value. Conclusion This approach, which performs a global search of expression data, may expand the traditional way of collecting heterogeneous gene expression experiment data. GEOGLE is a novel tool that provides researchers with a quantitative way to understand the correlation between gene expression and phenotypic distinction through meta-analysis of gene expression datasets from different experiments, as well as the biological meaning behind it. The web site and user guide of GEOGLE are available at:
Affiliation(s)
- Yao Yu
- Key Lab of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, PR China.
|
140
|
Carroll LJ, Cassidy JD, Peloso PM, Giles-Smith L, Cheng CS, Greenhalgh SW, Haldeman S, van der Velde G, Hurwitz EL, Côté P, Nordin M, Hogg-Johnson S, Holm LW, Guzman J, Carragee EJ. Methods for the Best Evidence Synthesis on Neck Pain and Its Associated Disorders. J Manipulative Physiol Ther 2009; 32:S39-45. [DOI: 10.1016/j.jmpt.2008.11.009] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
|
141
|
Abstract
The multilingual search engine ARRS GoldMiner Global was created to facilitate broad international access to a richly indexed collection of more than 200,000 radiologic images. Images are indexed according to key-words and medical concepts that appear in the unstructured text of their English-language image captions. GoldMiner Global exploits the Unicode standard, which allows the accurate representation of characters and ideographs from virtually any language and which supports both left-to-right and right-to-left text directions. The user interface supports queries in Arabic, Chinese, French, German, Italian, Japanese, Korean, Portuguese, Russian, or Spanish. GoldMiner Global incorporates an interface to the United States National Library of Medicine that translates queries into English-language Medical Subject Headings (MeSH) terms. The translated MeSH terms are then used to search the image index and retrieve relevant images. Explanatory text, pull-down menu choices, and navigational guides are displayed in the selected language; search results are displayed in English. GoldMiner Global is freely available on the World Wide Web.
Affiliation(s)
- Charles E Kahn
- Department of Radiology, Medical College of Wisconsin, 9200 W Wisconsin Ave, Milwaukee, WI 53226, USA.
|
142
|
|
143
|
A primer on selected aspects of evidence-based practice relating to questions of treatment. Part 1: asking questions, finding evidence, and determining validity. J Orthop Sports Phys Ther 2008; 38:476-84. [PMID: 18678960 DOI: 10.2519/jospt.2008.2722] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 02/07/2023]
Abstract
The process of evidence-based practice (EBP) guides clinicians in the integration of individual clinical expertise, patient values and expectations, and the best available evidence. Becoming proficient with this process takes time and consistent practice, but should ultimately lead to improved patient outcomes. The EBP process entails 5 steps: (1) formulating an appropriate question, (2) performing an efficient literature search, (3) critically appraising the best available evidence, (4) applying the best evidence to clinical practice, and (5) assessing outcomes of care. This first commentary in a 2-part series will review principles relating to steps 1, 2, and 3 of this 5-step model. The purpose of this commentary is to provide a perspective to assist clinicians in formulating foreground questions, searching for the best available evidence, and determining validity of results in studies of interventions for orthopaedic and sports physical therapy.
|
144
|
Jiang X, Liu B, Jiang J, Zhao H, Fan M, Zhang J, Fan Z, Jiang T. Modularity in the genetic disease-phenotype network. FEBS Lett 2008; 582:2549-54. [PMID: 18582463 DOI: 10.1016/j.febslet.2008.06.023] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/17/2008] [Revised: 05/23/2008] [Accepted: 06/13/2008] [Indexed: 11/16/2022]
Abstract
Similar disease phenotypes are engendered as a result of the modular nature of gene networks; thus we hypothesized that all human genetic disease phenotypes appear in similar modular styles. Network representations of phenotypes make it possible to explore this hypothesis. We investigated the modularity of a network of genetic disease phenotypes. We computationally extracted phenotype modules and found that the modularity is well correlated with a physiological classification of human diseases. We also found correlations between the modularity and functional genomics as well as its connection to drug-target associations.
Affiliation(s)
- Xingpeng Jiang
- National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, PR China
|
145
|
Network-based global inference of human disease genes. Mol Syst Biol 2008; 4:189. [PMID: 18463613 PMCID: PMC2424293 DOI: 10.1038/msb.2008.27] [Citation(s) in RCA: 455] [Impact Index Per Article: 26.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/18/2007] [Accepted: 03/17/2008] [Indexed: 01/04/2023] Open
Abstract
Deciphering the genetic basis of human diseases is an important goal of biomedical research. On the basis of the assumption that phenotypically similar diseases are caused by functionally related genes, we propose a computational framework that integrates human protein–protein interactions, disease phenotype similarities, and known gene–phenotype associations to capture the complex relationships between phenotypes and genotypes. We develop a tool named CIPHER to predict and prioritize disease genes, and we show that the global concordance between the human protein network and the phenotype network reliably predicts disease genes. Our method is applicable to genetically uncharacterized phenotypes, effective in the genome-wide scan of disease genes, and also extendable to explore gene cooperativity in complex diseases. The predicted genetic landscape of over 1000 human phenotypes reveals the global modular organization of phenotype–genotype relationships. The genome-wide prioritization of candidate genes for over 5000 human phenotypes, including those with under-characterized disease loci or even those lacking known association, is publicly released to facilitate future discovery of disease genes.
|
146
|
Methods for the best evidence synthesis on neck pain and its associated disorders: the Bone and Joint Decade 2000-2010 Task Force on Neck Pain and Its Associated Disorders. Spine (Phila Pa 1976) 2008; 33:S33-8. [PMID: 18204397 DOI: 10.1097/brs.0b013e3181644b06] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 02/01/2023]
Abstract
STUDY DESIGN Best evidence synthesis. OBJECTIVE To provide a detailed description of the methods undertaken in a systematic search and perform a best evidence synthesis on the frequency, determinants, assessment, interventions, course and prognosis of neck pain, and its associated disorders. SUMMARY OF BACKGROUND DATA Neck pain is an important cause of health burden; however, the published information is vast, and stakeholders would benefit from a summary of the best evidence. METHODS The Bone and Joint Decade 2000-2010 Task Force on Neck Pain and its Associated Disorders conducted a systematic search and critical review of the literature published between 1980 and 2006 to assemble the best evidence on neck pain. Citations were screened for relevance to the Neck Pain Task Force mandate, using a priori criteria, and relevant studies were critically reviewed for their internal scientific validity. Findings from studies meeting criteria for scientific validity were synthesized into a best evidence synthesis. RESULTS We found 31,878 citations, of which 1203 were relevant to the mandate of the Neck Pain Task Force. After critical review, 552 studies (46%) were judged scientifically admissible and were compiled into the best evidence synthesis. CONCLUSION The Bone and Joint Decade 2000-2010 Task Force on Neck Pain and its Associated Disorders undertook a best evidence synthesis to establish a baseline of the current best evidence on the epidemiology, assessment and classification of neck pain, as well as interventions and prognosis for this symptom. This article reports the methods used and the outcomes from the review. We found that 46% of the research literature was of acceptable scientific quality to inform clinical practice, policy-making, and future research.
|
147
|
[Bibliographic medical research using MEDLINE-PubMed. A practical approach based on examples]. Nephrol Ther 2007; 3:475-85. [PMID: 18048003 DOI: 10.1016/j.nephro.2007.06.016] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/29/2007] [Accepted: 06/28/2007] [Indexed: 11/21/2022]
|
148
|
Stanojevic S, Wade A, Lum S, Stocks J. Reference equations for pulmonary function tests in preschool children: a review. Pediatr Pulmonol 2007; 42:962-72. [PMID: 17726704 DOI: 10.1002/ppul.20691] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 12/21/2022]
Abstract
Recent developments in pulmonary function tests (PFTs) in preschool children (2-5 years of age) have meant that objective assessments of respiratory function are now possible for this age group. However, the application and interpretation of these tests may be limited by the relative paucity of appropriate reference equations. This review summarizes available preschool reference equations, identifies the current gaps and limitations in the methodologies and statistics used and proposes future directions for improving reference data. A PubMed search which included the MeSH terms (preschool [2-5years]), (respiratory function test), and (reference value) yielded 214 publications which were screened to identify 34 publications presenting 36 reference equations for seven techniques. There were considerable differences with respect to population characteristics, recruitment strategies, equipment and methodologies and reported parameters both within and between each measurement technique. Despite an increasing number of reference equations for PFT for preschool children, the extent to which these can be generalized to other populations may be limited in some cases by inclusion of relatively few children less than 5 years of age, a lack of details regarding the sample populations and measurement techniques and/or inappropriate statistical analysis. A fresh approach based on large sample sizes, clearly documented population characteristics, equipment and protocols, and more rigorous modern statistical methods both for developing reference equations and interpreting results could enhance clinical application of these tests. This in turn would maximize the tremendous opportunities to detect early lung disease offered by the recent surge in developing suitable tests for preschool children.
Affiliation(s)
- Sanja Stanojevic
- Portex Respiratory Physiology Unit, UCL, Institute of Child Health, London, United Kingdom.
|
149
|
Burkart MF, Wren JD, Herschkowitz JI, Perou CM, Garner HR. Clustering microarray-derived gene lists through implicit literature relationships. Bioinformatics 2007; 23:1995-2003. [PMID: 17537751 DOI: 10.1093/bioinformatics/btm261] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION Microarrays rapidly generate large quantities of gene expression information, but interpreting such data within a biological context is still relatively complex and laborious. New methods that can identify functionally related genes via shared literature concepts will be useful in addressing these needs. RESULTS We have developed a novel method that uses implicit literature relationships (concepts related via shared, intermediate concepts) to cluster related genes. Genes are evaluated for implicit connections within a network of biomedical objects (other genes, ontological concepts and diseases) that are connected via their co-occurrences in Medline titles and/or abstracts. On the basis of these implicit relationships, individual gene pairs are scored using a probability-based algorithm. Scores are generated for all pairwise combinations of genes, which are then clustered based on the scores. We applied this method to a test set composed of nine functional groups with known relationships. The method scored highly for all nine groups and significantly better than a benchmark co-occurrence-based method for six groups. We then applied this method to gene sets specific to two previously defined breast tumor subtypes. Analysis of the results recapitulated known biological relationships and identified novel pathway relationships unique to each tumor subtype. We demonstrate that this method provides a valuable new means of identifying and visualizing significantly related genes within gene lists via their implicit relationships in the literature.
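The idea of an implicit relationship (two genes linked through shared intermediate concepts) can be sketched as follows. The abstract describes a probability-based scoring algorithm without giving its details, so this example swaps in a plain Jaccard overlap of co-occurring concepts; the function name and the co-occurrence data are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def implicit_scores(
    cooccur: Dict[str, Set[str]], genes: List[str]
) -> Dict[Tuple[str, str], float]:
    """Score every gene pair by the overlap of the concepts each gene co-occurs with.

    Jaccard overlap is an assumed stand-in for the paper's probability-based score.
    """
    scores: Dict[Tuple[str, str], float] = {}
    for g1, g2 in combinations(sorted(genes), 2):
        # Ignore a direct link between the two genes; only intermediates count.
        n1 = cooccur.get(g1, set()) - {g2}
        n2 = cooccur.get(g2, set()) - {g1}
        union = n1 | n2
        scores[(g1, g2)] = len(n1 & n2) / len(union) if union else 0.0
    return scores


# Hypothetical co-occurrences of genes with concepts in titles and abstracts.
cooccur = {
    "BRCA1": {"DNA repair", "breast neoplasms"},
    "RAD51": {"DNA repair", "homologous recombination"},
    "INS": {"glucose metabolism"},
}
for pair, score in implicit_scores(cooccur, list(cooccur)).items():
    print(pair, round(score, 3))
```

The resulting pairwise scores could then be passed to any standard clustering routine, which is the step the abstract describes after scoring.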
Affiliation(s)
- Mark F Burkart
- Department of Internal Medicine, The McDermott Center for Human Growth and Development, Division of Translational Research, The University of Texas Southwestern Medical Center, Dallas, Texas 75390, USA.
|
150
|
|