1
|
Pilot study of comparability of a smartphone blood pressure monitoring algorithm to conventional cuff-based blood pressure measurements. Eur Heart J 2021. [DOI: 10.1093/eurheartj/ehab724.2350] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]
Abstract
Background/Introduction
There is a growing market for smartphone applications (apps), offering medical assessments such as blood pressure measurements (BPM). These apps have the potential to improve blood pressure (BP) control by making BPM broadly and easily accessible. Yet, to be suitable for clinical and diagnostic purposes, BPM measured with smartphone apps need to be comparable to conventional BPM.
Purpose
We sought to compare a novel photoplethysmographic BPM algorithm used in a smartphone app to conventional cuff-based BPM.
Methods
We included consecutive patients with an indication for ambulatory BPM. Office blood pressure measurements (OBPM) were taken with an oscillometric cuff-based device (Welch Allyn SureBP). The algorithm of the smartphone app detects the pulse wave in finger capillaries using the phone's camera and estimates BP based on the form of the pulse wave. Before estimating a BP value, the algorithm performs a quality assessment to automatically reject recordings of insufficient quality. On the first day (D1), we took 6 OBPM alternating with 5 smartphone BPM (TestBP). On the second day (D2), 4 OBPM alternating with 3 TestBP were measured. TestBP was calibrated based on the first OBPM of D1. Each TestBP was then compared to its RefBP (defined as the mean of the previous and following OBPM).
Results
Fifty patients were included in the study, resulting in 50 TestBP values on D1 and 33 on D2. There was no difference at the 5% significance level between the TestBP and RefBP distributions on either day, for either systolic or diastolic pressures. The mean ± standard deviation (SD) of the differences between TestBP and RefBP was 0.7±9.4 / 1.0±4.5 mmHg on D1 and 2.6±8.2 / 1.3±4.1 mmHg on D2 for systolic/diastolic values, respectively. The numbers of TestBP measurements within 5, 10 and 15 mmHg of RefBP are shown in Table 1. Bland-Altman plots depicting the agreement between TestBP and RefBP are shown in Figure 1.
Conclusion
This smartphone algorithm yields values comparable to oscillometric cuff-based measurements, especially for diastolic values. The differences between TestBP and RefBP remain stable 1 day after calibration. Before clinical use, this algorithm needs to undergo formal validation against a reference BP method accepted by international standards (auscultatory or invasive methods).
Funding Acknowledgement
Type of funding sources: Private company. Main funding source(s): Centre Suisse d'Electronique et de Microtechnique
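The comparison described in the Methods and Results of this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`reference_bp`, `agreement`) and the readings are hypothetical, "within 5 mmHg" is taken to include the boundary, and the example covers one subject on one day, whereas the abstract reports statistics over all patients.

```python
from statistics import mean, stdev

def reference_bp(cuff_readings):
    """RefBP for each app reading: mean of the cuff readings taken just before and just after it."""
    return [(before + after) / 2 for before, after in zip(cuff_readings, cuff_readings[1:])]

def agreement(test_bp, ref_bp, limits=(5, 10, 15)):
    """Mean and SD of TestBP - RefBP, plus counts of readings within each limit (mmHg)."""
    diffs = [t - r for t, r in zip(test_bp, ref_bp)]
    # Assumption: a difference exactly equal to the limit counts as "within" it.
    within = {limit: sum(abs(d) <= limit for d in diffs) for limit in limits}
    return mean(diffs), stdev(diffs), within

# Hypothetical systolic values (mmHg), not study data:
# 6 cuff readings alternating with 5 app readings, as on D1.
cuff = [120, 124, 122, 126, 124, 128]
app = [121, 125, 130, 119, 141]

ref = reference_bp(cuff)
bias, sd, within = agreement(app, ref)
print(ref, round(bias, 1), round(sd, 1), within)
```

The mean difference corresponds to the bias line of a Bland-Altman plot, and the SD of the differences determines its limits of agreement.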
|
2
|
The socket-shield technique: a critical literature review. Int J Implant Dent 2020; 6:52. [PMID: 32893327 PMCID: PMC7475165 DOI: 10.1186/s40729-020-00246-2] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 07/23/2019] [Accepted: 07/29/2020] [Indexed: 12/11/2022]
Abstract
Introduction
Dental implants have become a standard treatment in the replacement of missing teeth. After tooth extraction and implant placement, resorption of buccal bundle bone can pose a significant complication, often with a very negative cosmetic impact. Studies have shown that if the dental root remains in the alveolar process, bundle bone resorption is minimal. However, to date, the deliberate retention of roots to preserve bone has not been routinely used in dental implantology.
Material and methods
This study aims to collect and evaluate the present knowledge with regard to the socket-shield technique as described by Hurzeler et al. (J Clin Periodontol 37(9):855-62, 2010). A PubMed database search (www.ncbi.nlm.nih.gov/pubmed) was conducted to identify relevant publications.
Results
The initial database search returned 229 results. After screening the abstracts, 13 articles were downloaded and further scrutinised. Twelve studies were found to meet the inclusion and exclusion criteria.
Conclusion
Whilst the socket-shield technique potentially offers promising outcomes, reducing the need for invasive bone grafts around implants in the aesthetic zone, clinical data to support this are very limited. The limited data available are compromised by a lack of well-designed prospective randomised controlled studies. The existing case reports are of very limited scientific value. Retrospective studies exist in limited numbers but are of inconsistent design. At this stage, it is unclear whether the socket-shield technique will provide a stable long-term outcome.
|
3
|
|
4
|
Abstract
Different programmes of the European Science Foundation (ESF) have contributed significantly to connecting researchers in Europe and beyond through several initiatives. This support was particularly relevant for the development of the areas related to extracting information from papers (text mining) because it supported the field in its early phases, long before it was recognized by the community. We review the historical development of text mining research and how it was introduced in bioinformatics. Specific applications in (functional) genomics are described, such as its integration in genome annotation pipelines and its support for the analysis of high-throughput genomics experimental data, and we highlight the activities of evaluation of methods and benchmarking for which the ESF programme support was instrumental.
|
5
|
Osmoregulation in Lilium pollen grains occurs via modulation of the plasma membrane H+ ATPase activity by 14-3-3 proteins. Plant Physiology 2010; 154:1921-8. [PMID: 20974894 PMCID: PMC2996032 DOI: 10.1104/pp.110.165696] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Received: 09/08/2010] [Accepted: 10/24/2010] [Indexed: 05/18/2023]
Abstract
To allow successful germination and growth of a pollen tube, mature and dehydrated pollen grains (PGs) take up water and have to adjust their turgor pressure according to the water potential of the surrounding stigma surface. The turgor pressure of PGs of lily (Lilium longiflorum) was measured with a modified pressure probe for simultaneous recordings of turgor pressure and membrane potential to investigate the relation between water and electrogenic ion transport in osmoregulation. Upon hyperosmolar shock, the turgor pressure decreased and the plasma membrane (PM) hyperpolarized in parallel, whereas depolarization of the PM was observed with hypoosmolar treatment. Acidification and alkalinization of the external medium were monitored after hyper- and hypoosmotic treatments, respectively, and pH changes were blocked by vanadate, indicating a putative role of the PM H(+) ATPase. Indeed, an increase in PM-associated 14-3-3 proteins and an increase in PM H(+) ATPase activity were detected in PGs challenged by hyperosmolar medium. We therefore suggest that in PGs the PM H(+) ATPase, via modulation of its activity by 14-3-3 proteins, is involved in the regulation of turgor pressure.
|
6
|
Critical assessment of information extraction systems in biology. Comp Funct Genomics 2010; 4:674-7. [PMID: 18629031 PMCID: PMC2447314 DOI: 10.1002/cfg.337] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 09/18/2003] [Revised: 09/24/2003] [Accepted: 09/25/2003] [Indexed: 11/10/2022]
Abstract
An increasing number of groups are now working in the area of text mining, focusing on a wide range of problems and applying both statistical and linguistic approaches. However, it is not possible to compare the different approaches, because there are no common standards or evaluation criteria; in addition, the various groups are addressing different problems, often using private datasets. As a result, it is impossible to determine how well the existing systems perform, and particularly what performance level can be expected in real applications. This is similar to the situation in text processing in the late 1980s, prior to the Message Understanding Conferences (MUCs). With the introduction of a common evaluation and standardized evaluation metrics as part of these conferences, it became possible to compare approaches, to identify those techniques that did or did not work and to make progress. This progress has resulted in a common pipeline of processes and a set of shared tools available to the general research community. The field of biology is ripe for a similar experiment. Inspired by this example, the BioLINK group (Biological Literature, Information and Knowledge [1]) is organizing a CASP-like evaluation for the text data-mining community applied to biology. The two main tasks specifically address two major bottlenecks for text mining in biology: (1) the correct detection of gene and protein names in text; and (2) the extraction of functional information related to proteins based on the GO classification system. For further information and participation details, see http://www.pdg.cnb.uam.es/BioLink/BioCreative.eval.html.
|
7
|
Automatic classification of protein functions from the literature. Comp Funct Genomics 2010; 4:75-9. [PMID: 18629110 PMCID: PMC2447397 DOI: 10.1002/cfg.241] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/06/2002] [Accepted: 11/29/2002] [Indexed: 11/18/2022]
|
8
|
Can bibliographic pointers for known biological data be found automatically? Protein interactions as a case study. Comp Funct Genomics 2010; 2:196-206. [PMID: 18628915 PMCID: PMC2447212 DOI: 10.1002/cfg.91] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Received: 03/27/2001] [Accepted: 06/25/2001] [Indexed: 11/11/2022]
Abstract
The Dictionary of Interacting Proteins (DIP) (Xenarios et al., 2000) is a large repository of protein interactions: its March 2000 release included 2379 protein pairs whose interactions have been detected by experimental methods. Even if many of these correspond to poorly characterized proteins, the result of massive yeast two-hybrid screenings, as many as 851 correspond to interactions detected using direct biochemical methods. We used information retrieval technology to search automatically for sentences in Medline abstracts that support these 851 DIP interactions. Surprisingly, we found correspondence between DIP protein pairs and Medline sentences describing their interactions in only 30% of the cases. This low coverage has interesting consequences regarding the quality of annotations (references) introduced in the database and the limitations of the application of information extraction (IE) technology to Molecular Biology. It is clear that the limitation of analyzing abstracts rather than full papers and the lack of standard protein names are difficulties of considerably more importance than the limitations of the IE methodology employed. A positive finding is the capacity of the IE system to identify new relations between proteins, even in a set of proteins previously characterized by human experts. These identifications are made with a considerable degree of precision. This is, to our knowledge, the first large scale assessment of IE capacity to detect previously known interactions: we thus propose the use of the DIP data set as a biological reference to benchmark IE systems.
|
9
|
Extracting information automatically from biological literature. Comp Funct Genomics 2010; 2:310-3. [PMID: 18629239 PMCID: PMC2448400 DOI: 10.1002/cfg.102] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Received: 07/04/2001] [Accepted: 07/27/2001] [Indexed: 11/13/2022]
|
10
|
Osmoregulation in lily pollen involves the plasma membrane H+ ATPase. Comp Biochem Physiol A Mol Integr Physiol 2009. [DOI: 10.1016/j.cbpa.2009.04.419] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
|
11
|
Abstract
This article collects opinions from leading scientists about how text mining can provide better access to the biological literature, how the scientific community can help with this process, what the next steps are, and what role future BioCreative evaluations can play. The responses identify several broad themes, including the possibility of fusing literature and biological databases through text mining; the need for user interfaces tailored to different classes of users and supporting community-based annotation; the importance of scaling text mining technology and inserting it into larger workflows; and suggestions for additional challenge evaluations, new applications, and additional resources needed to make progress.
|
12
|
Abstract
Nineteen teams presented results for the Gene Mention Task at the BioCreative II Workshop. In this task participants designed systems to identify substrings in sentences corresponding to gene name mentions. A variety of different methods were used and the results varied with a highest achieved F1 score of 0.8721. Here we present brief descriptions of all the methods used and a statistical analysis of the results. We also demonstrate that, by combining the results from all submissions, an F score of 0.9066 is feasible, and furthermore that the best result makes use of the lowest scoring submissions.
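As a rough illustration of the scores quoted above, the sketch below computes an F score from exact-match gene-mention spans and shows how a simple majority vote over several systems can outperform each of them. The vote is only a stand-in chosen for brevity and is not the combination method used in the workshop analysis; the helper names (`f_score`, `evaluate`, `majority_vote`) and all spans are hypothetical.

```python
from collections import Counter

def f_score(tp, fp, fn, beta=1.0):
    """F-beta from true positives, false positives and false negatives (F1 when beta=1)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

def evaluate(predicted, gold):
    """Score a set of predicted (start, end) spans against gold spans by exact match."""
    tp = len(predicted & gold)
    return f_score(tp, len(predicted) - tp, len(gold) - tp)

def majority_vote(submissions):
    """Keep spans proposed by more than half of the submissions (illustrative combination only)."""
    votes = Counter(span for sub in submissions for span in sub)
    return {span for span, n in votes.items() if n > len(submissions) / 2}

# Hypothetical character spans for gene mentions in one sentence
gold = {(0, 4), (10, 15), (20, 23)}
sys_a = {(0, 4), (10, 15), (30, 33)}
sys_b = {(0, 4), (20, 23), (40, 42)}
sys_c = {(10, 15), (20, 23), (50, 52)}

individual = [evaluate(s, gold) for s in (sys_a, sys_b, sys_c)]
combined = evaluate(majority_vote([sys_a, sys_b, sys_c]), gold)
print(individual, combined)
```

In this toy case each system makes a different error, so the vote recovers every gold span. Real submissions make correlated errors, which is why a combined score stays below 1.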
|
13
|
ISMB 2003 Text Mining SIG Meeting Report. Comp Funct Genomics 2008; 4:667-73. [PMID: 18629019 PMCID: PMC2447301 DOI: 10.1002/cfg.338] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 09/18/2003] [Revised: 09/24/2003] [Accepted: 09/25/2003] [Indexed: 11/25/2022]
|
14
|
Status of text-mining techniques applied to biomedical text. Drug Discov Today 2007; 11:315-25. [PMID: 16580973 DOI: 10.1016/j.drudis.2006.02.011] [Citation(s) in RCA: 82] [Impact Index Per Article: 4.8] [Received: 10/10/2005] [Revised: 02/08/2006] [Accepted: 02/27/2006] [Indexed: 11/16/2022]
Abstract
Scientific progress is increasingly based on knowledge and information. Knowledge is now recognized as the driver of productivity and economic growth, leading to a new focus on the role of information in the decision-making process. Most scientific knowledge is registered in publications and other unstructured representations that make it difficult to use and to integrate the information with other sources (e.g. biological databases). Making a computer understand human language has proven to be a complex achievement, but there are techniques capable of detecting, distinguishing and extracting a limited number of different classes of facts. In the biomedical field, extracting information has specific problems: complex and ever-changing nomenclature (especially genes and proteins) and the limited representation of domain knowledge.
|
15
|
Soft and hard tissue response to zirconium dioxide dental implants--a clinical study in man. Neuro Endocrinology Letters 2006; 27 Suppl 1:69-72. [PMID: 16892009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2005] [Accepted: 01/16/2006] [Indexed: 05/11/2023]
Abstract
Titanium dental implants have been used successfully in implantology for more than 40 years. Recent research, however, suggests that titanium might have more side effects than previously believed. Zirconia ceramics have been employed in orthopaedic surgery for approximately 30 years and were recently introduced into dentistry as a metal replacement for crown and bridge work as well as implant abutments. Zirconium dioxide has been shown in both in vitro and in vivo studies to have desirable osseointegrative properties. This clinical study shows that dental implants made from zirconia are a feasible alternative to titanium dental implants. In addition to excellent cosmetic results, zirconia implants allow a degree of osseointegration and soft tissue response that is superior to titanium dental implants.
|
16
|
BABELOMICS: a systems biology perspective in the functional annotation of genome-scale experiments. Nucleic Acids Res 2006; 34:W472-6. [PMID: 16845052 PMCID: PMC1538844 DOI: 10.1093/nar/gkl172] [Citation(s) in RCA: 216] [Impact Index Per Article: 12.0] [Indexed: 11/26/2022]
Abstract
We present a new version of Babelomics, a complete suite of web tools for functional analysis of genome-scale experiments, with new and improved tools. New functionally relevant terms have been included such as CisRed motifs or bioentities obtained by text-mining procedures. An improved indexing has considerably speeded up several of the modules. An improved version of the FatiScan method for studying the coordinate behaviour of groups of functionally related genes is presented, along with a similar tool, the Gene Set Enrichment Analysis. Babelomics is now more oriented to test systems biology inspired hypotheses. Babelomics can be found at .
|
17
|
|
18
|
Abstract
BACKGROUND The goal of the first BioCreAtIvE challenge (Critical Assessment of Information Extraction in Biology) was to provide a set of common evaluation tasks to assess the state of the art for text mining applied to biological problems. The results were presented in a workshop held in Granada, Spain March 28-31, 2004. The articles collected in this BMC Bioinformatics supplement entitled "A critical assessment of text mining methods in molecular biology" describe the BioCreAtIvE tasks, systems, results and their independent evaluation. RESULTS BioCreAtIvE focused on two tasks. The first dealt with extraction of gene or protein names from text, and their mapping into standardized gene identifiers for three model organism databases (fly, mouse, yeast). The second task addressed issues of functional annotation, requiring systems to identify specific text passages that supported Gene Ontology annotations for specific proteins, given full text articles. CONCLUSION The first BioCreAtIvE assessment achieved a high level of international participation (27 groups from 10 countries). The assessment provided state-of-the-art performance results for a basic task (gene name finding and normalization), where the best systems achieved a balanced 80% precision / recall or better, which potentially makes them suitable for real applications in biology. The results for the advanced task (functional annotation from free text) were significantly lower, demonstrating the current limitations of text-mining approaches where knowledge extrapolation and interpretation are required. In addition, an important contribution of BioCreAtIvE has been the creation and release of training and test data sets for both tasks. There are 22 articles in this special issue, including six that provide analyses of results or data quality for the data sets, including a novel inter-annotator consistency assessment for the test set used in task 2.
|
19
|
Abstract
Background Molecular biology has accumulated substantial amounts of data concerning functions of genes and proteins. Information relating to functional descriptions is generally extracted manually from textual data and stored in biological databases to build up annotations for large collections of gene products. Those annotation databases are crucial for the interpretation of large scale analysis approaches using bioinformatics or experimental techniques. Due to the growing accumulation of functional descriptions in biomedical literature, the need for text mining tools to facilitate the extraction of such annotations is urgent. In order to make text mining tools usable in real world scenarios, for instance to assist database curators during annotation of protein function, comparisons and evaluations of different approaches on full text articles are needed. Results The Critical Assessment for Information Extraction in Biology (BioCreAtIvE) contest is a community-wide competition aiming to evaluate different strategies for text mining tools, as applied to biomedical literature. We report on task 2, which addressed the automatic extraction and assignment of Gene Ontology (GO) annotations of human proteins, using full text articles. The predictions of task 2 are based on triplets of protein – GO term – article passage. The annotation-relevant text passages were returned by the participants and evaluated by expert curators of the GO annotation (GOA) team at the European Bioinformatics Institute (EBI). Each participant could submit up to three results for each sub-task comprising task 2. In total, more than 15,000 individual results were provided by the participants. In addition to the annotation itself, the curators evaluated whether the protein and the GO term were correctly predicted and traceable through the submitted text fragment.
Conclusion Concepts provided by GO are currently the most extended set of terms used for annotating gene products, thus they were explored to assess how effectively text mining tools are able to extract those annotations automatically. Although the obtained results are promising, they are still far from reaching the performance demanded by real world applications. Among the principal difficulties encountered in addressing the proposed task were the complex nature of the GO terms and protein names (the large range of variants which are used to express proteins and especially GO terms in free text), and the lack of a standard training set. A range of very different strategies were used to tackle this task. The dataset generated in line with the BioCreative challenge is publicly available and will allow new possibilities for training information extraction methods in the domain of molecular biology.
|
20
|
Abstract
The complexity of the information stored in databases and publications on metabolic and signaling pathways, the high throughput of experimental data, and the growing number of publications make it imperative to provide systems to help the researcher navigate through these interrelated information resources. Text-mining methods have started to play a key role in the creation and maintenance of links between the information stored in biological databases and its original sources in the literature. These links will be extremely useful for database updating and curation, especially if a number of technical problems can be solved satisfactorily, including the identification of protein and gene names (entities in general) and the characterization of their types of interactions. The first generation of openly accessible text-mining systems, such as iHOP (Information Hyperlinked over Proteins), provides additional functions to facilitate the reconstruction of protein interaction networks, combine database and text information, and support the scientist in the formulation of novel hypotheses. The next challenge is the generation of comprehensive information regarding the general function of signaling pathways and protein interaction networks.
|
21
|
The BioLink SIG Workshop at ISMB2004. Comp Funct Genomics 2005; 6:58-60. [PMID: 18629301 PMCID: PMC2448605 DOI: 10.1002/cfg.455] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2004] [Accepted: 12/21/2004] [Indexed: 11/10/2022]
|
22
|
Abstract
MOTIVATION Genome-wide functional annotation either by manual or automatic means has raised considerable concerns regarding the accuracy of assignments and the reproducibility of methodologies. In addition, a performance evaluation of automated systems that attempt to tackle sequence analyses rapidly and reproducibly is generally missing. In order to quantify the accuracy and reproducibility of function assignments on a genome-wide scale, we have re-annotated the entire genome sequence of Chlamydia trachomatis (serovar D), in a collaborative manner. RESULTS We have encoded all annotations in a structured format to allow further comparison and data exchange and have used a scale that records the different levels of potential annotation errors according to their propensity to propagate in the database due to transitive function assignments. We conclude that genome annotation may entail a considerable number of errors, ranging from simple typographical errors to complex sequence analysis problems. The most surprising result of this comparative study is that automatic systems might perform as well as the teams of experts annotating genome sequences.
|
23
|
Bioinformatics methods for the analysis of expression arrays: data clustering and information extraction. J Biotechnol 2002; 98:269-83. [PMID: 12141992 DOI: 10.1016/s0168-1656(02)00137-2] [Citation(s) in RCA: 37] [Impact Index Per Article: 1.7] [Indexed: 01/21/2023]
Abstract
Expression arrays facilitate the monitoring of changes in the expression patterns of large collections of genes. The analysis of expression array data has become a computationally-intensive task that requires the development of bioinformatics technology for a number of key stages in the process, such as image analysis, database storage, gene clustering and information extraction. Here, we review the current trends in each of these areas, with particular emphasis on the development of the related technology being carried out within our groups.
|
24
|
The potential use of SUISEKI as a protein interaction discovery tool. Genome Informatics. International Conference on Genome Informatics 2002; 12:123-34. [PMID: 11791231] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2023]
Abstract
Relevant information about protein interactions is stored in textual sources. These sources are commonly used not only as archives of what is already known but also as information for generating new knowledge, particularly to pose hypotheses about new possible interactions that can be inferred from the existing ones. This task is the more creative part of scientific work in experimental systems. We present a large-scale analysis for the prediction of new interactions based on the interaction network for the ones already known and detected automatically in the literature. During the last few years it has become clear that part of the information about protein interactions could be extracted with automatic tools, even if these tools are still far from perfect and key problems such as detection of protein names are not completely solved. We have developed an integrated automatic approach, called SUISEKI (System for Information Extraction on Interactions), able to extract protein interactions from collections of Medline abstracts. Previous experiments with the system have shown that it is able to extract almost 70% of the interactions present in a relatively large text corpus, with an accuracy of approximately 80% (for the best defined interactions), which makes the system usable in real scenarios, both at the level of extraction of protein names and at the level of extracting interactions between them. With the analysis of the interaction map of Saccharomyces cerevisiae we show that interactions published in the years 2000/2001 frequently correspond to proteins or genes that were already very close in the interaction network deduced from the literature published before these years and that they are often connected to the same proteins. That is, discoveries are commonly made among highly connected entities.
Some biologically relevant examples illustrate how interactions described in the year 2000 could have been proposed as reasonable working hypotheses with the information previously available in the automatically extracted network of interactions.
|
25
|
Abstract
Information extraction has become a very active field in bioinformatics recently and a number of interesting papers have been published. Most of the efforts have been concentrated on a few specific problems, such as the detection of protein-protein interactions and the analysis of DNA expression arrays, although it is obvious that there are many other interesting areas of potential application (document retrieval, protein functional description, and detection of disease-related genes to name a few). Paradoxically, these exciting developments have not yet crystallised into general agreement on a set of standard evaluation criteria, such as the ones developed in fields such as protein structure prediction, which makes it very difficult to compare performance across these different systems. In this review we introduce the general field of information extraction, we outline the status of the applications in molecular biology, and we then discuss some ideas about possible standards for evaluation that are needed for the future development of the field.
|
26
|
Expression profiles and biological function. Genome Informatics. Workshop on Genome Informatics 2002; 11:106-17. [PMID: 11700592] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/22/2023]
Abstract
Expression arrays facilitate the monitoring of changes in expression patterns of large collections of genes. It is generally expected that genes with similar expression patterns would correspond to proteins of common biological function. We assess this common assumption by comparing levels of similarity of expression patterns and statistical significance of biological terms that describe the corresponding protein functions. Terms are automatically obtained by mining large collections of Medline abstracts. We propose that the combined use of tools for expression profile clustering and automatic function retrieval can be useful for the detection of biologically relevant associations between genes in complex gene expression experiments. The results obtained using publicly available experimental data show how, in general, an increase in the similarity of the expression patterns is accompanied by an enhancement of the amount of specific functional information or, in other words, how the selected terms become more specific following an increase in the specificity of the expression patterns. Particularly interesting are the discrepancies from this general trend, i.e. groups of genes with similar expression patterns but very little in common at the functional level. In these cases the similarity of their expression profiles becomes the first link between previously unrelated genes.
|
27
|
Automatic ontology construction from the literature. GENOME INFORMATICS. INTERNATIONAL CONFERENCE ON GENOME INFORMATICS 2002; 13:201-13. [PMID: 14571389] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/27/2023]
Abstract
Detailed classifications, controlled vocabularies and organised terminology are widely used in different areas of science and technology. Their relatively recent introduction in molecular biology has been crucial for progress in the analysis of genomics and massive proteomics experiments. Unfortunately, the construction of ontologies, including terminology, classification and entity relations, requires considerable effort, including the analysis of massive amounts of literature. We propose here a method that automatically generates classifications of gene-product functions using bibliographic information. The corresponding classification structures mirror the ones constructed by human experts. The analysis of a large structure built for yeast gene-products, and the detailed inspection of various examples, show encouraging properties. In particular, the comparison with the well-accepted GO ontology points to different situations in which the automatically derived classification can be useful for assisting human experts in the annotation of ontologies.
|
28
|
Mining functional information associated with expression arrays. Funct Integr Genomics 2001; 1:256-68. [PMID: 11793245 DOI: 10.1007/s101420000036] [Citation(s) in RCA: 51] [Impact Index Per Article: 2.2] [Received: 07/13/2000] [Accepted: 11/24/2000] [Indexed: 10/27/2022]
Abstract
Deciphering the networks of interactions between molecules in biological systems has gained momentum with the monitoring of gene expression patterns at the genomic scale. Expression array experiments provide vast amounts of experimental data about these networks, the analysis of which requires new computational methods. In particular, issues related to the extraction of biological information are key for the end users. We propose here a strategy, implemented in a system called GEISHA (gene expression information system for human analysis), that detects biological terms significantly associated with different gene expression clusters by mining collections of Medline abstracts. GEISHA is based on a comparison of the frequency of abstracts linked to different gene clusters and containing a given term. Interpretation by the end user of the biological meaning of the terms is facilitated by embedding them in the corresponding significant sentences and abstracts and by establishing relations with other, equally significant terms. The information provided by GEISHA for the available yeast expression data compares favorably with the functional annotations provided by human experts, demonstrating the potential value of GEISHA as an assistant for the analysis of expression array experiments.
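The frequency comparison described in this abstract can be illustrated with a minimal Python sketch. It is not the GEISHA implementation: the abstract does not specify the statistic, so the z-score across clusters, the function names `term_frequency` and `term_scores`, and the toy data are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def term_frequency(abstracts, term):
    """Fraction of abstracts (plain strings) that mention the term, ignoring case."""
    if not abstracts:
        return 0.0
    term = term.lower()
    return sum(term in a.lower() for a in abstracts) / len(abstracts)


def term_scores(clusters, term):
    """Z-score of each cluster's term frequency relative to all clusters.

    `clusters` maps a cluster name to the abstracts linked to its genes.
    The z-score is an assumed stand-in for the significance measure.
    """
    freqs = {name: term_frequency(docs, term) for name, docs in clusters.items()}
    mu = mean(freqs.values())
    sigma = pstdev(freqs.values())
    if sigma == 0:
        # The term is equally frequent everywhere, so it separates no cluster.
        return {name: 0.0 for name in freqs}
    return {name: (f - mu) / sigma for name, f in freqs.items()}


# Toy data (hypothetical): three gene clusters with two abstracts each.
clusters = {
    "A": ["ribosome biogenesis in yeast", "the ribosome subunit"],
    "B": ["cell cycle arrest", "heat shock response"],
    "C": ["DNA repair", "ribosome stalling"],
}
scores = term_scores(clusters, "ribosome")
```

In this toy case the term "ribosome" scores highest for cluster A, where every linked abstract mentions it, and lowest for cluster B, where none does.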
|
29
|
Automatic extraction of biological information from scientific text: protein-protein interactions. PROCEEDINGS. INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS FOR MOLECULAR BIOLOGY 2000:60-7. [PMID: 10786287] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/16/2023]
Abstract
We describe the basic design of a system for automatic detection of protein-protein interactions extracted from scientific abstracts. By restricting the problem domain and imposing a number of strong assumptions which include pre-specified protein names and a limited set of verbs that represent actions, we show that it is possible to perform accurate information extraction. The performance of the system is evaluated with different cases of real-world interaction networks, including the Drosophila cell cycle control. The results obtained computationally are in good agreement with current biological knowledge and demonstrate the feasibility of developing a fully automated system able to describe networks of protein interactions with sufficient accuracy.
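The restricted setting described in this abstract (pre-specified protein names plus a limited set of action verbs) can be sketched in a few lines of Python. This is only an illustration under strong assumptions, not the system from the paper: the protein list, the verb list, the function name `extract_interactions`, and the simple "protein verb protein" pattern are all hypothetical.

```python
import re
from itertools import permutations

# Hypothetical inputs: the abstract only states that names and verbs are pre-specified.
PROTEINS = ["cdc2", "cyclin B", "string"]
VERBS = ["binds", "activates", "phosphorylates", "interacts with"]


def extract_interactions(sentence, proteins=PROTEINS, verbs=VERBS):
    """Return (protein_a, verb, protein_b) triples matched in one sentence.

    Only the adjacent pattern "<protein_a> <verb> <protein_b>" is recognised;
    negation, passive voice and long-range constructions are ignored.
    """
    found = []
    for a, b in permutations(proteins, 2):
        for verb in verbs:
            pattern = rf"\b{re.escape(a)}\b\s+{re.escape(verb)}\s+\b{re.escape(b)}\b"
            if re.search(pattern, sentence, flags=re.IGNORECASE):
                found.append((a, verb, b))
    return found


triples = extract_interactions("We show that Cdc2 phosphorylates cyclin B in vivo.")
```

Restricting the vocabulary in this way trades recall for precision: sentences that phrase an interaction differently are simply missed, which is consistent with the abstract's point that strong assumptions make accurate extraction feasible.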
|