1
Tanemura N, Sasaki T, Miyamoto R, Watanabe J, Araki M, Sato J, Chiba T. Extracting the latent needs of dementia patients and caregivers from transcribed interviews in Japanese: an initial assessment of the availability of morpheme selection as input data with Z-scores in machine learning. BMC Med Inform Decis Mak 2023; 23:203. PMID: 37798639; PMCID: PMC10557300; DOI: 10.1186/s12911-023-02303-3.
Abstract
BACKGROUND Given the increasing number of dementia patients worldwide, a new method was developed for machine learning models to identify the 'latent needs' of patients and caregivers, with the aim of facilitating patient/public involvement in societal decision making. METHODS Japanese transcribed interviews with 53 dementia patients and caregivers were used. A new morpheme selection method using Z-scores was developed to identify trends in describing the latent needs. F-measures with and without the new method were compared across three machine learning models. RESULTS F-measures were higher with the new method for the support vector machine (SVM) (0.81 with vs. 0.79 without, for patients) and for Naive Bayes (0.69 vs. 0.67 for caregivers; 0.75 vs. 0.73 for patients). CONCLUSION A new scheme based on Z-score adaptation for machine learning models was developed to predict the latent needs of dementia patients and their caregivers from interviews in Japanese. However, this study alone cannot establish the significance of the new method because the sample dataset was not large enough. Such Z-score-based pre-selection of text data for machine learning models should be refined with more suitable methods in the near future.
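The Z-score morpheme pre-selection described in this abstract can be sketched roughly as follows. The paper's exact formula is not given here, so the standardization below (each morpheme's corpus frequency against the mean frequency) and all names are illustrative assumptions:

```python
import math

def zscore_select(term_counts, threshold=1.96):
    """Select morphemes whose corpus frequency deviates markedly from
    the mean -- an illustrative stand-in for the paper's Z-score
    pre-selection, not its published implementation."""
    counts = list(term_counts.values())
    mean = sum(counts) / len(counts)
    var = sum((c - mean) ** 2 for c in counts) / len(counts)
    std = math.sqrt(var) or 1.0  # guard against a zero-variance corpus
    return {term for term, count in term_counts.items()
            if abs(count - mean) / std >= threshold}
```

Morphemes passing the threshold would then serve as input features for a downstream classifier such as an SVM or Naive Bayes model.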
Affiliation(s)
- Nanae Tanemura
- National Institute of Health and Nutrition, National Institutes of Biomedical Innovation, Health and Nutrition, 3-17 Senriokashinmachi, Settsu, Osaka, 566-0002, Japan.
- Tsuyoshi Sasaki
- Department of Child Psychiatry and Psychiatry, Chiba University Hospital, Chiba, Japan
- Jin Watanabe
- Kimura Information Technology Co., Ltd, Saga, Japan
- Michihiro Araki
- National Institute of Health and Nutrition, National Institutes of Biomedical Innovation, Health and Nutrition, 3-17 Senriokashinmachi, Settsu, Osaka, 566-0002, Japan
- Junko Sato
- Office of International Programs, Pharmaceuticals and Medical Devices Agency, Tokyo, Japan
- Tsuyoshi Chiba
- National Institute of Health and Nutrition, National Institutes of Biomedical Innovation, Health and Nutrition, 3-17 Senriokashinmachi, Settsu, Osaka, 566-0002, Japan
2
Oliveira Dos Santos Á, Sergio da Silva E, Machado Couto L, Valadares Labanca Reis G, Silva Belo V. The use of artificial intelligence for automating or semi-automating biomedical literature analyses: a scoping review. J Biomed Inform 2023; 142:104389. PMID: 37187321; DOI: 10.1016/j.jbi.2023.104389.
Abstract
OBJECTIVE Evidence-based medicine (EBM) is a decision-making process based on the conscious and judicious use of the best available scientific evidence. However, the exponential increase in the amount of information currently available likely exceeds the capacity of human-only analysis. In this context, artificial intelligence (AI) and its branches such as machine learning (ML) can be used to facilitate human efforts in analyzing the literature to foster EBM. The present scoping review aimed to examine the use of AI in the automation of biomedical literature survey and analysis with a view to establishing the state-of-the-art and identifying knowledge gaps. MATERIALS AND METHODS Comprehensive searches of the main databases were performed for articles published up to June 2022 and studies were selected according to inclusion and exclusion criteria. Data were extracted from the included articles and the findings categorized. RESULTS The total number of records retrieved from the databases was 12,145, of which 273 were included in the review. Classification of the studies according to the use of AI in evaluating the biomedical literature revealed three main application groups, namely assembly of scientific evidence (n=127; 47%), mining the biomedical literature (n=112; 41%) and quality analysis (n=34; 12%). Most studies addressed the preparation of systematic reviews, while articles focusing on the development of guidelines and evidence synthesis were the least frequent. The biggest knowledge gap was identified within the quality analysis group, particularly regarding methods and tools that assess the strength of recommendation and consistency of evidence. 
CONCLUSION Our review shows that, despite significant progress in the automation of biomedical literature surveys and analyses in recent years, intense research is needed to fill knowledge gaps on more difficult aspects of ML, deep learning and natural language processing, and to consolidate the use of automation by end-users (biomedical researchers and healthcare professionals).
Affiliation(s)
- Eduardo Sergio da Silva
- Federal University of São João del-Rei, Campus Centro-Oeste Dona Lindu, Divinópolis, Minas Gerais, Brazil
- Letícia Machado Couto
- Federal University of São João del-Rei, Campus Centro-Oeste Dona Lindu, Divinópolis, Minas Gerais, Brazil
- Vinícius Silva Belo
- Federal University of São João del-Rei, Campus Centro-Oeste Dona Lindu, Divinópolis, Minas Gerais, Brazil
3
Text representation model of scientific papers based on fusing multi-viewpoint information and its quality assessment. Scientometrics 2021. DOI: 10.1007/s11192-021-04028-4.
4
Zhao S, Su C, Lu Z, Wang F. Recent advances in biomedical literature mining. Brief Bioinform 2021; 22:bbaa057. PMID: 32422651; PMCID: PMC8138828; DOI: 10.1093/bib/bbaa057.
Abstract
Recent years have witnessed a rapid increase in the number of scientific articles in the biomedical domain. This literature is mostly available and readily accessible in electronic format. The domain knowledge hidden in it is critical for biomedical research and applications, which creates high demand for biomedical literature mining (BLM) techniques. Numerous efforts have been made on this topic by both the biomedical informatics (BMI) and computer science (CS) communities. The BMI community focuses more on concrete application problems and thus prefers more interpretable and descriptive methods, while the CS community pursues superior performance and generalization ability, developing more sophisticated and universal models. The goal of this paper is to review the recent advances in BLM from both communities and inspire new research directions.
Affiliation(s)
- Sendong Zhao
- Department of Healthcare Policy and Research, Weill Medical College of Cornell University, New York, NY 10065, USA
- Chang Su
- Division of Health Informatics, Department of Healthcare Policy and Research at Weill Cornell Medicine at Cornell University, New York, NY, USA
- Zhiyong Lu
- National Center for Biotechnology Information (NCBI) at National Library of Medicine, National Institute of Health, Bethesda, MD, USA
- Fei Wang
- Department of Healthcare Policy and Research, Weill Medical College of Cornell University, New York, NY 10065, USA
5
Mouriño-García MA, Pérez-Rodríguez R, Anido-Rifón LE. A Bag of Concepts Approach for Biomedical Document Classification Using Wikipedia Knowledge: Spanish-English Cross-language Case Study. Methods Inf Med 2017; 56:370-376. PMID: 28816337; DOI: 10.3414/me17-01-0028.
Abstract
OBJECTIVES The ability to efficiently review the existing literature is essential for the rapid progress of research. This paper describes a classifier of text documents, represented as vectors in spaces of Wikipedia concepts, and analyses its suitability for classification of Spanish biomedical documents when only English documents are available for training. We propose the cross-language concept matching (CLCM) technique, which relies on Wikipedia interlanguage links to convert concept vectors from the Spanish to the English space. METHODS The performance of the classifier is compared to several baselines: a classifier based on machine translation, a classifier that represents documents after performing Explicit Semantic Analysis (ESA), and a classifier that uses a domain-specific semantic annotator (MetaMap). The corpus used for the experiments (Cross-Language UVigoMED) was purpose-built for this study, and it is composed of 12,832 English and 2,184 Spanish MEDLINE abstracts. RESULTS The performance of our approach is superior to any other state-of-the-art classifier in the benchmark, with performance increases of up to 124% over classical machine translation, 332% over MetaMap, and 60 times over the classifier based on ESA. The results have statistical significance, showing p-values < 0.0001. CONCLUSION Using knowledge mined from Wikipedia to represent documents as vectors in a space of Wikipedia concepts and translating vectors between language-specific concept spaces, a cross-language classifier can be built, and it performs better than several state-of-the-art classifiers.
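The interlanguage-link step of CLCM can be sketched as follows. The function name and data shapes are assumptions for illustration, not the authors' implementation:

```python
def clcm_translate(es_vector, interlanguage_links):
    """Cross-language concept matching (CLCM) sketch: move a
    bag-of-concepts vector from the Spanish Wikipedia concept space
    into the English one via interlanguage links.  Concepts without a
    link are dropped; weights of concepts that map to the same English
    article are summed."""
    en_vector = {}
    for es_concept, weight in es_vector.items():
        en_concept = interlanguage_links.get(es_concept)
        if en_concept is not None:
            en_vector[en_concept] = en_vector.get(en_concept, 0.0) + weight
    return en_vector
```

The translated vector can then be fed to a classifier trained on English-space concept vectors, which is the core idea behind training on English documents only.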
6
Yan S, Wong KC. Elucidating high-dimensional cancer hallmark annotation via enriched ontology. J Biomed Inform 2017; 73:84-94. PMID: 28723579; DOI: 10.1016/j.jbi.2017.07.011.
Abstract
MOTIVATION Cancer hallmark annotation is a promising technique that could discover novel knowledge about cancer from the biomedical literature. The automated annotation of cancer hallmarks could reveal relevant cancer transformation processes in the literature or extract the articles that correspond to the cancer hallmark of interest. It acts as a complementary approach that can retrieve knowledge from massive text information, advancing numerous focused studies in cancer research. Nonetheless, the high-dimensional nature of cancer hallmark annotation imposes a unique challenge. RESULTS To address the curse of dimensionality, we compared multiple cancer hallmark annotation methods on 1580 PubMed abstracts. Based on the insights, a novel approach, UDT-RF, which makes use of ontological features is proposed. It expands the feature space via the Medical Subject Headings (MeSH) ontology graph and utilizes novel feature selections for elucidating the high-dimensional cancer hallmark annotation space. To demonstrate its effectiveness, state-of-the-art methods are compared and evaluated by a multitude of performance metrics, revealing the full performance spectrum on the full set of cancer hallmarks. Several case studies are conducted, demonstrating how the proposed approach could reveal novel insights into cancers. AVAILABILITY https://github.com/cskyan/chmannot.
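The ontology-graph feature expansion that the abstract describes can be sketched as follows. The parent map and function are hypothetical simplifications for illustration, not the released chmannot code:

```python
def expand_with_ancestors(doc_terms, parents):
    """Expand a document's MeSH terms with every ancestor reachable in
    the ontology graph -- the kind of feature-space expansion the
    abstract describes.  `parents` maps a term to its parent terms and
    is a hypothetical, simplified slice of the MeSH hierarchy."""
    expanded = set(doc_terms)
    frontier = list(doc_terms)
    while frontier:
        term = frontier.pop()
        for parent in parents.get(term, ()):
            if parent not in expanded:
                expanded.add(parent)
                frontier.append(parent)
    return expanded
```

After expansion, documents annotated with specific terms also share the more general ancestor features, which lets related documents overlap in the enlarged feature space.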
Affiliation(s)
- Shankai Yan
- Department of Computer Science, City University of Hong Kong, Hong Kong Special Administrative Region
- Ka-Chun Wong
- Department of Computer Science, City University of Hong Kong, Hong Kong Special Administrative Region
7
Krallinger M, Rabal O, Lourenço A, Oyarzabal J, Valencia A. Information Retrieval and Text Mining Technologies for Chemistry. Chem Rev 2017; 117:7673-7761. PMID: 28475312; DOI: 10.1021/acs.chemrev.6b00851.
Abstract
Efficient access to chemical information contained in scientific literature, patents, technical reports, or the web is a pressing need shared by researchers and patent attorneys from different chemical disciplines. Retrieval of important chemical information in most cases starts with finding relevant documents for a particular chemical compound or family. Targeted retrieval of chemical documents is closely connected to the automatic recognition of chemical entities in the text, which commonly involves the extraction of the entire list of chemicals mentioned in a document, including any associated information. In this Review, we provide a comprehensive and in-depth description of fundamental concepts, technical implementations, and current technologies for meeting these information demands. A strong focus is placed on community challenges assessing system performance, particularly the CHEMDNER and CHEMDNER-patents tasks of BioCreative IV and V, respectively. Considering the growing interest in the construction of automatically annotated chemical knowledge bases that integrate chemical information and biological data, cheminformatics approaches for mapping the extracted chemical names into chemical structures and their subsequent annotation, together with text mining applications for linking chemistry with biological information, are also presented. Finally, future trends and current challenges are highlighted as a roadmap proposal for research in this emerging field.
Affiliation(s)
- Martin Krallinger
- Structural Computational Biology Group, Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre, C/Melchor Fernández Almagro 3, Madrid E-28029, Spain
- Obdulia Rabal
- Small Molecule Discovery Platform, Molecular Therapeutics Program, Center for Applied Medical Research (CIMA), University of Navarra, Avenida Pio XII 55, Pamplona E-31008, Spain
- Anália Lourenço
- ESEI - Department of Computer Science, University of Vigo, Edificio Politécnico, Campus Universitario As Lagoas s/n, Ourense E-32004, Spain; Centro de Investigaciones Biomédicas (Centro Singular de Investigación de Galicia), Campus Universitario Lagoas-Marcosende, Vigo E-36310, Spain; CEB-Centre of Biological Engineering, University of Minho, Campus de Gualtar, Braga 4710-057, Portugal
- Julen Oyarzabal
- Small Molecule Discovery Platform, Molecular Therapeutics Program, Center for Applied Medical Research (CIMA), University of Navarra, Avenida Pio XII 55, Pamplona E-31008, Spain
- Alfonso Valencia
- Life Science Department, Barcelona Supercomputing Centre (BSC-CNS), C/Jordi Girona, 29-31, Barcelona E-08034, Spain; Joint BSC-IRB-CRG Program in Computational Biology, Parc Científic de Barcelona, C/ Baldiri Reixac 10, Barcelona E-08028, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Passeig de Lluís Companys 23, Barcelona E-08010, Spain
8
9
Garcia-Gathright JI, Oh A, Abarca PA, Han M, Sago W, Spiegel ML, Wolf B, Garon EB, Bui AAT, Aberle DR. Representing and extracting lung cancer study metadata: study objective and study design. Comput Biol Med 2015; 58:63-72. PMID: 25618216; DOI: 10.1016/j.compbiomed.2015.01.004.
Abstract
This paper describes the information retrieval step in Casama (Contextualized Semantic Maps), a project that summarizes and contextualizes current research papers on driver mutations in non-small cell lung cancer. Casama's representation of lung cancer studies aims to capture elements that will assist an end-user in retrieving studies and, importantly, judging their strength. This paper focuses on two types of study metadata: study objective and study design. 430 abstracts on EGFR and ALK mutations in lung cancer were annotated manually. Casama's support vector machine (SVM) automatically classified the abstracts by study objective with as much as 129% higher F-scores compared to PubMed's built-in filters. A second SVM classified the abstracts by epidemiological study design, suggesting strength of evidence at a more granular level than in previous work. The classification results and the top features determined by the classifiers suggest that this scheme would be generalizable to other mutations in lung cancer, as well as studies on driver mutations in other cancer domains.
Affiliation(s)
- Jean I Garcia-Gathright
- Department of Bioengineering, University of California, 924 Westwood Boulevard, Suite 420, Los Angeles, CA 90024, USA.
- Andrea Oh
- Department of Radiological Sciences, University of California, 924 Westwood Boulevard, Suite 420, Los Angeles, CA 90024, USA
- Phillip A Abarca
- Department of Medicine - Division of Hematology-Oncology, University of California, 924 Westwood Boulevard, Suite 200, Los Angeles, CA 90024, USA
- Mary Han
- Department of Medicine - Division of Hematology-Oncology, University of California, 924 Westwood Boulevard, Suite 200, Los Angeles, CA 90024, USA
- William Sago
- Department of Medicine - Division of Hematology-Oncology, University of California, 924 Westwood Boulevard, Suite 200, Los Angeles, CA 90024, USA
- Marshall L Spiegel
- Department of Medicine - Division of Hematology-Oncology, University of California, 924 Westwood Boulevard, Suite 200, Los Angeles, CA 90024, USA
- Brian Wolf
- Department of Medicine - Division of Hematology-Oncology, University of California, 924 Westwood Boulevard, Suite 200, Los Angeles, CA 90024, USA
- Edward B Garon
- Department of Medicine - Division of Hematology-Oncology, University of California, 924 Westwood Boulevard, Suite 200, Los Angeles, CA 90024, USA
- Alex A T Bui
- Department of Bioengineering, University of California, 924 Westwood Boulevard, Suite 420, Los Angeles, CA 90024, USA; Department of Radiological Sciences, University of California, 924 Westwood Boulevard, Suite 420, Los Angeles, CA 90024, USA
- Denise R Aberle
- Department of Bioengineering, University of California, 924 Westwood Boulevard, Suite 420, Los Angeles, CA 90024, USA; Department of Radiological Sciences, University of California, 924 Westwood Boulevard, Suite 420, Los Angeles, CA 90024, USA
10
Abstract
BACKGROUND Keeping up-to-date with bioscience literature is becoming increasingly challenging. Several recent methods help meet this challenge by allowing literature search to be launched based on lists of abstracts that the user judges to be 'interesting'. Some methods go further by allowing the user to provide a second input set of 'uninteresting' abstracts; these two input sets are then used to search and rank literature by relevance. In this work we present the service 'Caipirini' (http://caipirini.org) that also allows two input sets, but takes the novel approach of allowing ranking of literature based on one or more sets of genes. RESULTS To evaluate the usefulness of Caipirini, we used two test cases, one related to the human cell cycle, and a second related to disease defense mechanisms in Arabidopsis thaliana. In both cases, the new method achieved high precision in finding literature related to the biological mechanisms underlying the input data sets. CONCLUSIONS To our knowledge Caipirini is the first service enabling literature search directly based on biological relevance to gene sets; thus, Caipirini gives the research community a new way to unlock hidden knowledge from gene sets derived via high-throughput experiments.
11
Natural Language Processing, Electronic Health Records, and Clinical Research. Health Informatics 2012. DOI: 10.1007/978-1-84882-448-5_16.
12
Alex B, Grover C, Haddow B, Kabadjov M, Klein E, Matthews M, Tobin R, Wang X. Automating curation using a natural language processing pipeline. Genome Biol 2008; 9 Suppl 2:S10. PMID: 18834488; PMCID: PMC2559981; DOI: 10.1186/gb-2008-9-s2-s10.
Abstract
Background: The tasks in BioCreative II were designed to approximate some of the laborious work involved in curating biomedical research papers. The approach to these tasks taken by the University of Edinburgh team was to adapt and extend the existing natural language processing (NLP) system that we have developed as part of a commercial curation assistant. Although this paper concentrates on using NLP to assist with curation, the system can equally be employed to extract types of information from the literature that are immediately relevant to biologists in general. Results: Our system was among the highest performing on the interaction subtasks, and competitive performance on the gene mention task was achieved with minimal development effort. For the gene normalization task, a string matching technique that can be quickly applied to new domains was shown to perform close to average. Conclusion: The technologies being developed were shown to be readily adapted to the BioCreative II tasks. Although high performance may be obtained on individual tasks such as gene mention recognition and normalization, and document classification, tasks in which a number of components must be combined, such as detection and normalization of interacting protein pairs, are still challenging for NLP systems.
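A string-matching gene normalization of the quickly adaptable kind the abstract mentions might look like this minimal sketch; the lexicon, its identifiers, and the normalization rule are invented for illustration:

```python
import re

def normalize_gene(mention, lexicon):
    """Normalize a gene mention to an identifier by relaxed string
    matching: lowercase the mention, strip non-alphanumerics, and look
    it up in a synonym lexicon mapping normalized names to gene IDs.
    Returns None when no synonym matches."""
    key = re.sub(r"[^a-z0-9]", "", mention.lower())
    return lexicon.get(key)
```

Because the matching logic lives entirely in the normalization rule and the lexicon, adapting such a technique to a new domain mostly means swapping in a new synonym list.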
Affiliation(s)
- Beatrice Alex
- School of Informatics, University of Edinburgh, Edinburgh, UK.
13
Yu W, Clyne M, Dolan SM, Yesupriya A, Wulf A, Liu T, Khoury MJ, Gwinn M. GAPscreener: an automatic tool for screening human genetic association literature in PubMed using the support vector machine technique. BMC Bioinformatics 2008; 9:205. PMID: 18430222; PMCID: PMC2387176; DOI: 10.1186/1471-2105-9-205.
Abstract
BACKGROUND Synthesis of data from published human genetic association studies is a critical step in the translation of human genome discoveries into health applications. Although genetic association studies account for a substantial proportion of the abstracts in PubMed, identifying them with standard queries is not always accurate or efficient. Further automating the literature-screening process can reduce the burden of a labor-intensive and time-consuming traditional literature search. The Support Vector Machine (SVM), a well-established machine learning technique, has been successful in classifying text, including biomedical literature. The GAPscreener, a free SVM-based software tool, can be used to assist in screening PubMed abstracts for human genetic association studies. RESULTS The data source for this research was the HuGE Navigator, formerly known as the HuGE Pub Lit database. Weighted SVM feature selection based on a keyword list obtained by the two-way z score method demonstrated the best screening performance, achieving 97.5% recall, 98.3% specificity and 31.9% precision in performance testing. Compared with the traditional screening process based on a complex PubMed query, the SVM tool reduced by about 90% the number of abstracts requiring individual review by the database curator. The tool also ascertained 47 articles that were missed by the traditional literature screening process during the 4-week test period. We examined the literature on genetic associations with preterm birth as an example. Compared with the traditional, manual process, the GAPscreener both reduced effort and improved accuracy. CONCLUSION GAPscreener is the first free SVM-based application available for screening the human genetic association literature in PubMed with high recall and specificity. The user-friendly graphical user interface makes this a practical, stand-alone application. The software can be downloaded at no charge.
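The reported performance figures combine three standard screening metrics, which can be computed from a confusion matrix. This sketch is generic, not GAPscreener's code:

```python
def screening_metrics(tp, fp, tn, fn):
    """Recall, specificity and precision from a literature-screening
    confusion matrix -- the three figures reported for GAPscreener."""
    recall = tp / (tp + fn)         # fraction of relevant abstracts caught
    specificity = tn / (tn + fp)    # fraction of irrelevant abstracts rejected
    precision = tp / (tp + fp)      # fraction of flagged abstracts that are relevant
    return recall, specificity, precision
```

On a hypothetical, heavily imbalanced corpus, e.g. `screening_metrics(39, 83, 9000, 1)`, recall is 0.975 while precision is only about 0.32, which illustrates how high recall and specificity can coexist with modest precision when relevant abstracts are rare.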
Affiliation(s)
- Wei Yu
- National Office of Public Health Genomics, Coordinating Center for Health Promotion, Centers for Disease Control and Prevention, Atlanta, GA, USA.
14
Lin BK, Clyne M, Walsh M, Gomez O, Yu W, Gwinn M, Khoury MJ. Tracking the epidemiology of human genes in the literature: the HuGE Published Literature database. Am J Epidemiol 2006; 164:1-4. PMID: 16641305; DOI: 10.1093/aje/kwj175.
Abstract
Completion of the human genome sequence has inspired a new wave of epidemiologic studies on the prevalence of gene variants and their associations with diseases in human populations. In 2001, the Human Genome Epidemiology (HuGE) Network launched the HuGE Published Literature database (HuGE Pub Lit), a searchable, online knowledge base of published, population-based epidemiologic studies of human genes. The database contains links to PubMed articles and can be searched by gene, disease, interacting factor, type of study design or analysis, or any combination of terms in these categories. The search output contains a link to each identified article, along with a table summarizing key features of the reported study. As of September 6, 2005, some 17,665 articles were indexed in the database. Most described gene-disease associations (86%); fewer evaluated gene-gene or gene-environment interactions (17%), the prevalence of gene variants (10%), or genetic tests (3%). Although not comprehensive, this database is a unique tool for epidemiologic researchers and others concerned with the role of genetic variation in population health. Here, the authors provide an overview of the database and its characteristics and uses.
Affiliation(s)
- Bruce K Lin
- Office of Genomics and Disease Prevention, Coordinating Center for Health Promotion, Centers for Disease Control and Prevention, Atlanta, GA 30341, USA