1
Sourati J, Evans JA. Accelerating science with human-aware artificial intelligence. Nat Hum Behav 2023; 7:1682-1696. [PMID: 37443269 DOI: 10.1038/s41562-023-01648-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2022] [Accepted: 06/02/2023] [Indexed: 07/15/2023]
Abstract
Artificial intelligence (AI) models trained on published scientific findings have been used to invent valuable materials and targeted therapies, but they typically ignore the human scientists who continually alter the landscape of discovery. Here we show that incorporating the distribution of human expertise by training unsupervised models on simulated inferences that are cognitively accessible to experts dramatically improves (by up to 400%) AI prediction of future discoveries beyond models focused on research content alone, especially when relevant literature is sparse. These models succeed by predicting human predictions and the scientists who will make them. By tuning human-aware AI to avoid the crowd, we can generate scientifically promising 'alien' hypotheses unlikely to be imagined or pursued without intervention until the distant future, which hold promise to punctuate scientific advance beyond questions currently pursued. By accelerating human discovery or probing its blind spots, human-aware AI enables us to move towards and beyond the contemporary scientific frontier.
Affiliation(s)
- Jamshid Sourati
  - Department of Sociology, University of Chicago, Chicago, IL, USA
- James A Evans
  - Department of Sociology, University of Chicago, Chicago, IL, USA
  - Santa Fe Institute, Santa Fe, NM, USA
2
Cuffy C, McInnes BT. Exploring a deep learning neural architecture for closed Literature-based discovery. J Biomed Inform 2023; 143:104362. [PMID: 37146741 DOI: 10.1016/j.jbi.2023.104362] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/28/2022] [Revised: 03/15/2023] [Accepted: 04/06/2023] [Indexed: 05/07/2023]
Abstract
Scientific literature presents a wealth of information yet to be explored. As the number of researchers increases with each passing year and publications are released, this contributes to an era where specialized fields of research are becoming more prevalent. As this trend continues, this further propagates the separation of interdisciplinary publications and makes keeping up to date with literature a laborious task. Literature-based discovery (LBD) aims to mitigate these concerns by promoting information sharing among non-interacting literature while extracting potentially meaningful information. Furthermore, recent advances in neural network architectures and data representation techniques have fueled their respective research communities in achieving state-of-the-art performance in many downstream tasks. However, studies of neural network-based methods for LBD remain to be explored. We introduce and explore a deep learning neural network-based approach for LBD. Additionally, we investigate various approaches to represent terms as concepts and analyze the effect of feature scaling representations into our model. We compare the evaluation performance of our method on five hallmarks of cancer datasets utilized for closed discovery. Our results show the chosen representation as input into our model affects evaluation performance. We found feature scaling our input representations increases evaluation performance and decreases the necessary number of epochs needed to achieve model generalization. We also explore two approaches to represent model output. We found reducing the model's output to capturing a subset of concepts improved evaluation performance at the cost of model generalizability. We also compare the efficacy of our method on the five hallmarks of cancer datasets to a set of randomly chosen relations between concepts. We found these experiments confirm our method's suitability for LBD.
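The abstract above reports that feature scaling the input representations helped, but it does not say which scaling was used. As a minimal sketch only, assuming per-feature standardization (zero mean, unit variance) and a hypothetical helper name `standardize`, the idea could look like this:

```python
import numpy as np


def standardize(features, eps=1e-8):
    """Scale each column (feature) to zero mean and unit variance.

    Assumption: z-score scaling; the abstract does not name the method used.
    `eps` avoids division by zero for constant columns.
    """
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    return (features - mean) / (std + eps)


# Hypothetical toy input: three concept vectors with two features
# on very different scales.
X = np.array([[1.0, 100.0],
              [2.0, 200.0],
              [3.0, 300.0]])
Z = standardize(X)
```

After scaling, both columns of `Z` are on a comparable scale, which is the kind of preprocessing that often lets a neural network converge in fewer epochs.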
Affiliation(s)
- Clint Cuffy
  - Virginia Commonwealth University, 401 S. Main St., Richmond, VA 23284, USA
- Bridget T McInnes
  - Virginia Commonwealth University, 401 S. Main St., Richmond, VA 23284, USA
3
Fernandez ME, Nazar FN, Moine LB, Jaime CE, Kembro JM, Correa SG. Network Analysis of Inflammatory Bowel Disease Research: Towards the Interactome. J Crohns Colitis 2022; 16:1651-1662. [PMID: 35439301 DOI: 10.1093/ecco-jcc/jjac059] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 02/08/2023]
Abstract
BACKGROUND AND AIMS Modern views accept that inflammatory bowel diseases [IBD] emerge from complex interactions among the multiple components of a biological network known as the 'IBD interactome'. These diverse components belong to different functional levels including cells, molecules, genes and biological processes. This diversity can make it difficult to integrate available empirical information from human patients into a collective view of aetiopathogenesis, a necessary step to understand the interactome. Herein, we quantitatively analyse how the representativeness of components involved in human IBD and their relationships have changed over time. METHODS A bibliographic search in PubMed retrieved 25 971 abstracts of experimental studies on IBD in humans, published between 1990 and 2020. Abstracts were scanned automatically for 1218 IBD interactome components proposed in recent reviews. The resulting databases are freely available and were visualized as networks indicating the frequency at which different components are referenced together within each abstract. RESULTS As expected, over time there was an increase in components added to the IBD network and heightened connectivity within and across functional levels. However, certain components were consistently studied together, forming preserved motifs in the networks. These overrepresented and highly linked components reflect main 'hypotheses' in IBD research in humans. Interestingly, 82% of the components cited in reviews were absent or showed low frequency, suggesting that many aspects of the proposed IBD interactome still have weak experimental support in humans. CONCLUSIONS A reductionist and fragmented approach to the study of IBD has prevailed in previous decades, highlighting the importance of transitioning towards a more integrated interactome framework.
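The METHODS portion above describes scanning abstracts for known components and building networks from how often components are referenced together. A minimal sketch of that counting step, assuming naive case-insensitive substring matching and a hypothetical function name `cooccurrence_edges` (the study's actual matching rules are not given here), might be:

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(abstracts, components):
    """Count how often each pair of components appears in the same abstract."""
    edges = Counter()
    for text in abstracts:
        lowered = text.lower()
        # Assumption: plain substring matching; a real pipeline would need
        # synonym handling and word-boundary checks.
        present = sorted(c for c in components if c.lower() in lowered)
        for pair in combinations(present, 2):
            edges[pair] += 1
    return edges


# Hypothetical toy abstracts, not taken from the study's data.
abstracts = [
    "TNF and IL-6 levels were elevated in patients with active disease.",
    "NOD2 variants altered TNF signalling in macrophages.",
    "Serum TNF correlated with IL-6 in ulcerative colitis.",
]
components = ["TNF", "IL-6", "NOD2"]
edges = cooccurrence_edges(abstracts, components)
```

Each key in `edges` is an undirected pair and each value is an edge weight, so pairs that are consistently studied together show up as heavy edges, while components that never appear stay out of the network entirely.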
Affiliation(s)
- M Emilia Fernandez
  - Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Centro de Investigaciones en Bioquímica Clínica e Inmunología (CIBICI), Córdoba, Argentina
- F Nicolas Nazar
  - Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Instituto de Investigaciones Biológicas y Tecnológicas (IIByT), Córdoba, Argentina
  - Universidad Nacional de Córdoba, Facultad de Ciencias Exactas, Físicas y Naturales, Instituto de Ciencia y Tecnología de los Alimentos (ICTA), Córdoba, Argentina
- Luciana B Moine
  - Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Centro de Investigaciones en Bioquímica Clínica e Inmunología (CIBICI), Córdoba, Argentina
- Cristian E Jaime
  - Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Centro de Investigaciones en Bioquímica Clínica e Inmunología (CIBICI), Córdoba, Argentina
- Jackelyn M Kembro
  - Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Instituto de Investigaciones Biológicas y Tecnológicas (IIByT), Córdoba, Argentina
  - Universidad Nacional de Córdoba, Facultad de Ciencias Exactas, Físicas y Naturales, Instituto de Ciencia y Tecnología de los Alimentos (ICTA), Córdoba, Argentina
  - Universidad Nacional de Córdoba, Facultad de Ciencias Exactas, Físicas y Naturales, Cátedra de Química Biológica, Córdoba, Argentina
- Silvia G Correa
  - Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Centro de Investigaciones en Bioquímica Clínica e Inmunología (CIBICI), Córdoba, Argentina
  - Universidad Nacional de Córdoba, Facultad de Ciencias Químicas, Departamento de Bioquímica Clínica, Inmunología, Córdoba, Argentina
4
Lu MC, Hsu CW, Lo HC, Chang HH, Koo M. Association of Clinical Manifestations of Systemic Lupus Erythematosus and Complementary Therapy Use in Taiwanese Female Patients: A Cross-Sectional Study. Medicina (Kaunas, Lithuania) 2022; 58:medicina58070944. [PMID: 35888663 PMCID: PMC9317495 DOI: 10.3390/medicina58070944] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/07/2022] [Revised: 07/05/2022] [Accepted: 07/13/2022] [Indexed: 11/16/2022]
Abstract
Background and Objectives: Systemic lupus erythematosus (SLE) is a chronic systemic autoimmune disease that affects predominantly women in the childbearing years. Patients may seek complementary therapies to manage their health and to reduce symptoms. However, to our knowledge, no studies have explored the association between clinical manifestations of SLE and complementary therapies. Therefore, this study aimed to investigate the association of complementary therapies with common clinical manifestations in Taiwanese female patients with SLE. Materials and Methods: A cross-sectional study was conducted at a regional teaching hospital in southern Taiwan. Outpatients from the rheumatology clinic who met the inclusion criteria were consecutively recruited. Demographic data, clinical manifestations of SLE, and types of complementary therapy use were determined using a paper-based questionnaire. Multiple logistic regression analyses were conducted to investigate the use of complementary therapies associated with clinical manifestations of SLE. Results: Of the 317 female patients with SLE, 60.9% were 40 years or older. The five SLE clinical manifestations with the highest prevalence were Raynaud’s phenomenon (61.2%), photosensitivity (50.2%), Sjögren’s syndrome (28.4%), arthralgia and arthritis (22.1%), and renal involvement (14.5%). Multiple logistic regression analyses revealed that Raynaud’s phenomenon was significantly associated with fitness walking or strolling (adjusted odds ratio [aOR] 1.77; p = 0.027) and fish oil supplements (aOR 3.55; p < 0.001). Photosensitivity was significantly and inversely associated with the use of probiotics (aOR 0.49; p = 0.019). Renal involvement was significantly associated with the use of probiotics (aOR 2.43; p = 0.026) and visits to the Chinese medicine department in a hospital (aOR 3.14; p = 0.026). Conclusions: We found that different clinical manifestations of SLE were associated with the use of different complementary therapies. Health care providers should have up-to-date knowledge of common complementary therapies and be ready to provide evidence-based advice to patients with SLE.
Affiliation(s)
- Ming-Chi Lu
  - Division of Allergy, Immunology and Rheumatology, Dalin Tzu Chi Hospital, Buddhist Tzu Chi Medical Foundation, Dalin, Chiayi 622401, Taiwan
  - School of Medicine, Tzu Chi University, Hualien City 97004, Taiwan
- Chia-Wen Hsu
  - Department of Medical Research, Dalin Tzu Chi Hospital, Buddhist Tzu Chi Medical Foundation, Dalin, Chiayi 622401, Taiwan
- Hui-Chin Lo
  - Department of Medical Research, Dalin Tzu Chi Hospital, Buddhist Tzu Chi Medical Foundation, Dalin, Chiayi 622401, Taiwan
- Hsiu-Hua Chang
  - Department of Medical Research, Dalin Tzu Chi Hospital, Buddhist Tzu Chi Medical Foundation, Dalin, Chiayi 622401, Taiwan
- Malcolm Koo
  - Graduate Institute of Long-Term Care, Tzu Chi University of Science and Technology, Hualien City 970302, Taiwan
  - Dalla Lana School of Public Health, University of Toronto, Toronto, ON M5T 3M7, Canada
5
Sang S, Liu X, Chen X, Zhao D. A Scalable Embedding Based Neural Network Method for Discovering Knowledge From Biomedical Literature. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:1294-1301. [PMID: 32750871 DOI: 10.1109/tcbb.2020.3003947] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
Nowadays, the amount of biomedical literature is growing at an explosive speed, and much useful knowledge is yet undiscovered in the literature. Classical information retrieval techniques allow access to explicit information from a given collection of information, but are not able to recognize implicit connections. Literature-based discovery (LBD) is characterized by uncovering hidden associations in non-interacting literature. It could significantly support scientific research by identifying new connections between biomedical entities. However, most of the existing approaches to LBD are not scalable and may not be sufficient to detect complex associations in non-directly-connected literature. In this article, we present a model which incorporates a biomedical knowledge graph, graph embedding, and deep learning methods for literature-based discovery. First, the relations between biomedical entities are extracted from biomedical abstracts and then a knowledge graph is constructed by using these obtained relations. Second, graph embedding techniques are applied to convert the entities and relations in the knowledge graph into a low-dimensional vector space. Third, a bidirectional Long Short-Term Memory (BLSTM) network is trained based on the entity associations represented by the pre-trained graph embeddings. Finally, the learned model is used for open and closed literature-based discovery tasks. The experimental results show that our method could not only effectively discover hidden associations between entities, but also reveal the corresponding mechanism of interactions. It suggests that incorporating knowledge graph and deep learning methods is an effective way of capturing the underlying complex associations between entities hidden in the literature.
6
Phang CSJ, Vong WT, Sebastian Y, Raman V, Then PHH. Understanding the Usability of a Literature-Based Discovery System Among Clinical Researchers in Sarawak, Malaysia. International Journal of Technology and Human Interaction 2022. [DOI: 10.4018/ijthi.304092] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
The rapid increase in scientific publications makes it difficult for researchers to keep up with the latest literature and to explore new research directions. Literature-based discovery (LBD) systems aim to resolve this issue by bridging literatures from disparate fields to assist researchers in knowledge discovery and the formulation and testing of research hypotheses. Previous studies have focused mainly on evaluating the efficacy of LBD systems by replicating historical LBD events. The usability of LBD systems has been under-researched, which partly explains the low adoption of the systems. This paper presents a survey study that evaluates the usability of an LBD system for knowledge discovery and hypothesis refinement, and also investigates factors affecting its adoption among biomedical researchers in Sarawak, Malaysia. The findings suggest that the adoption of the LBD system is related to its perceived usefulness and the perceived difficulty of interacting with the user interface features of the system.
Affiliation(s)
- Wan-Tze Vong
  - Swinburne University of Technology, Sarawak, Malaysia
7
Henry S, Wijesinghe DS, Myers A, McInnes BT. Using Literature Based Discovery to Gain Insights Into the Metabolomic Processes of Cardiac Arrest. Front Res Metr Anal 2021; 6:644728. [PMID: 34250435 PMCID: PMC8267364 DOI: 10.3389/frma.2021.644728] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/21/2020] [Accepted: 05/07/2021] [Indexed: 12/19/2022]
Abstract
In this paper, we describe how we applied LBD techniques to discover lecithin cholesterol acyltransferase (LCAT) as a druggable target for cardiac arrest. We fully describe our process, which includes the use of high-throughput metabolomic analysis to identify metabolites significantly related to cardiac arrest, and how we used LBD to gain insights into how these metabolites relate to cardiac arrest. These insights led to our proposal (for the first time) of LCAT as a druggable target, the effects of which are supported by in vivo studies which were brought forth by this work. Metabolites are the end product of many biochemical pathways within the human body. Observed changes in metabolite levels are indicative of changes in these pathways, and provide valuable insights toward the cause, progression, and treatment of diseases. Following cardiac arrest, we observed changes in metabolite levels pre- and post-resuscitation. We used LBD to help discover diseases implicitly linked via these metabolites of interest. Results of LBD indicated a strong link between fish eye disease and cardiac arrest. Since fish eye disease is characterized by an LCAT deficiency, this prompted an investigation into the effects of LCAT on cardiac arrest survival. In the investigation, we found that decreased LCAT activity may increase cardiac arrest survival rates by increasing ω-3 polyunsaturated fatty acid availability in circulation. We verified the effects of ω-3 polyunsaturated fatty acids on increasing survival rate following cardiac arrest in vivo in rat models.
Affiliation(s)
- Sam Henry
  - Department of Physics, Computer Science and Engineering, Christopher Newport University, Newport News, VA, United States
- D. Shanaka Wijesinghe
  - Department of Pharmacotherapy and Outcomes Science, Virginia Commonwealth University, Richmond, VA, United States
- Aidan Myers
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
- Bridget T. McInnes
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
8
Zhao S, Su C, Lu Z, Wang F. Recent advances in biomedical literature mining. Brief Bioinform 2021; 22:bbaa057. [PMID: 32422651 PMCID: PMC8138828 DOI: 10.1093/bib/bbaa057] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 12/11/2019] [Revised: 03/22/2020] [Accepted: 03/25/2020] [Indexed: 01/26/2023]
Abstract
Recent years have witnessed a rapid increase in the number of scientific articles in the biomedical domain. This literature is mostly available and readily accessible in electronic format. The domain knowledge hidden in it is critical for biomedical research and applications, which puts biomedical literature mining (BLM) techniques in high demand. Numerous efforts have been made on this topic by both the biomedical informatics (BMI) and computer science (CS) communities. The BMI community focuses more on concrete application problems and thus prefers more interpretable and descriptive methods, while the CS community pursues superior performance and generalization ability, and thus develops more sophisticated and universal models. The goal of this paper is to provide a review of the recent advances in BLM from both communities and inspire new research directions.
Affiliation(s)
- Sendong Zhao
  - Department of Healthcare Policy and Research, Weill Medical College of Cornell University, New York, NY 10065, USA
- Chang Su
  - Division of Health Informatics, Department of Healthcare Policy and Research at Weill Cornell Medicine at Cornell University, New York, NY, USA
- Zhiyong Lu
  - National Center for Biotechnology Information (NCBI) at National Library of Medicine, National Institute of Health, Bethesda, MD, USA
- Fei Wang
  - Department of Healthcare Policy and Research, Weill Medical College of Cornell University, New York, NY 10065, USA
10
Crichton G, Baker S, Guo Y, Korhonen A. Neural networks for open and closed Literature-based Discovery. PLoS One 2020; 15:e0232891. [PMID: 32413059 PMCID: PMC7228051 DOI: 10.1371/journal.pone.0232891] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 11/02/2019] [Accepted: 04/23/2020] [Indexed: 12/18/2022]
Abstract
Literature-based Discovery (LBD) aims to discover new knowledge automatically from large collections of literature. Scientific literature is growing at an exponential rate, making it difficult for researchers to stay current in their discipline and easy to miss knowledge necessary to advance their research. LBD can facilitate hypothesis testing and generation and thus accelerate scientific progress. Neural networks have demonstrated improved performance on LBD-related tasks but are yet to be applied to it. We propose four graph-based, neural network methods to perform open and closed LBD. We compared our methods with those used by the state-of-the-art LION LBD system on the same evaluations to replicate recently published findings in cancer biology. We also applied them to a time-sliced dataset of human-curated peer-reviewed biological interactions. These evaluations and the metrics they employ represent performance on real-world knowledge advances and are thus robust indicators of approach efficacy. In the first experiments, our best methods performed 2-4 times better than the baselines in closed discovery and 2-3 times better in open discovery. In the second, our best methods performed almost 2 times better than the baselines in open discovery. These results are strong indications that neural LBD is potentially a very effective approach for generating new scientific discoveries from existing literature. The code for our models and other information can be found at: https://github.com/cambridgeltl/nn_for_LBD.
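The open and closed settings mentioned above are usually framed with Swanson's ABC model, which this abstract does not spell out: open discovery starts from a term A and looks for unlinked terms C reachable through intermediates B, while closed discovery starts from a given A and C and looks for the B terms that bridge them. A minimal graph-based sketch follows; it is not the neural method of the paper, the function names are hypothetical, and the toy edges are illustrative rather than extracted data.

```python
def build_graph(edges):
    """Build an undirected adjacency map from (term, term) pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def open_discovery(graph, a):
    """Rank C terms reachable from `a` via some B but not directly linked to `a`."""
    direct = graph.get(a, set())
    scores = {}
    for b in direct:
        for c in graph.get(b, set()):
            if c != a and c not in direct:
                # Score = number of distinct intermediates supporting A-C.
                scores[c] = scores.get(c, 0) + 1
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


def closed_discovery(graph, a, c):
    """Return the intermediate B terms linked to both `a` and `c`."""
    return sorted(graph.get(a, set()) & graph.get(c, set()))


# Illustrative toy graph loosely modelled on the classic fish oil example.
graph = build_graph([
    ("fish oil", "blood viscosity"),
    ("fish oil", "platelet aggregation"),
    ("blood viscosity", "Raynaud's syndrome"),
    ("platelet aggregation", "Raynaud's syndrome"),
    ("platelet aggregation", "migraine"),
])
candidates = open_discovery(graph, "fish oil")
bridges = closed_discovery(graph, "fish oil", "Raynaud's syndrome")
```

Counting supporting intermediates is only one simple scoring choice; learned models such as those compared in the paper replace this count with a trained scoring function.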
Affiliation(s)
- Gamal Crichton
  - Language Technology Laboratory, TAL, University of Cambridge, Cambridge, United Kingdom
- Simon Baker
  - Language Technology Laboratory, TAL, University of Cambridge, Cambridge, United Kingdom
- Yufan Guo
  - Language Technology Laboratory, TAL, University of Cambridge, Cambridge, United Kingdom
- Anna Korhonen
  - Language Technology Laboratory, TAL, University of Cambridge, Cambridge, United Kingdom
11
Showalter K, Hoffmann A, DeCredico N, Thakrar A, Arroyo E, Goldberg I, Hinchcliff M. Complementary therapies for patients with systemic sclerosis. Journal of Scleroderma and Related Disorders 2019; 4:187-199. [PMID: 35382503 PMCID: PMC8922560 DOI: 10.1177/2397198319833503] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 10/19/2018] [Accepted: 01/27/2019] [Indexed: 11/16/2022]
Abstract
Patients with systemic sclerosis often seek information regarding complementary and nutrition-based therapy. Some study results have shown that vitamins D and E, probiotics, turmeric, l-arginine, essential fatty acids, broccoli, biofeedback, and acupuncture may be beneficial in systemic sclerosis care. However, large randomized clinical trials have not been conducted. This review summarizes current data regarding various complementary therapies in systemic sclerosis and concludes with recommendations.
Affiliation(s)
- Kimberly Showalter
  - Northwestern University Feinberg School of Medicine, Chicago, IL, USA
  - Division of Rheumatology, Hospital for Special Surgery, New York, NY, USA
- Aileen Hoffmann
  - Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Nicole DeCredico
  - Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Anjali Thakrar
  - Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Esperanza Arroyo
  - Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Isaac Goldberg
  - Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Monique Hinchcliff
  - Northwestern University Feinberg School of Medicine, Chicago, IL, USA
  - Department of Medicine, Section of Rheumatology, Allergy and Clinical Immunology, Yale University School of Medicine, New Haven, CT, USA
12
Zhao D, Wang J, Sang S, Lin H, Wen J, Yang C. Relation path feature embedding based convolutional neural network method for drug discovery. BMC Med Inform Decis Mak 2019; 19:59. [PMID: 30961599 PMCID: PMC6454669 DOI: 10.1186/s12911-019-0764-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 01/01/2023]
Abstract
BACKGROUND Drug development is an expensive and time-consuming process. Literature-based discovery has played a critical role in drug development and may be a supplementary method to help scientists speed up the discovery of drugs. METHODS Here, we propose a relation path features embedding based convolutional neural network model with attention mechanism for drug discovery from literature, which we denote as PACNN. First, we use predications from biomedical abstracts to construct a biomedical knowledge graph, and then apply a path ranking algorithm to extract drug-disease relation path features on the biomedical knowledge graph. After that, we use these drug-disease relation features to train a convolutional neural network model combined with the attention mechanism. Finally, we employ the trained models to mine drugs for treating diseases. RESULTS The experiment shows that the proposed model achieved promising results compared with several random walk algorithms. CONCLUSIONS In this paper, we propose a relation path features embedding based convolutional neural network with attention mechanism for discovering potential drugs from literature. Our method could be an auxiliary method for drug discovery, which can speed up the discovery of new drugs for incurable diseases.
Affiliation(s)
- Di Zhao
  - School of Computer Science and Technology, Dalian University of Technology, Dalian, China
- Jian Wang
  - School of Computer Science and Technology, Dalian University of Technology, Dalian, China
- Shengtian Sang
  - School of Computer Science and Technology, Dalian University of Technology, Dalian, China
- Hongfei Lin
  - School of Computer Science and Technology, Dalian University of Technology, Dalian, China
- Jiabin Wen
  - Department of VIP, the Second Hospital of Dalian Medical University, Dalian, China
- Chunmei Yang
  - Department of VIP, the Second Hospital of Dalian Medical University, Dalian, China
13
Gopalakrishnan V, Jha K, Jin W, Zhang A. A survey on literature based discovery approaches in biomedical domain. J Biomed Inform 2019; 93:103141. [PMID: 30857950 DOI: 10.1016/j.jbi.2019.103141] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 07/24/2018] [Revised: 02/17/2019] [Accepted: 02/19/2019] [Indexed: 02/06/2023]
Abstract
Literature Based Discovery (LBD) refers to the problem of inferring new and interesting knowledge by logically connecting independent fragments of information units through explicit or implicit means. This area of research, which incorporates techniques from Natural Language Processing (NLP), Information Retrieval and Artificial Intelligence, has significant potential to reduce discovery time in biomedical research fields. Formally introduced in 1986, LBD has grown to be a significant and a core task for text mining practitioners in the biomedical domain. Together with its inter-disciplinary nature, this has led researchers across domains to contribute in advancing this field of study. This survey attempts to consolidate and present the evolution of techniques in this area. We cover a variety of techniques and provide a detailed description of the problem setting, the intuition, the advantages and limitations of various influential papers. We also list the current bottlenecks in this field and provide a general direction of research activities for the future. In an effort to be comprehensive and for ease of reference for off-the-shelf users, we also list many publicly available tools for LBD. We hope this survey will act as a guide to both academic and industry (bio)-informaticians, introduce the various methodologies currently employed and also the challenges yet to be tackled.
Affiliation(s)
- Wei Jin
  - University of North Texas at Denton, TX, United States
14
Lever J, Gakkhar S, Gottlieb M, Rashnavadi T, Lin S, Siu C, Smith M, Jones MR, Krzywinski M, Jones SJM, Wren J. A collaborative filtering-based approach to biomedical knowledge discovery. Bioinformatics 2018; 34:652-659. [PMID: 29028901 DOI: 10.1093/bioinformatics/btx613] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 06/02/2017] [Accepted: 09/25/2017] [Indexed: 01/19/2023]
Abstract
Motivation The increase in publication rates makes it challenging for an individual researcher to stay abreast of all relevant research in order to find novel research hypotheses. Literature-based discovery methods make use of knowledge graphs built using text mining and can infer future associations between biomedical concepts that will likely occur in new publications. These predictions are a valuable resource for researchers to explore a research topic. Current methods for prediction are based on the local structure of the knowledge graph. A method that uses global knowledge from across the knowledge graph needs to be developed in order to make knowledge discovery a frequently used tool by researchers. Results We propose an approach based on the singular value decomposition (SVD) that is able to combine data from across the knowledge graph through a reduced representation. Using cooccurrence data extracted from published literature, we show that SVD performs better than the leading methods for scoring discoveries. We also show the diminishing predictive power of knowledge discovery as we compare our predictions with real associations that appear further into the future. Finally, we examine the strengths and weaknesses of the SVD approach against another well-performing system using several predicted associations. Availability and implementation All code and results files for this analysis can be accessed at https://github.com/jakelever/knowledgediscovery. Contact sjones@bcgsc.ca. Supplementary information Supplementary data are available at Bioinformatics online.
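The Results portion above describes scoring candidate associations with a reduced representation obtained from the singular value decomposition of co-occurrence data. The following is a rough sketch of that idea only, not the authors' released code: the function names, the toy matrix, and the choice of rank are all assumptions made for illustration.

```python
import numpy as np


def svd_scores(cooc, rank):
    """Reconstruct a co-occurrence matrix from its top `rank` singular components."""
    u, s, vt = np.linalg.svd(cooc, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]


def rank_unseen_pairs(cooc, rank):
    """Score term pairs that never co-occur, highest reconstructed value first."""
    scores = svd_scores(cooc, rank)
    n = cooc.shape[0]
    pairs = [
        (i, j, float(scores[i, j]))
        for i in range(n)
        for j in range(i + 1, n)
        if cooc[i, j] == 0
    ]
    return sorted(pairs, key=lambda p: -p[2])


# Hypothetical symmetric co-occurrence counts for five terms.
cooc = np.array([
    [0, 0, 3, 2, 0],
    [0, 0, 2, 3, 0],
    [3, 2, 0, 0, 1],
    [2, 3, 0, 0, 0],
    [0, 0, 1, 0, 0],
], dtype=float)

predictions = rank_unseen_pairs(cooc, rank=2)
```

Because the low-rank reconstruction pools information from every row and column, a pair with no direct co-occurrence can still receive a nonzero score, which is the sense in which such a method uses global rather than local graph structure.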
Collapse
Affiliation(s)
- Jake Lever
- Canada's Michael Smith Genome Sciences Centre, Vancouver, BC V5Z 4S6, Canada.,University of British Columbia, Vancouver, BC V6T 1Z1, Canada
| | - Sitanshu Gakkhar
- Canada's Michael Smith Genome Sciences Centre, Vancouver, BC V5Z 4S6, Canada
| | - Michael Gottlieb
- Canada's Michael Smith Genome Sciences Centre, Vancouver, BC V5Z 4S6, Canada
| | - Tahereh Rashnavadi
- Canada's Michael Smith Genome Sciences Centre, Vancouver, BC V5Z 4S6, Canada
| | - Santina Lin
- Canada's Michael Smith Genome Sciences Centre, Vancouver, BC V5Z 4S6, Canada
| | - Celia Siu
- Canada's Michael Smith Genome Sciences Centre, Vancouver, BC V5Z 4S6, Canada
| | - Maia Smith
- Canada's Michael Smith Genome Sciences Centre, Vancouver, BC V5Z 4S6, Canada
| | - Martin R Jones
- Canada's Michael Smith Genome Sciences Centre, Vancouver, BC V5Z 4S6, Canada
| | - Martin Krzywinski
- Canada's Michael Smith Genome Sciences Centre, Vancouver, BC V5Z 4S6, Canada
| | - Steven J M Jones
- Canada's Michael Smith Genome Sciences Centre, Vancouver, BC V5Z 4S6, Canada; University of British Columbia, Vancouver, BC V6T 1Z1, Canada; Simon Fraser University, Burnaby, BC V5A 1S6, Canada
| | | |
|
15
|
Mower J, Subramanian D, Cohen T. Learning predictive models of drug side-effect relationships from distributed representations of literature-derived semantic predications. J Am Med Inform Assoc 2018; 25:1339-1350. [PMID: 30010902 PMCID: PMC6454491 DOI: 10.1093/jamia/ocy077] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 01/24/2018] [Revised: 04/23/2018] [Accepted: 06/05/2018] [Indexed: 02/01/2023] Open
Abstract
Objective The aim of this work is to leverage relational information extracted from biomedical literature using a novel synthesis of unsupervised pretraining, representational composition, and supervised machine learning for drug safety monitoring. Methods Using ≈80 million concept-relationship-concept triples extracted from the literature using the SemRep Natural Language Processing system, distributed vector representations (embeddings) were generated for concepts as functions of their relationships utilizing two unsupervised representational approaches. Embeddings for drugs and side effects of interest from two widely used reference standards were then composed to generate embeddings of drug/side-effect pairs, which were used as input for supervised machine learning. This methodology was developed and evaluated using cross-validation strategies and compared to contemporary approaches. To qualitatively assess generalization, models trained on the Observational Medical Outcomes Partnership (OMOP) drug/side-effect reference set were evaluated against a list of ≈1100 drugs from an online database. Results The employed method improved performance over previous approaches. Cross-validation results advance the state of the art (AUC 0.96; F1 0.90 and AUC 0.95; F1 0.84 across the two sets), outperforming methods utilizing literature and/or spontaneous reporting system data. Examination of predictions for unseen drug/side-effect pairs indicates the ability of these methods to generalize, with over tenfold label support enrichment in the top 100 predictions versus the bottom 100 predictions. Discussion and Conclusion Our methods can assist the pharmacovigilance process using information from the biomedical literature. Unsupervised pretraining generates a rich relationship-based representational foundation for machine learning techniques to classify drugs in the context of a putative side effect, given known examples.
Affiliation(s)
- Justin Mower
- Baylor College of Medicine, Quantitative and Computational Biosciences, Houston, Texas, USA
| | | | - Trevor Cohen
- School of Biomedical Informatics, University of Texas Health Science Center Houston, Texas, USA
| |
|
16
|
Henry S, McInnes BT. Literature Based Discovery: Models, methods, and trends. J Biomed Inform 2017; 74:20-32. [PMID: 28838802 DOI: 10.1016/j.jbi.2017.08.011] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Received: 03/03/2017] [Revised: 07/21/2017] [Accepted: 08/20/2017] [Indexed: 01/25/2023]
Abstract
OBJECTIVES This paper provides an introduction and overview of literature based discovery (LBD) in the biomedical domain. It introduces the reader to modern and historical LBD models, key system components, evaluation methodologies, and current trends. After completion, the reader will be familiar with the challenges and methodologies of LBD. The reader will be capable of distinguishing between recent LBD systems and publications, and be capable of designing an LBD system for a specific application. TARGET AUDIENCE From biomedical researchers curious about LBD, to someone looking to design an LBD system, to an LBD expert trying to catch up on trends in the field. The reader need not be familiar with LBD, but knowledge of biomedical text processing tools is helpful. SCOPE This paper describes a unifying framework for LBD systems. Within this framework, different models and methods are presented to both distinguish and show overlap between systems. Topics include term and document representation, system components, and an overview of models including co-occurrence models, semantic models, and distributional models. Other topics include uninformative term filtering, term ranking, results display, system evaluation, an overview of the application areas of drug development, drug repurposing, and adverse drug event prediction, and challenges and future directions. A timeline showing contributions to LBD, and a table summarizing the works of several authors is provided. Topics are presented from a high level perspective. References are given if more detailed analysis is required.
Affiliation(s)
- Sam Henry
- Department of Computer Science, Virginia Commonwealth University, 401 S. Main St., Rm E4222, Richmond, VA 23284, USA.
| | - Bridget T McInnes
- Department of Computer Science, Virginia Commonwealth University, 401 S. Main St., Rm E4222, Richmond, VA 23284, USA
| |
|
17
|
Abstract
Background Most earlier studies in literature-based discovery have adopted Swanson's ABC model, which links pieces of knowledge contained in disjoint literatures. However, their practicability remains an open issue, since most of them did not deal with the context surrounding the discovered associations and were usually not accompanied by clinical confirmation. In this study, we propose a method that expands and elaborates an existing hypothesis using advanced text mining techniques for capturing context. We extend the ABC model to allow for multiple B terms of various biological types. Results Using the proposed method, we were able to concretize a specific, metabolite-related hypothesis with abundant contextual information. Starting from an explanation of the relationship between lactosylceramide and arterial stiffness, the hypothesis was extended to suggest a potential pathway consisting of lactosylceramide, nitric oxide, malondialdehyde, and arterial stiffness. An experiment by domain experts showed that it is clinically valid. Conclusions The proposed method is designed to provide plausible candidates for the concretized hypothesis, based on extracted heterogeneous entities and detailed relation information, along with a reliable ranking criterion. Unlike previous studies, statistical tests conducted in collaboration with biomedical experts support the validity and practical usefulness of the method. Applied to other cases, the proposed method could help biologists support an existing hypothesis and more easily follow the logical process within it.
|
18
|
Abstract
Background Literature based discovery (LBD) automatically infers missed connections between concepts in literature. It is often assumed that LBD generates more information than can be reasonably examined. Methods We present a detailed analysis of the quantity of hidden knowledge produced by an LBD system and the effect of various filtering approaches upon this. The investigation of filtering combined with single or multi-step linking term chains is carried out on all articles in PubMed. Results The evaluation is carried out using both replication of existing discoveries, which provides justification for multi-step linking chain knowledge in specific cases, and using timeslicing, which gives a large scale measure of performance. Conclusions While the quantity of hidden knowledge generated by LBD can be vast, we demonstrate that (a) intelligent filtering can greatly reduce the number of hidden knowledge pairs generated, (b) for a specific term, the number of single step connections can be manageable, and (c) in the absence of single step hidden links, considering multiple steps can provide valid links.
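To make the single-step linking and timeslicing ideas in this abstract concrete, here is a minimal sketch. It generates A-C candidate pairs that share one linking term B in the literature before a cutoff, then checks what fraction of those candidates first cooccur after the cutoff. The function names, the toy pairs, and the precision measure are illustrative assumptions, not the evaluation protocol of the paper.

```python
from collections import defaultdict


def hidden_pairs(known_pairs):
    """Candidate A-C pairs joined by one linking term B (hypothetical helper).

    A pair is "hidden" when A and C each cooccur with some B but never
    cooccur with each other in `known_pairs`.
    """
    neighbours = defaultdict(set)
    for a, b in known_pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)
    known = {frozenset(p) for p in known_pairs}
    candidates = set()
    for linked in neighbours.values():
        for a in linked:
            for c in linked:
                # a < c keeps one orientation per pair and skips a == c.
                if a < c and frozenset((a, c)) not in known:
                    candidates.add((a, c))
    return candidates


def timeslice_precision(before, after):
    """Fraction of candidates from `before` that first appear in `after`."""
    candidates = hidden_pairs(before)
    seen_before = {frozenset(p) for p in before}
    future = {frozenset(p) for p in after} - seen_before
    hits = [p for p in candidates if frozenset(p) in future]
    return len(hits) / len(candidates) if candidates else 0.0


# Toy cooccurrence pairs before and after a cutoff date (assumption).
before_cutoff = [("A", "B"), ("B", "C"), ("B", "D")]
after_cutoff = [("A", "C")]

print(sorted(hidden_pairs(before_cutoff)))
print(timeslice_precision(before_cutoff, after_cutoff))
```

Even in this toy case one linking term with three neighbours yields three candidates, which hints at why filtering matters once the graph covers all of PubMed.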
Affiliation(s)
- Judita Preiss
- Department of Computer Science, University of Sheffield, Regent Court, 211 Portobello, Sheffield, UK.
| | - Mark Stevenson
- Department of Computer Science, University of Sheffield, Regent Court, 211 Portobello, Sheffield, UK
| |
|
19
|
Abstract
Literature-based discovery systems aim at discovering valuable latent connections between previously disparate research areas. This is achieved by analyzing the contents of their respective literatures with the help of various intelligent computational techniques. In this paper, we review the progress of literature-based discovery research, focusing on understanding the technical features of these systems and evaluating their performance. Present literature-based discovery techniques can be divided into two general approaches: the traditional approach and the emerging approach. The traditional approach, which dominates the current research landscape, consists mainly of techniques that rely on lexical statistics, knowledge-based methods and visualization methods to address literature-based discovery problems. On the other hand, we have also observed the birth of new trends and unprecedented paradigm shifts within the recently emerging approach. These trends are likely to shape the future trajectory of the next generation of literature-based discovery systems.
|
20
|
Ma X, Jiang Z, Lai C. Significance of Increasing n-3 PUFA Content in Pork on Human Health. Crit Rev Food Sci Nutr 2017; 56:858-70. [PMID: 26237277 DOI: 10.1080/10408398.2013.850059] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Indexed: 02/08/2023]
Abstract
Evidence for the health-promoting effects of food rich in n-3 polyunsaturated fatty acids (n-3 PUFA) is reviewed. Pork is an important meat source for humans. According to a report by the US Department of Agriculture (http://www.ers.usda.gov/topics), the pork consumption worldwide in 2011 was about 79.3 million tons, much higher than that of beef (48.2 million tons). Pork also contains high levels of unsaturated fatty acids relative to ruminant meats (Enser, M., Hallett, K., Hewett, B., Fursey, G. A. J. and Wood, J. D. (1996). Fatty acid content and composition of English beef, lamb, and pork at retail. Meat Sci. 44:443-458). The available literature indicates that the levels of eicosatetraenoic and docosahexaenoic acids in pork may be increased by fish-derived or linseed products, with the extent depending on the nature of the supplementation. Transgenic pigs and plants show promise, with a high content of n-3 PUFA and a low ratio of n-6/n-3 fatty acids in their tissues. The approaches mentioned for decreasing n-6/n-3 ratios have both advantages and disadvantages. Selected articles are critically reviewed and summarized.
Affiliation(s)
- Xianyong Ma
- The Key Laboratory of Animal Nutrition and Feed Science (South China) of Ministry of Agriculture, State Key Laboratory of Livestock and Poultry Breeding, Institute of Animal Science, Guangdong Academy of Agricultural Sciences, Guangzhou, China
| | - Zongyong Jiang
- The Key Laboratory of Animal Nutrition and Feed Science (South China) of Ministry of Agriculture, State Key Laboratory of Livestock and Poultry Breeding, Institute of Animal Science, Guangdong Academy of Agricultural Sciences, Guangzhou, China
| | - Chaoqiang Lai
- Nutrition and Genomics Laboratory, JM-USDA Human Nutrition Research Center on Aging at Tufts University, Boston, Massachusetts, USA
| |
|
21
|
Abstract
Prebiotics contribute to the well-being of their host by altering the composition of the gut microbiota. Discovering new prebiotics is a challenging and arduous task due to strict inclusion criteria; thus, highly limited numbers of prebiotic candidates have been identified. Notably, the large numbers of published studies may contain substantial information attached to various features of known prebiotics that can be used to predict new candidates. In this paper, we propose a medical subject headings (MeSH)-based text mining method for identifying new prebiotics with structured texts obtained from PubMed. We defined an optimal feature set for prebiotics prediction using a systematic feature-ranking algorithm with which a variety of carbohydrates can be accurately classified into different clusters in accordance with their chemical and biological attributes. The optimal feature set was used to separate positive prebiotics from other carbohydrates, and a cross-validation procedure was employed to assess the prediction accuracy of the model. Our method achieved a specificity of 0.876 and a sensitivity of 0.838. Finally, we identified a high-confidence list of candidates of prebiotics that are strongly supported by the literature. Our study demonstrates that text mining from high-volume biomedical literature is a promising approach in searching for new prebiotics.
|
22
|
Kastrin A, Rindflesch TC, Hristovski D. Link Prediction on a Network of Co-occurring MeSH Terms: Towards Literature-based Discovery. Methods Inf Med 2016; 55:340-6. [PMID: 27435341 DOI: 10.3414/me15-01-0108] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 08/17/2015] [Accepted: 05/19/2016] [Indexed: 12/24/2022]
Abstract
OBJECTIVES Literature-based discovery (LBD) is a text mining methodology for automatically generating research hypotheses from existing knowledge. We mimic the process of LBD as a classification problem on a graph of MeSH terms. We employ unsupervised and supervised link prediction methods for predicting previously unknown connections between biomedical concepts. METHODS We evaluate the effectiveness of link prediction through a series of experiments using a MeSH network that contains the history of link formation between biomedical concepts. We performed link prediction using proximity measures, such as common neighbor (CN), Jaccard coefficient (JC), Adamic / Adar index (AA) and preferential attachment (PA). Our approach relies on the assumption that similar nodes are more likely to establish a link in the future. RESULTS Applying an unsupervised approach, the AA measure achieved the best performance in terms of area under the ROC curve (AUC = 0.76), followed by CN, JC, and PA. In a supervised approach, we evaluate whether proximity measures can be combined to define a model of link formation across all four predictors. We applied various classifiers, including decision trees, k-nearest neighbors, logistic regression, multilayer perceptron, naïve Bayes, and random forests. Random forest classifier accomplishes the best performance (AUC = 0.87). CONCLUSIONS The link prediction approach proved to be effective for LBD processing. Supervised statistical learning approaches clearly outperform an unsupervised approach to link prediction.
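The four proximity measures named in this abstract have standard definitions, which the following sketch computes for one candidate pair on a small undirected graph. The adjacency structure, node labels, and function name are toy assumptions; the study itself worked on a MeSH cooccurrence network.

```python
import math


def proximity_scores(adj, u, v):
    """Standard link-prediction proximity measures for the node pair (u, v).

    `adj` maps each node to the set of its neighbours (undirected graph).
    """
    nu, nv = adj[u], adj[v]
    common = nu & nv
    union = nu | nv
    return {
        # Common neighbours: size of the shared neighbourhood.
        "CN": len(common),
        # Jaccard coefficient: shared neighbours over all neighbours.
        "JC": len(common) / len(union) if union else 0.0,
        # Adamic/Adar: shared neighbours weighted by inverse log degree.
        "AA": sum(1.0 / math.log(len(adj[z])) for z in common if len(adj[z]) > 1),
        # Preferential attachment: product of the two degrees.
        "PA": len(nu) * len(nv),
    }


# Toy 4-node cycle a-b-d-c-a; a and d are not yet linked (assumption).
toy_graph = {
    "a": {"b", "c"},
    "b": {"a", "d"},
    "c": {"a", "d"},
    "d": {"b", "c"},
}

print(proximity_scores(toy_graph, "a", "d"))
```

In a supervised setting like the one the abstract describes, these four values would serve as the feature vector for each candidate pair, with the label given by whether the link appears later.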
Affiliation(s)
- Andrej Kastrin
- Faculty of Information Studies, Ljubljanska cesta 31A, SI-8000 Novo Mesto, Slovenia
| | | | | |
|
23
|
Song M, Heo GE, Ding Y. SemPathFinder: Semantic path analysis for discovering publicly unknown knowledge. J Informetr 2015. [DOI: 10.1016/j.joi.2015.06.004] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Indexed: 11/26/2022]
|
24
|
Supervised Learning Based Hypothesis Generation from Biomedical Literature. BIOMED RESEARCH INTERNATIONAL 2015; 2015:698527. [PMID: 26380291 PMCID: PMC4561867 DOI: 10.1155/2015/698527] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 01/16/2015] [Revised: 04/12/2015] [Accepted: 05/24/2015] [Indexed: 11/18/2022]
Abstract
The amount of biomedical literature is growing at an explosive speed, and much useful knowledge remains undiscovered in it. Researchers can form biomedical hypotheses by mining these works. In this paper, we propose a supervised learning based approach to generate hypotheses from biomedical literature. This approach splits the traditional hypothesis generation process of the classic ABC model into an AB model and a BC model, both constructed with a supervised learning method. Compared with concept cooccurrence and grammar engineering-based approaches like SemRep, machine learning based models can usually achieve better performance in information extraction (IE) from texts. By combining the two models, the approach then reconstructs the ABC model and generates biomedical hypotheses from literature. The experimental results on the three classic Swanson hypotheses show that our approach outperforms the SemRep system.
|
25
|
Cameron D, Kavuluru R, Rindflesch TC, Sheth AP, Thirunarayan K, Bodenreider O. Context-driven automatic subgraph creation for literature-based discovery. J Biomed Inform 2015; 54:141-57. [PMID: 25661592 PMCID: PMC4888806 DOI: 10.1016/j.jbi.2015.01.014] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Received: 09/16/2014] [Revised: 01/21/2015] [Accepted: 01/25/2015] [Indexed: 01/29/2023]
Abstract
BACKGROUND Literature-based discovery (LBD) is characterized by uncovering hidden associations in non-interacting scientific literature. Prior approaches to LBD include use of: (1) domain expertise and structured background knowledge to manually filter and explore the literature, (2) distributional statistics and graph-theoretic measures to rank interesting connections, and (3) heuristics to help eliminate spurious connections. However, manual approaches to LBD are not scalable and purely distributional approaches may not be sufficient to obtain insights into the meaning of poorly understood associations. While several graph-based approaches have the potential to elucidate associations, their effectiveness has not been fully demonstrated. A considerable degree of a priori knowledge, heuristics, and manual filtering is still required. OBJECTIVES In this paper we implement and evaluate a context-driven, automatic subgraph creation method that captures multifaceted complex associations between biomedical concepts to facilitate LBD. Given a pair of concepts, our method automatically generates a ranked list of subgraphs, which provide informative and potentially unknown associations between such concepts. METHODS To generate subgraphs, the set of all MEDLINE articles that contain either of the two specified concepts (A, C) are first collected. Then binary relationships or assertions, which are automatically extracted from the MEDLINE articles, called semantic predications, are used to create a labeled directed predications graph. In this predications graph, a path is represented as a sequence of semantic predications. The hierarchical agglomerative clustering (HAC) algorithm is then applied to cluster paths that are bounded by the two concepts (A, C). HAC relies on implicit semantics captured through Medical Subject Heading (MeSH) descriptors, and explicit semantics from the MeSH hierarchy, for clustering. 
Paths that exceed a threshold of semantic relatedness are clustered into subgraphs based on their shared context. Finally, the automatically generated clusters are provided as a ranked list of subgraphs. RESULTS The subgraphs generated using this approach facilitated the rediscovery of 8 out of 9 existing scientific discoveries. In particular, they directly (or indirectly) led to the recovery of several intermediates (or B-concepts) between A- and C-terms, while also providing insights into the meaning of the associations. Such meaning is derived from predicates between the concepts, as well as the provenance of the semantic predications in MEDLINE. Additionally, by generating subgraphs on different thematic dimensions (such as Cellular Activity, Pharmaceutical Treatment and Tissue Function), the approach may enable a broader understanding of the nature of complex associations between concepts. Finally, in a statistical evaluation to determine the interestingness of the subgraphs, it was observed that an arbitrary association is mentioned in only approximately 4 articles in MEDLINE on average. CONCLUSION These results suggest that leveraging the implicit and explicit semantics provided by manually assigned MeSH descriptors is an effective representation for capturing the underlying context of complex associations, along multiple thematic dimensions in LBD situations.
Affiliation(s)
- Delroy Cameron
- Ohio Center of Excellence in Knowledge-Enabled Computing (Kno.e.sis), Wright State University, Dayton, OH 45435, USA.
| | - Ramakanth Kavuluru
- Division of Biomedical Informatics, University of Kentucky, Lexington, KY 40506, USA
| | | | - Amit P Sheth
- Ohio Center of Excellence in Knowledge-Enabled Computing (Kno.e.sis), Wright State University, Dayton, OH 45435, USA
| | - Krishnaprasad Thirunarayan
- Ohio Center of Excellence in Knowledge-Enabled Computing (Kno.e.sis), Wright State University, Dayton, OH 45435, USA
| | | |
|
26
|
Cheng L, Lin H, Zhou F, Yang Z, Wang J. Enhancing the accuracy of knowledge discovery: a supervised learning method. BMC Bioinformatics 2014; 15 Suppl 12:S9. [PMID: 25474584 PMCID: PMC4243114 DOI: 10.1186/1471-2105-15-s12-s9] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022] Open
Abstract
Background The amount of biomedical literature available is growing at an explosive speed, but a large amount of useful information remains undiscovered in it. Researchers can make informed biomedical hypotheses through mining this literature. Unfortunately, popular mining methods based on co-occurrence produce too many target concepts, which lowers the relevance ranking of the potential target concepts. Methods This paper presents a new method for selecting linking concepts which exploits statistical and textual features to represent each linking concept, and then classifies them as relevant or irrelevant to the starting concepts. Relevant linking concepts are then used to discover target concepts. Results An evaluation shows that textual features improve the results obtained with only statistical features. We successfully replicate Swanson's two classic discoveries and find that the rankings of potentially relevant target concepts are relatively high. Conclusions The number of target concepts is greatly reduced and potentially relevant target concepts gain higher ranking by adopting only relevant linking concepts. Thus, the proposed method has the potential to help biomedical experts find the most useful and valuable target concepts effectively.
|
27
|
Bays HE, Tighe AP, Sadovsky R, Davidson MH. Prescription omega-3 fatty acids and their lipid effects: physiologic mechanisms of action and clinical implications. Expert Rev Cardiovasc Ther 2014; 6:391-409. [DOI: 10.1586/14779072.6.3.391] [Citation(s) in RCA: 177] [Impact Index Per Article: 17.7] [Indexed: 01/07/2023]
|
28
|
Choi J, Kim K, Song M, Lee D. Generation and application of drug indication inference models using typed network motif comparison analysis. BMC Med Inform Decis Mak 2013; 13 Suppl 1:S2. [PMID: 23566076 PMCID: PMC3618246 DOI: 10.1186/1472-6947-13-s1-s2] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Indexed: 03/18/2023] Open
Abstract
Background As the amount of publicly available biomedical data increases, discovering hidden knowledge from biomedical data (i.e., Undiscovered Public Knowledge (UPK), proposed by Swanson) has become an important research topic in the biological literature mining field. Drug indication inference, or drug repositioning, is one of the best-known UPK tasks; it infers alternative indications for approved drugs. Many previous studies tried to find novel candidate indications for existing drugs, but these works have the following limitations: 1) the models are not fully automated and require manual adjustment for the desired task, 2) they are not able to cover various biomedical entities, and 3) they can infer only pre-defined cases using limited patterns. To overcome these problems, we suggest a new drug indication inference model. Methods In this paper, we adopted the Typed Network Motif Comparison Algorithm (TNMCA) to infer novel drug indications using the topology of a given network. Typed Network Motifs (TNM) are network motifs that store types of data instead of values of data. TNMCA is a powerful inference algorithm for multi-level biomedical interaction data, as TNMs depend on the different types of entities and relations. We utilized a new normalized scoring function as well as network exclusion to improve the inference results. To validate our method, we applied TNMCA to a public database, the Comparative Toxicogenomics Database (CTD). Results The results show that the enhanced TNMCA was able to infer meaningful indications with high performance (AUC = 0.801, 0.829) compared to the ABC model (AUC = 0.7050) and the previous TNMCA model (AUC = 0.5679, 0.7469). The literature analysis also shows that TNMCA inferred meaningful results. Conclusions We proposed and enhanced a novel drug indication inference model by incorporating topological patterns of a given network. By utilizing inference models from the topological patterns, we were able to improve inference power in drug indication inferences.
Affiliation(s)
- Jaejoon Choi
- Department of Bio and Brain Engineering, KAIST, Daejeon, South Korea
| | | | | | | |
|
29
|
|
30
|
How the nutritional value and consumer acceptability of suckling lambs meat is affected by the maternal feeding system. Small Rumin Res 2012. [DOI: 10.1016/j.smallrumres.2012.02.001] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Indexed: 11/23/2022]
|
31
|
Cohen T, Widdows D, Schvaneveldt RW, Davies P, Rindflesch TC. Discovering discovery patterns with Predication-based Semantic Indexing. J Biomed Inform 2012; 45:1049-65. [PMID: 22841748 DOI: 10.1016/j.jbi.2012.07.003] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.8] [Received: 03/06/2012] [Revised: 07/09/2012] [Accepted: 07/15/2012] [Indexed: 01/16/2023]
Abstract
In this paper we utilize methods of hyperdimensional computing to mediate the identification of therapeutically useful connections for the purpose of literature-based discovery. Our approach, named Predication-based Semantic Indexing, is utilized to identify empirically sequences of relationships known as "discovery patterns", such as "drug x INHIBITS substance y, substance y CAUSES disease z" that link pharmaceutical substances to diseases they are known to treat. These sequences are derived from semantic predications extracted from the biomedical literature by the SemRep system, and subsequently utilized to direct the search for known treatments for a held out set of diseases. Rapid and efficient inference is accomplished through the application of geometric operators in PSI space, allowing for both the derivation of discovery patterns from a large set of known TREATS relationships, and the application of these discovered patterns to constrain search for therapeutic relationships at scale. Our results include the rediscovery of discovery patterns that have been constructed manually by other authors in previous research, as well as the discovery of a set of previously unrecognized patterns. The application of these patterns to direct search through PSI space results in better recovery of therapeutic relationships than is accomplished with models based on distributional statistics alone. These results demonstrate the utility of efficient approximate inference in geometric space as a means to identify therapeutic relationships, suggesting a role of these methods in drug repurposing efforts. In addition, the results provide strong support for the utility of the discovery pattern approach pioneered by Hristovski and his colleagues.
Affiliation(s)
- Trevor Cohen
- University of Texas Health Science Center, Houston, TX, USA.
| | | | | | | | | |
|
32
|
Lee S, Choi J, Park K, Song M, Lee D. Discovering context-specific relationships from biological literature by using multi-level context terms. BMC Med Inform Decis Mak 2012; 12 Suppl 1:S1. [PMID: 22595086 PMCID: PMC3339396 DOI: 10.1186/1472-6947-12-s1-s1] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Indexed: 11/27/2022] Open
Abstract
Background Swanson's ABC model is a powerful way to infer hidden relationships buried in biological literature. However, the model is inadequate for inferring relations with context information. In addition, the model generates a very large number of candidates from biological text, and it is a semi-automatic, labor-intensive technique requiring manual input from human experts. To tackle these problems, we incorporate context terms to infer relations between AB interactions and BC interactions. Methods We propose 3 steps to discover meaningful hidden relationships between drugs and diseases: 1) multi-level (gene, drug, disease, symptom) entity recognition, 2) interaction extraction (drug-gene, gene-disease) from literature, 3) context vector based similarity score calculation. Subsequently, we evaluate our hypothesis on a dataset of 77,711 PubMed abstracts related to "Alzheimer's disease". The PharmGKB and CTD databases are used as gold standards. Evaluation is conducted in 2 ways: first, comparing the precision of the proposed method with that of the previous method, and second, analysing the top 10 ranked results to examine whether highly ranked interactions are truly meaningful. Results The results indicate that context-based relation inference achieved better precision than the previous ABC model approach. The literature analysis also shows that interactions inferred by the context-based approach are more meaningful than interactions by the previous ABC model. Conclusions We propose a novel interaction inference technique that incorporates context term vectors into the ABC model to discover meaningful hidden relationships. By utilizing multi-level context terms, our model shows better performance than the previous ABC model.
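The third step in this abstract, context vector based similarity, can be pictured with a small sketch. It represents the context of an AB interaction and of a BC interaction as term-count vectors and compares them with cosine similarity. The abstract does not state which similarity function was used, so cosine similarity, the function name, and the toy context terms are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine_similarity(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical context terms found near a drug-gene (AB) sentence and a
# gene-disease (BC) sentence; the terms and counts are made up.
ab_context = Counter({"inflammation": 2, "neuron": 1})
bc_context = Counter({"inflammation": 1, "plaque": 1})

score = cosine_similarity(ab_context, bc_context)
print(f"context similarity: {score:.3f}")
```

Under this reading, an A-C candidate would be ranked higher when its AB and BC interactions are discussed in similar contexts, which is one way to cut down the large candidate set the plain ABC model produces.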
Affiliation(s)
- Sejoon Lee
- Bio and Brain Engineering Department, KAIST, Daejeon 305-701, South Korea
|
33
|
Using noun phrases for navigating biomedical literature on Pubmed: how many updates are we losing track of? PLoS One 2011; 6:e24920. [PMID: 21935487 PMCID: PMC3173492 DOI: 10.1371/journal.pone.0024920] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2011] [Accepted: 08/19/2011] [Indexed: 11/19/2022]
Abstract
Author-supplied citations are a fraction of the related literature for a paper. The “related citations” list on PubMed is typically dozens or hundreds of results long and offers no hint as to why these results are related. Using noun phrases derived from the sentences of the paper, we show it is possible to more transparently navigate to PubMed updates through search terms that can associate a paper with its citations. The algorithm to generate these search terms involved automatically extracting noun phrases from the paper using natural language processing tools, and ranking them by the number of occurrences in the paper compared to the number of occurrences on the web. We define search queries having at least one instance of overlap between the author-supplied citations of the paper and the top 20 search results as citation validated (CV). When the overlapping citations were written by the same authors as the paper itself, we define the query as CV-S; when they were written by different authors, we define it as CV-D. For a systematic sample of 883 papers on PubMed Central, at least one of the search terms for 86% of the papers is CV-D versus 65% for the top 20 PubMed “related citations.” We hypothesize that these quantities, computed for the 20 million papers on PubMed, would differ from these percentages by no more than 5%. Averaged across all 883 papers, 5 search terms are CV-D, 10 search terms are CV-S, and 6 unique citations validate these searches. Potentially related literature uncovered by citation-validated searches (either CV-S or CV-D) is on the order of ten items per paper; many more if the remaining searches that are not citation-validated are taken into account. The significance and relationship of each search result to the paper can only be vetted and explained by a researcher with knowledge of or interest in that paper.
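The ranking and citation-validation steps described in this abstract can be illustrated with a minimal sketch. The function names, the add-one smoothing of web counts, and the reading of "same authors" as "at least one shared author" are assumptions of this example rather than details taken from the paper, and all data are invented.

```python
def rank_noun_phrases(paper_counts, web_counts):
    """Order noun phrases by paper frequency relative to web frequency."""
    def ratio(phrase):
        # Add-one smoothing avoids division by zero (an assumption of this sketch).
        return paper_counts[phrase] / (web_counts.get(phrase, 0) + 1)
    # Highest ratio first; ties broken alphabetically for a stable order.
    return sorted(paper_counts, key=lambda phrase: (-ratio(phrase), phrase))


def classify_query(top_results, citations, paper_authors, authors_of, k=20):
    """Return the labels ('CV-S', 'CV-D') earned by one search query."""
    labels = set()
    for result in top_results[:k]:
        if result in citations:
            # Assumption: "same authors" means at least one shared author.
            shared = paper_authors & authors_of[result]
            labels.add("CV-S" if shared else "CV-D")
    return labels


# Invented toy data for illustration only.
ranked = rank_noun_phrases(
    {"digital ulcer": 5, "patient": 12, "vasospasm": 4},
    {"digital ulcer": 99, "patient": 11999, "vasospasm": 199},
)
labels = classify_query(
    top_results=["p1", "p2", "p3"],
    citations={"p2", "p3"},
    paper_authors={"Smith"},
    authors_of={"p1": {"Jones"}, "p2": {"Smith", "Lee"}, "p3": {"Kim"}},
)
```

In this sketch an empty label set means the query is not citation validated, and a query can earn both labels when its top results overlap with citations from both author groups.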
|
34
|
Andronis C, Sharma A, Virvilis V, Deftereos S, Persidis A. Literature mining, ontologies and information visualization for drug repurposing. Brief Bioinform 2011; 12:357-68. [PMID: 21712342 DOI: 10.1093/bib/bbr005] [Citation(s) in RCA: 163] [Impact Index Per Article: 12.5] [Indexed: 11/13/2022]
Abstract
The immense growth of MEDLINE coupled with the realization that a vast amount of biomedical knowledge is recorded in free-text format, has led to the appearance of a large number of literature mining techniques aiming to extract biomedical terms and their inter-relations from the scientific literature. Ontologies have been extensively utilized in the biomedical domain either as controlled vocabularies or to provide the framework for mapping relations between concepts in biology and medicine. Literature-based approaches and ontologies have been used in the past for the purpose of hypothesis generation in connection with drug discovery. Here, we review the application of literature mining and ontology modeling and traversal to the area of drug repurposing (DR). In recent years, DR has emerged as a noteworthy alternative to the traditional drug development process, in response to the decreased productivity of the biopharmaceutical industry. Thus, systematic approaches to DR have been developed, involving a variety of in silico, genomic and high-throughput screening technologies. Attempts to integrate literature mining with other types of data arising from the use of these technologies as well as visualization tools assisting in the discovery of novel associations between existing drugs and new indications will also be presented.
|
35
|
Deftereos SN, Andronis C, Friedla EJ, Persidis A, Persidis A. Drug repurposing and adverse event prediction using high-throughput literature analysis. Wiley Interdiscip Rev Syst Biol Med 2011; 3:323-34. [PMID: 21416632 DOI: 10.1002/wsbm.147] [Citation(s) in RCA: 80] [Impact Index Per Article: 6.2] [Indexed: 01/06/2023]
Abstract
Drug repurposing is the process of using existing drugs in indications other than the ones they were originally designed for. It is an area of significant recent activity due to the mounting costs of traditional drug development and scarcity of new chemical entities brought to the market by bio-pharmaceutical companies. By selecting drugs that already satisfy basic toxicity, ADME and related criteria, drug repurposing promises to deliver significant value at reduced cost and in dramatically shorter time frames than is normally the case for the drug development process. The same process that results in drug repurposing can also be used for the prediction of adverse events of known or novel drugs. The analytics method is based on the description of the mechanism of action of a drug, which is then compared to the molecular mechanisms underlying all known adverse events. This review will focus on those approaches to drug repurposing and adverse event prediction that are based on the biomedical literature. Such approaches typically begin with an analysis of the literature and aim to reveal indirect relationships among seemingly unconnected biomedical entities such as genes, signaling pathways, physiological processes, and diseases. Networks of associations of these entities allow the uncovering of the molecular mechanisms underlying a disease, better understanding of the biological effects of a drug and the evaluation of its benefit/risk profile. In silico results can be tested in relevant cellular and animal models and, eventually, in clinical trials.
|
36
|
Harmston N, Filsell W, Stumpf MPH. What the papers say: text mining for genomics and systems biology. Hum Genomics 2010; 5:17-29. [PMID: 21106487 PMCID: PMC3500154 DOI: 10.1186/1479-7364-5-1-17] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.8] [Received: 08/06/2010] [Accepted: 08/06/2010] [Indexed: 12/11/2022]
Abstract
Keeping up with the rapidly growing literature has become virtually impossible for most scientists. This can have dire consequences. First, we may waste research time and resources on reinventing the wheel simply because we can no longer maintain a reliable grasp on the published literature. Second, and perhaps more detrimental, judicious (or serendipitous) combination of knowledge from different scientific disciplines, which would require following disparate and distinct research literatures, is rapidly becoming impossible for even the most ardent readers of research publications. Text mining - the automated extraction of information from (electronically) published sources - could potentially fulfil an important role - but only if we know how to harness its strengths and overcome its weaknesses. As we do not expect that the rate at which scientific results are published will decrease, text mining tools are now becoming essential in order to cope with, and derive maximum benefit from, this information explosion. In genomics, this is particularly pressing as more and more rare disease-causing variants are found and need to be understood. Not being conversant with this technology may put scientists and biomedical regulators at a severe disadvantage. In this review, we introduce the basic concepts underlying modern text mining and its applications in genomics and systems biology. We hope that this review will serve three purposes: (i) to provide a timely and useful overview of the current status of this field, including a survey of present challenges; (ii) to enable researchers to decide how and when to apply text mining tools in their own research; and (iii) to highlight how the research communities in genomics and systems biology can help to make text mining from biomedical abstracts and texts more straightforward.
Affiliation(s)
- Nathan Harmston
- Division of Molecular Biosciences, Centre for Bioinformatics, Imperial College London, 303, Wolfson Building, South Kensington Campus, London, SW7 2AZ, UK
- Wendy Filsell
- Unilever R&D, Colworth Science Park, Sharnbrook, Bedford MK44 1LQ, UK
- Michael PH Stumpf
- Division of Molecular Biosciences, Centre for Bioinformatics, Imperial College London, 303, Wolfson Building, South Kensington Campus, London, SW7 2AZ, UK
|
37
|
Frijters R, van Vugt M, Smeets R, van Schaik R, de Vlieg J, Alkema W. Literature mining for the discovery of hidden connections between drugs, genes and diseases. PLoS Comput Biol 2010; 6. [PMID: 20885778 PMCID: PMC2944780 DOI: 10.1371/journal.pcbi.1000943] [Citation(s) in RCA: 120] [Impact Index Per Article: 8.6] [Received: 03/15/2010] [Accepted: 08/26/2010] [Indexed: 01/19/2023]
Abstract
The scientific literature represents a rich source for retrieval of knowledge on associations between biomedical concepts such as genes, diseases and cellular processes. A commonly used method to establish relationships between biomedical concepts from literature is co-occurrence. Apart from its use in knowledge retrieval, the co-occurrence method is also well-suited to discover new, hidden relationships between biomedical concepts following a simple ABC-principle, in which A and C have no direct relationship, but are connected via shared B-intermediates. In this paper we describe CoPub Discovery, a tool that mines the literature for new relationships between biomedical concepts. Statistical analysis using ROC curves showed that CoPub Discovery performed well over a wide range of settings and keyword thesauri. We subsequently used CoPub Discovery to search for new relationships between genes, drugs, pathways and diseases. Several of the newly found relationships were validated using independent literature sources. In addition, new predicted relationships between compounds and cell proliferation were validated and confirmed experimentally in an in vitro cell proliferation assay. The results show that CoPub Discovery is able to identify novel associations between genes, drugs, pathways and diseases that have a high probability of being biologically valid. This makes CoPub Discovery a useful tool to unravel the mechanisms behind disease, to find novel drug targets, or to find novel applications for existing drugs.
The biomedical literature is an important source of knowledge on the function of genes and on the mechanisms by which these genes regulate cellular processes. Several text mining approaches have been developed to leverage this rich source of information by automatically extracting associations between concepts such as genes, diseases and drugs from a large body of text. Here, we describe a new method that extracts novel, not yet recognized associations between genes, diseases, drugs and cellular processes from the biomedical literature. Our method is built on the assumption that even if two concepts do not have a direct connection in literature, they may be functionally related if they are both connected to an overlapping set of concepts. Using this approach we predicted several novel connections between genes, diseases, drugs and pathways. Our results imply that our method is able to predict novel relationships from literature and, most importantly, that these newly identified relationships are biologically relevant. Our method can aid the drug discovery process where it can be used to find novel drug targets, increase insight in mode of action of a drug or find novel applications for known drugs.
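The ABC-principle mentioned in this abstract can be sketched in a few lines. This is a minimal illustration that ranks candidates by a raw count of shared B-intermediates; it is not the statistical scoring used by CoPub Discovery, and the function names and toy corpus are assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence(documents):
    """Map each concept to the set of concepts it co-occurs with in any document."""
    neighbours = defaultdict(set)
    for concepts in documents:
        for first, second in combinations(sorted(set(concepts)), 2):
            neighbours[first].add(second)
            neighbours[second].add(first)
    return neighbours


def hidden_links(neighbours, a):
    """Rank concepts C that never co-occur with A by their shared B-intermediates."""
    candidates = {}
    for b in neighbours[a]:
        for c in neighbours[b]:
            # Keep C only if it has no direct co-occurrence with A.
            if c != a and c not in neighbours[a]:
                candidates.setdefault(c, set()).add(b)
    # Most shared intermediates first; ties broken alphabetically.
    return sorted(candidates.items(), key=lambda item: (-len(item[1]), item[0]))


# Hypothetical corpus: each document is the set of concepts it mentions.
docs = [
    {"fish oil", "blood viscosity"},
    {"fish oil", "platelet aggregation"},
    {"blood viscosity", "Raynaud's phenomenon"},
    {"platelet aggregation", "Raynaud's phenomenon"},
    {"platelet aggregation", "aspirin"},
]
links = hidden_links(build_cooccurrence(docs), "fish oil")
```

Each entry in `links` pairs a candidate C with the set of B-intermediates that connect it to A, so a reader can inspect why a hidden relationship was proposed.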
Affiliation(s)
- Raoul Frijters
- Computational Drug Discovery (CDD), Nijmegen Centre for Molecular Life Sciences (NCMLS), Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands
- Marianne van Vugt
- Department of Immune Therapeutics, Schering-Plough, Oss, The Netherlands
- Ruben Smeets
- Department of Immune Therapeutics, Schering-Plough, Oss, The Netherlands
- René van Schaik
- Department of Molecular Design & Informatics, Schering-Plough, Oss, The Netherlands
- Jacob de Vlieg
- Computational Drug Discovery (CDD), Nijmegen Centre for Molecular Life Sciences (NCMLS), Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands
- Department of Molecular Design & Informatics, Schering-Plough, Oss, The Netherlands
- Wynand Alkema
- Department of Molecular Design & Informatics, Schering-Plough, Oss, The Netherlands
|
38
|
Daley CA, Abbott A, Doyle PS, Nader GA, Larson S. A review of fatty acid profiles and antioxidant content in grass-fed and grain-fed beef. Nutr J 2010; 9:10. [PMID: 20219103 PMCID: PMC2846864 DOI: 10.1186/1475-2891-9-10] [Citation(s) in RCA: 384] [Impact Index Per Article: 27.4] [Received: 07/29/2009] [Accepted: 03/10/2010] [Indexed: 11/10/2022]
Abstract
Growing consumer interest in grass-fed beef products has raised a number of questions with regard to the perceived differences in nutritional quality between grass-fed and grain-fed cattle. Research spanning three decades suggests that grass-based diets can significantly improve the fatty acid (FA) composition and antioxidant content of beef, albeit with variable impacts on overall palatability. Grass-based diets have been shown to enhance total conjugated linoleic acid (CLA) (C18:2) isomers, trans vaccenic acid (TVA) (C18:1 t11), a precursor to CLA, and omega-3 (n-3) FAs on a g/g fat basis. While the overall concentration of total SFAs is not different between feeding regimens, grass-finished beef tends toward a higher proportion of cholesterol neutral stearic FA (C18:0), and less cholesterol-elevating SFAs such as myristic (C14:0) and palmitic (C16:0) FAs. Several studies suggest that grass-based diets elevate precursors for Vitamin A and E, as well as cancer fighting antioxidants such as glutathione (GT) and superoxide dismutase (SOD) activity as compared to grain-fed contemporaries. Fat conscious consumers will also prefer the overall lower fat content of a grass-fed beef product. However, consumers should be aware that the differences in FA content will also give grass-fed beef a distinct grass flavor and unique cooking qualities that should be considered when making the transition from grain-fed beef. In addition, the fat from grass-finished beef may have a yellowish appearance from the elevated carotenoid content (precursor to Vitamin A). It is also noted that grain-fed beef consumers may achieve similar intakes of both n-3 and CLA through the consumption of higher fat grain-fed portions.
Affiliation(s)
- Cynthia A Daley
- College of Agriculture, California State University, Chico, CA, USA
- Amber Abbott
- College of Agriculture, California State University, Chico, CA, USA
- Patrick S Doyle
- College of Agriculture, California State University, Chico, CA, USA
- Glenn A Nader
- University of California Cooperative Extension Service, Davis, CA, USA
- Stephanie Larson
- University of California Cooperative Extension Service, Davis, CA, USA
|
39
|
Effectiveness of Interventions of Specific Complaints of the Arm, Neck, or Shoulder (CANS). Clin J Pain 2009; 25:537-52. [DOI: 10.1097/ajp.0b013e31819ff52c] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 11/26/2022]
|
40
|
Hu X, Zhang X, Yoo I, Wang X, Feng J. Mining hidden connections among biomedical concepts from disjoint biomedical literature sets through semantic-based association rule. Int J Intell Syst 2009. [DOI: 10.1002/int.20396] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/05/2022]
|
41
|
García-Carrasco M, Jiménez-Hernández M, Escárcega RO, Mendoza-Pinto C, Pardo-Santos R, Levy R, Maldonado CG, Chávez GP, Cervera R. Treatment of Raynaud's phenomenon. Autoimmun Rev 2008; 8:62-8. [DOI: 10.1016/j.autrev.2008.07.002] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 07/02/2008] [Accepted: 07/12/2008] [Indexed: 10/21/2022]
|
42
|
Walton A. Efficacy of myofascial release techniques in the treatment of primary Raynaud's phenomenon. J Bodyw Mov Ther 2008; 12:274-80. [DOI: 10.1016/j.jbmt.2007.12.002] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 06/23/2007] [Revised: 08/28/2007] [Accepted: 12/04/2007] [Indexed: 12/01/2022]
|
43
|
Jelier R, Schuemie MJ, Veldhoven A, Dorssers LCJ, Jenster G, Kors JA. Anni 2.0: a multipurpose text-mining tool for the life sciences. Genome Biol 2008; 9:R96. [PMID: 18549479 PMCID: PMC2481428 DOI: 10.1186/gb-2008-9-6-r96] [Citation(s) in RCA: 97] [Impact Index Per Article: 6.1] [Received: 04/04/2008] [Revised: 04/07/2008] [Accepted: 06/12/2008] [Indexed: 01/19/2023]
Abstract
Anni 2.0 provides an ontology-based interface to MEDLINE. Anni 2.0 is an online tool to aid the biomedical researcher with a broad range of information needs. Anni provides an ontology-based interface to MEDLINE and retrieves documents and associations for several classes of biomedical concepts, including genes, drugs and diseases, with established text-mining technology. In this article we illustrate Anni's usability by applying the tool to two use cases: interpretation of a set of differentially expressed genes, and literature-based knowledge discovery.
Affiliation(s)
- Rob Jelier
- Department of Medical Informatics, Erasmus MC University Medical Center, Dr. Molewaterplein, Rotterdam, 3015 GE, The Netherlands.
|
44
|
Abstract
Raynaud's phenomenon is a common disorder with vasospasm of the digital arteries causing pallor with cyanosis and/or rubor. It can be primary (idiopathic), where it is not associated with other diseases, or secondary to several diseases or conditions, including connective tissue diseases, such as scleroderma and systemic lupus erythematosus. Raynaud's is often mild enough to not require treatment; however, with secondary Raynaud's there is not only vasospasm but also fixed blood vessel defects so the ischaemia can be more severe. Complications can include digital ulcers and could, rarely, lead to amputation. Treatment is often non-pharmacological including avoiding cold and smoking cessation. Calcium channel antagonists, such as nifedipine, are often considered when treatment is needed; however, adverse effects of these drugs can include hypotension, vasodilatation, peripheral oedema and headaches. Other treatments have been studied in randomised, controlled trials including classes of drugs, such as angiotensin II inhibitors, selective serotonin reuptake inhibitors, phosphodiesterase-5 inhibitors (e.g. sildenafil), nitrates (topical or oral; the latter can be limited by adverse effects, such as flushing, headache and hypotension), and for more serious Raynaud's or its complications prostacyclin agonists may be used. There are two large studies that demonstrate that endothelin receptor blockade with bosentan can reduce the number of new digital ulcers in scleroderma patients. However, it does not affect the healing period. Thus, Raynaud's is common and often requires non-pharmacological treatment. When secondary Raynaud's is suspected, such as Raynaud's with an older age at onset or other features of connective tissue disease, then an appropriate history, physical examination and laboratory tests may be indicated to reach an appropriate diagnosis. There have been advances in pharmacological treatment, but some of the treatments are limited by adverse effects.
Affiliation(s)
- Janet E Pope
- Department of Medicine, Division of Rheumatology, London, Ontario, Canada.
|
45
|
Wright CI, Kroner CI, Draijer R. Non-invasive methods and stimuli for evaluating the skin's microcirculation. J Pharmacol Toxicol Methods 2006; 54:1-25. [PMID: 16256378 DOI: 10.1016/j.vascn.2005.09.004] [Citation(s) in RCA: 77] [Impact Index Per Article: 4.3] [Received: 09/14/2005] [Accepted: 09/21/2005] [Indexed: 11/17/2022]
Abstract
Vessels in the skin are arranged into superficial and deep horizontal plexuses and they are involved in thermoregulation, oxygen and nutritional support. The skin has a large number of functions and broad appeal spanning basic mechanistic and clinical research. Indeed, the skin can be used as a marker of normal and impaired vascular control and, owing to its accessibility and frequent involvement, is easy to investigate non-invasively. A large number of non-invasive methods are available for investigating the skin, ranging from those that permit the visualisation of microvessels, to those that monitor blood flow or one of its derivatives (e.g., skin temperature and transcutaneous oxygen). Such methods can be combined with non-invasive, dynamic stimuli (e.g., the use of cold or warm stimuli, activation of the peripheral nervous system or local neuronal systems, and the topical application of vasoactive drugs) and this potentially enables the differentiation of underlying disorders (e.g., primary from secondary Raynaud's phenomenon) and also to quantify changes over time or following intervention. The present article outlines the non-invasive methods and dynamic tests that can be used to investigate the microcirculation of the skin.
Affiliation(s)
- C I Wright
- Unilever Food and Health Research Institute, Unilever R&D Vlaardingen, Olivier van Noortlaan 120, PO Box 114, 3130 AC Vlaardingen, The Netherlands.
|
46
|
|
47
|
Integration of Instance-Based Learning and Text Mining for Identification of Potential Virus/Bacterium as Bio-terrorism Weapons. Intelligence and Security Informatics 2006. [PMCID: PMC7114991 DOI: 10.1007/11760146_55] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2022]
Abstract
Some viruses and bacteria have been identified as bioterrorism weapons. However, many other viruses and bacteria are potential bioterrorism weapons. A system that automatically suggests potential bioterrorism weapons would help laypeople to discover these suspicious viruses and bacteria. In this paper we apply an instance-based learning and text mining approach to identify candidate viruses and bacteria as potential bioterrorism weapons from the biomedical literature. We first use text mining to identify topical terms for existing viruses (bacteria) from PubMed separately. Then, we use the term lists as instances to build matrices with the remaining viruses (bacteria) to discover how well the term lists describe them. Next, we build an algorithm to rank all remaining viruses (bacteria). We suspect that the higher a virus (bacterium) ranks, the more likely it is to be a potential bioterrorism weapon. Our findings are intended as a guide to the virus and bacterium literature to support further studies that might then lead to appropriate defense and public health measures.
|
48
|
Hu X. Mining novel connections from large online digital library using biomedical ontologies. Library Management 2005. [DOI: 10.1108/01435120510596107] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/17/2022]
|
49
|
|
50
|
Kantor P, Muresan G, Roberts F, Zeng DD, Wang FY, Chen H, Merkle RC. Mining Candidate Viruses as Potential Bio-terrorism Weapons from Biomedical Literature. Intelligence and Security Informatics 2005. [PMCID: PMC7120915 DOI: 10.1007/11427995_6] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/26/2022]
Abstract
In this paper we present a semantic-based data mining approach to identify candidate viruses as potential bio-terrorism weapons from the biomedical literature. We first identify all the possible properties of viruses as search keywords based on Geissler’s 13 criteria; the identified properties are then defined using MeSH terms. Then, we assign each property an importance weight based on domain experts’ judgment. After generating all the possible valid combinations of the properties, we search the biomedical literature, retrieving all the relevant documents. Next, our method extracts virus names from the downloaded documents for each search keyword and identifies the novel connection of the virus according to these 4 properties. If a virus is found in the different document sets obtained by several search keywords, it should be considered suspicious and treated as a candidate virus for bio-terrorism. Our findings are intended as a guide to the virus literature to support further studies that might then lead to appropriate defense and public health measures.
Affiliation(s)
- Paul Kantor
- Department of Library and Information Science, Rutgers University,
- Gheorghe Muresan
- School of Communication, Information and Library Studies, Rutgers University, 4 Huntington Street, 08901-1071 New Brunswick, NJ USA
- Fred Roberts
- Artificial Solutions, Altonaer Poststraße 13b, 22767 Hamburg, Germany
- Daniel D. Zeng
- MIS Department, University of Arizona, 85721 Tucson, AZ USA
- Fei-Yue Wang
- Institute of Automation, Chinese Academy of Sciences, Beijing, China
- Hsinchun Chen
- Department of Management Information Systems, Eller College of Management, The University of Arizona, 85721 AZ USA
- Ralph C. Merkle
- College of Computing, Georgia Tech Information Security Center, Georgia Institute of Technology, 801 Atlantic Drive, 30332-0280 Atlanta, GA USA
|