1
Novoa J, Chagoyen M, Benito C, Moreno FJ, Pazos F. PMIDigest: Interactive Review of Large Collections of PubMed Entries to Distill Relevant Information. Genes (Basel) 2023; 14:genes14040942. [PMID: 37107700 PMCID: PMC10137743 DOI: 10.3390/genes14040942] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2023] [Revised: 03/14/2023] [Accepted: 04/18/2023] [Indexed: 04/29/2023] Open
Abstract
Scientific knowledge is being accumulated in the biomedical literature at an unprecedented pace. The most widely used database with biomedicine-related article abstracts, PubMed, currently contains more than 36 million entries. Users performing searches in this database for a subject of interest face thousands of entries (articles) that are difficult to process manually. In this work, we present an interactive tool for automatically digesting large sets of PubMed articles: PMIDigest (PubMed IDs digester). The system allows for classification/sorting of articles according to different criteria, including the type of article and different citation-related figures. It also calculates the distribution of MeSH (medical subject headings) terms for categories of interest, providing a picture of the themes addressed in the set. These MeSH terms are highlighted in the article abstracts in different colors depending on the category. An interactive representation of the interarticle citation network is also presented in order to easily locate article "clusters" related to particular subjects, as well as their corresponding "hub" articles. In addition to PubMed articles, the system can also process a set of Scopus or Web of Science entries. In summary, with this system, the user can have a "bird's eye view" of a large set of articles and their main thematic tendencies and obtain additional information not evident in a plain list of abstracts.
Affiliation(s)
- Jorge Novoa
  - Computational Systems Biology Group, National Centre for Biotechnology (CNB-CSIC), Darwin, 3, 28049 Madrid, Spain
- Mónica Chagoyen
  - Computational Systems Biology Group, National Centre for Biotechnology (CNB-CSIC), Darwin, 3, 28049 Madrid, Spain
- Carlos Benito
  - Instituto de Gestión de la Innovación y del Conocimiento, INGENIO (CSIC and U. Politécnica de Valencia), Edificio 8E, Cam. de Vera, 46022 Valencia, Spain
- F Javier Moreno
  - Instituto de Investigación en Ciencias de la Alimentación (CIAL), CSIC-UAM, CEI (UAM+CSIC), Nicolás Cabrera, 9, 28049 Madrid, Spain
- Florencio Pazos
  - Computational Systems Biology Group, National Centre for Biotechnology (CNB-CSIC), Darwin, 3, 28049 Madrid, Spain
2
Khanali J, Malekpour MR, Kolahi AA. Improved dynamics of sharing research findings in the COVID-19 epidemic compared with the SARS and Ebola epidemics. BMC Public Health 2021; 21:105. [PMID: 33422049 PMCID: PMC7794630 DOI: 10.1186/s12889-020-10116-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/12/2020] [Accepted: 12/22/2020] [Indexed: 12/11/2022] Open
Abstract
BACKGROUND When a new or re-emergent pathogen, such as SARS-CoV-2, causes a major outbreak, rapid access to pertinent research findings is crucial for planning strategies and decision making. We investigated whether research results were shared faster in the COVID-19 epidemic than in the SARS and Ebola epidemics. We also investigated whether the most frequent topics differed before and after the COVID-19, SARS, and Ebola epidemics started. METHODS We used PubMed database search tools to determine the time it took for the number of articles to rise after the epidemics started and the most frequent topics assigned to the articles. RESULTS First, the rise in the number of articles occurred 6 weeks after the COVID-19 epidemic started, whereas it occurred 4 months after the SARS epidemic and 7 months after the Ebola epidemic started. Second, etiology, statistics & numerical data, and epidemiology were the three most frequent topics investigated in the COVID-19 epidemic. By contrast, etiology, microbiology, and genetics were the most frequently studied topics in the SARS epidemic, and statistics & numerical data, epidemiology, and prevention & control in the Ebola epidemic. Third, some topics were studied more frequently after the epidemics started. CONCLUSIONS Research results were shared much faster in the COVID-19 epidemic than in the SARS and Ebola epidemics, and the most frequent article topics differed among the three epidemics. Because time is valuable in controlling the spread of epidemics, the study highlights the need for more solutions that rapidly provide pertinent research findings in the fight against the next public health emergency.
Affiliation(s)
- Javad Khanali
  - Social Determinants of Health Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Mohammad-Reza Malekpour
  - Social Determinants of Health Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Ali-Asghar Kolahi
  - Social Determinants of Health Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran
3
Murugesu S, Galazis N, Jones BP, Chan M, Bracewell-Milnes T, Ahmed-Salim Y, Grewal K, Timmerman D, Yazbek J, Bourne T, Saso S. Evaluating the use of telemedicine in gynaecological practice: a systematic review. BMJ Open 2020; 10:e039457. [PMID: 33293306 PMCID: PMC7722813 DOI: 10.1136/bmjopen-2020-039457] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 01/13/2023] Open
Abstract
OBJECTIVES The aim of this systematic review is to examine the use of telemedicine in the delivery and teaching of gynaecological clinical practice. To our knowledge, no other systematic review has assessed this broad topic. DESIGN Systematic review of all studies investigating the use of telemedicine in the provision of gynaecological care and education. The search for eligible studies followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines and focused on three online databases: PubMed, Science Direct and SciFinder. ELIGIBILITY CRITERIA Only studies within gynaecology were considered for this review. Studies covering only obstetrics and with minimal information on gynaecology, or clinical medicine in general, were excluded. All English language, peer-reviewed human studies were included. Relevant studies published up to the date of final submission of this review were considered, with no restrictions on the publication year. DATA EXTRACTION AND SYNTHESIS Data extracted included author details, year of publication and country of the study, study aim, sample size, methodology, sample characteristics, outcome measures and a summary of findings. Data extraction and qualitative assessment were performed by the first author and cross-checked by the second author. The quality of each study was assessed using the Newcastle-Ottawa scale. RESULTS A literature search carried out in August 2020 yielded 313 records published between 1992 and 2018. Following a rigorous selection process, only 39 studies, published between 2000 and 2018, were included in this review. Of these, 19 assessed gynaecological clinical practice, eight assessed gynaecological education, one both, and 11 investigated the feasibility of telemedicine within gynaecological practice. 19 studies were classified as good, 12 fair and eight poor using the Newcastle-Ottawa scale.
Telecolposcopy and abortion care were two areas where telemedicine was found to be effective in potentially speeding up diagnosis as well as providing patients with a wide range of management options. Studies focusing on education demonstrated that telementoring could improve teaching in a range of scenarios such as live surgery and international teleconferencing. CONCLUSIONS The results of this review are promising and demonstrate that telemedicine has a role to play in improving clinical effectiveness and education within gynaecology. Its applications have been shown to be safe and effective in providing remote care and training. In the future, randomised controlled studies involving larger numbers of patients and operators with measurable outcomes are required in order to be able to draw reliable conclusions.
Affiliation(s)
- Sughashini Murugesu
  - Obstetrics and Gynaecology, Hillingdon Hospital NHS Trust, Uxbridge, UK
  - Queen Charlotte's Hospital, Imperial College Healthcare NHS Trust, London, UK
- Nicolas Galazis
  - Queen Charlotte's Hospital, Imperial College Healthcare NHS Trust, London, UK
  - Obstetrics and Gynaecology, Northwick Park Hospital, Harrow, London, UK
- Benjamin P Jones
  - Queen Charlotte's Hospital, Imperial College Healthcare NHS Trust, London, UK
  - Institute for Reproductive Development and Biology, Imperial College London, London, UK
- Maxine Chan
  - Queen Charlotte's Hospital, Imperial College Healthcare NHS Trust, London, UK
  - Institute for Reproductive Development and Biology, Imperial College London, London, UK
- Yousra Ahmed-Salim
  - Queen Charlotte's Hospital, Imperial College Healthcare NHS Trust, London, UK
- Karen Grewal
  - Queen Charlotte's Hospital, Imperial College Healthcare NHS Trust, London, UK
  - Institute for Reproductive Development and Biology, Imperial College London, London, UK
- Dirk Timmerman
  - Queen Charlotte's Hospital, Imperial College Healthcare NHS Trust, London, UK
  - Obstetrics and Gynaecology, University Hospitals KU Leuven, Leuven, Belgium
- Joseph Yazbek
  - Queen Charlotte's Hospital, Imperial College Healthcare NHS Trust, London, UK
- Tom Bourne
  - Queen Charlotte's Hospital, Imperial College Healthcare NHS Trust, London, UK
  - Institute for Reproductive Development and Biology, Imperial College London, London, UK
  - Obstetrics and Gynaecology, University Hospitals KU Leuven, Leuven, Belgium
- Srdjan Saso
  - Queen Charlotte's Hospital, Imperial College Healthcare NHS Trust, London, UK
  - Institute for Reproductive Development and Biology, Imperial College London, London, UK
4
Structure of communities in semantic networks of biomedical research on disparities in health and sexism. ACTA ACUST UNITED AC 2020; 40:702-721. [PMID: 33275349 PMCID: PMC7808772 DOI: 10.7705/biomedica.5182] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/20/2019] [Indexed: 01/12/2023]
Abstract
Introduction. As an initiative to improve the quality of health care, biomedical research has seen a growing trend toward the study of health disparities and sexism. Objective. To characterize the scientific evidence on health disparity, defined as the gap in the distribution of health, and on the possible sex bias in access to medical services. Materials and methods. The scientific literature in the Medline PubMed database was searched simultaneously for two fundamental descriptors: Healthcare disparities and Sexism. A main semantic network was then built, and several structural subunits (communities) were identified to analyze the patterns of information organization. The open-source program Cytoscape was used for network analysis and visualization, and MapEquation for community detection. In addition, purpose-built code was developed and is available in a public-access repository. Results. The corpus of the main network showed that terms related to heart diseases were the most frequently occurring descriptors of medical conditions. From the structural subunits, patterns of information related to public policy, health services, social determinants, and risk factors were identified, although these tended to remain only indirectly connected to the nodes related to medical conditions. Conclusions. The scientific evidence indicates that sex disparity does matter for the quality of care of many diseases, especially those related to the circulatory system. However, a gap is still perceived between the medical and the social factors that give rise to possible sex disparities.
5
Ilgisonis EV, Kiseleva OI, Lisitsa AV, Poverennaya EV, Toporkova MN, Ponomarenko EA. [Medical subject headings for the scientific groups evolution analysis on the example of academician A.I. Archakov's scientific school]. BIOMEDIT︠S︡INSKAI︠A︡ KHIMII︠A︡ 2020; 66:7-17. [PMID: 32116222 DOI: 10.18097/pbmc20206601007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
Abstract
This paper proposes a method of comparative analysis of scientific trajectories based on bibliographic profiles. The bibliographic profile ("meshprint") is a list of MeSH terms (key terms used to index articles in the PubMed), indicating the relative frequency of occurrence of each term in the scientist's articles. Comparison of personalized bibliographic profiles can be represented in the form of a semantic network, where the nodes are the names of scientists, and the relationships are proportional to the calculated measures of similarity of bibliographic profiles. The proposed method was used to analyze the semantic network of scientists united by the academic school of the academician A.I. Archakov. The results of the work allowed us to show the relationship between the scientific trajectories of one scientific school and to correlate the results with world trends.
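The profile construction described in this abstract can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names `meshprint` and `cosine_similarity` are hypothetical, and cosine similarity is an assumed stand-in, since the abstract does not say which similarity measure was used.

```python
from collections import Counter
from math import sqrt


def meshprint(articles):
    """Relative frequency of each MeSH term across one scientist's articles.

    `articles` is a list of MeSH-term lists, one list per article.
    A term is counted once per article it indexes.
    """
    counts = Counter(term for terms in articles for term in set(terms))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {term: count / total for term, count in counts.items()}


def cosine_similarity(profile_a, profile_b):
    """Cosine similarity between two profiles (an assumed measure)."""
    dot = sum(w * profile_b.get(t, 0.0) for t, w in profile_a.items())
    norm_a = sqrt(sum(w * w for w in profile_a.values()))
    norm_b = sqrt(sum(w * w for w in profile_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical input: two scientists, each with a few indexed articles.
scientist_a = meshprint([["Proteomics", "Humans"], ["Proteomics"]])
scientist_b = meshprint([["Proteomics", "Mass Spectrometry"]])
edge_weight = cosine_similarity(scientist_a, scientist_b)
```

In a network built this way, each scientist would be a node and `edge_weight` would set the strength of the link between two of them.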
Affiliation(s)
- O I Kiseleva
  - Institute of Biomedical Chemistry, Moscow, Russia
- A V Lisitsa
  - Institute of Biomedical Chemistry, Moscow, Russia
6
Spiro A, Fernández García J, Yanover C. Inferring new relations between medical entities using literature curated term co-occurrences. JAMIA Open 2020; 2:378-385. [PMID: 31984370 PMCID: PMC6951958 DOI: 10.1093/jamiaopen/ooz022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2019] [Revised: 06/05/2019] [Accepted: 06/08/2019] [Indexed: 11/17/2022] Open
Abstract
Objectives Identifying new relations between medical entities, such as drugs, diseases, and side effects, is typically a resource-intensive task, involving experimentation and clinical trials. The increased availability of related data and curated knowledge enables a computational approach to this task, notably by training models to predict likely relations. Such models rely on meaningful representations of the medical entities being studied. We propose a generic features vector representation that leverages co-occurrences of medical terms, linked with PubMed citations. Materials and Methods We demonstrate the usefulness of the proposed representation by inferring two types of relations: a drug causes a side effect and a drug treats an indication. To predict these relations and assess their effectiveness, we applied 2 modeling approaches: multi-task modeling using neural networks and single-task modeling based on gradient boosting machines and logistic regression. Results These trained models, which predict either side effects or indications, obtained significantly better results than baseline models that use a single direct co-occurrence feature. The results demonstrate the advantage of a comprehensive representation. Discussion Selecting the appropriate representation has an immense impact on the predictive performance of machine learning models. Our proposed representation is powerful, as it spans multiple medical domains and can be used to predict a wide range of relation types. Conclusion The discovery of new relations between various medical entities can be translated into meaningful insights, for example, related to drug development or disease understanding. Our representation of medical entities can be used to train models that predict such relations, thus accelerating healthcare-related discoveries.
Affiliation(s)
- Adam Spiro
  - Machine Learning for Healthcare and Life Sciences, Department of Health Informatics, IBM Research, Haifa, Israel
- Jonatan Fernández García
  - Machine Learning for Healthcare and Life Sciences, Department of Health Informatics, IBM Research, Haifa, Israel
- Chen Yanover
  - Machine Learning for Healthcare and Life Sciences, Department of Health Informatics, IBM Research, Haifa, Israel
7
Research status and hotspots of economic evaluation in nursing by co-word clustering analysis. FRONTIERS OF NURSING 2019. [DOI: 10.2478/fon-2019-0031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022] Open
Abstract
Objective
The aim of this study was to describe the research status and hotspots of economic evaluation (EE) in nursing using co-word cluster analysis.
Methods
The Medical Subject Heading (MeSH) term “cost–benefit analysis” was searched in PubMed, and the results were limited to nursing journals using the filter function. The author, country, year, journal, and keywords of the collected papers were extracted and exported to the Bicomb 2.0 system, where high-frequency terms and other data could be mined further. SPSS 19.0 was used for cluster analysis to generate a dendrogram.
Results
In all, 3,020 articles were found and 10,573 MeSH terms were detected; among them, 1,909 were MeSH major topics, which generated 42 high-frequency terms. The dendrogram showed seven clusters, representing seven research hotspots: skin administration, infection prevention, education program, nurse education and management, EE research, neoplasm patient, and extension of nurse function.
Conclusions
Nursing EE research covers multiple aspects of nursing and is an important indicator for decision-making. Although the number of papers is increasing, the quality of the studies is not promising. Further study may therefore be required to assess nurses’ knowledge of economic analysis methods and their attitude toward applying them in nursing research. More nursing economics courses could be offered in nursing schools or hospitals.
8
Research Trend Visualization by MeSH Terms from PubMed. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2018; 15:ijerph15061113. [PMID: 29848974 PMCID: PMC6025283 DOI: 10.3390/ijerph15061113] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 04/17/2018] [Revised: 05/28/2018] [Accepted: 05/29/2018] [Indexed: 11/17/2022]
Abstract
Motivation: PubMed is a primary source of biomedical information, comprising a search tool and the biomedical literature from MEDLINE (the US National Library of Medicine's premier bibliographic database), life science journals and online books. Complementary tools to PubMed have been developed to help users search for literature and acquire knowledge. However, these tools are insufficient to overcome the difficulties users face due to the proliferation of biomedical literature. A new method is needed for searching knowledge in the biomedical field. Methods: A new method is proposed in this study for visualizing recent research trends based on the documents retrieved for a search query given by the user. The Medical Subject Headings (MeSH) are used as the primary analytical element. MeSH terms are extracted from the literature and the correlations between them are calculated. A MeSH network, called MeSH Net, is generated as the final result based on the Pathfinder Network algorithm. Results: A case study to verify the proposed method was carried out on a research area defined by the search query (immunotherapy and cancer and "tumor microenvironment"). The MeSH Net generated by the method is in good agreement with the actual research activities in the research area (immunotherapy). Conclusion: A prototype application generating MeSH Net was developed. The application, which could be used as a "guide map for travelers", allows users to quickly and easily acquire knowledge of research trends. The combination of PubMed and MeSH Net is expected to be an effective complementary system for researchers in the biomedical field who experience difficulties with search and information analysis.
9
Peng Y, Bonifield G, Smalheiser NR. Gaps within the Biomedical Literature: Initial Characterization and Assessment of Strategies for Discovery. Front Res Metr Anal 2017; 2:3. [PMID: 29271976 PMCID: PMC5736374 DOI: 10.3389/frma.2017.00003] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 11/13/2022] Open
Abstract
Within well-established fields of biomedical science, we identify "gaps", topical areas of investigation that might be expected to occur but are missing. We define a field by carrying out a topical PubMed query, and analyze Medical Subject Headings by which the set of retrieved articles are indexed. Medical Subject headings (MeSH terms) which occur in >1% of the articles are examined pairwise to see how often they are predicted to co-occur within individual articles (assuming that they are independent of each other). A pair of MeSH terms that are predicted to co-occur in at least 10 articles, yet are not observed to co-occur in any article, are "gaps" and were studied further in a corpus of 10 disease-related article sets and 10 related to biological processes. Overall, articles that filled gaps were cited more heavily than non-gap-filling articles and were 61% more likely to be published in multidisciplinary high-impact journals. Nine different features of these "gaps" were characterized and tested to learn which, if any, correlate with the appearance of one or more articles containing both MeSH terms within the next five years. Several different types of gaps were identified, each having distinct combinations of predictive features: a) those arising as a byproduct of MeSH indexing rules; b) those having little biological meaning; c) those representing "low hanging fruit" for immediate exploitation; and d) those representing gaps across disciplines or sub-disciplines that do not talk to each other or work together. We have built a free, open tool called "Mine the Gap!" that identifies and characterizes the "gaps" for any PubMed query, which can be accessed via the Anne O'Tate value-added PubMed search interface (http://arrowsmith.psych.uic.edu/cgi-bin/arrowsmith_uic/AnneOTate.cgi).
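The gap criterion stated in this abstract can be illustrated with a short sketch. It assumes the independence model is the usual one, where the expected number of co-occurrences of terms A and B in N articles is N × P(A) × P(B) = n_A × n_B / N; the function name `find_gaps` and its parameters are hypothetical and do not come from the "Mine the Gap!" tool.

```python
from collections import Counter
from itertools import combinations


def find_gaps(articles, min_fraction=0.01, min_expected=10.0):
    """Return (term_a, term_b, expected) for term pairs that never co-occur.

    `articles` is a list of MeSH-term lists, one per retrieved article.
    Only terms indexing more than `min_fraction` of the articles are paired.
    """
    n = len(articles)
    if n == 0:
        return []
    term_sets = [set(terms) for terms in articles]
    term_counts = Counter(t for s in term_sets for t in s)
    # Observed co-occurrences, keyed by alphabetically sorted term pairs.
    pair_counts = Counter(
        pair for s in term_sets for pair in combinations(sorted(s), 2)
    )
    frequent = sorted(t for t, c in term_counts.items() if c / n > min_fraction)
    gaps = []
    for a, b in combinations(frequent, 2):
        # Expected co-occurrences if the two terms were independent.
        expected = term_counts[a] * term_counts[b] / n
        if expected >= min_expected and pair_counts[(a, b)] == 0:
            gaps.append((a, b, expected))
    return gaps


# Hypothetical corpus: terms "A" and "B" are both common but never co-occur.
corpus = [["A"]] * 40 + [["B"]] * 40 + [["A", "C"]] * 20
gaps = find_gaps(corpus)
```

With this toy corpus, "A" indexes 60 articles and "B" indexes 40 out of 100, so 24 co-occurrences would be expected under independence while none are observed, and the pair is reported as a gap.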
Affiliation(s)
- Yufang Peng
  - School of Information Management, Nanjing University, Nanjing, China
- Gary Bonifield
  - Department of Psychiatry, University of Illinois at Chicago, Chicago, IL 60612 USA
- Neil R. Smalheiser
  - Department of Psychiatry, University of Illinois at Chicago, Chicago, IL 60612 USA
10
The proportion of cancer-related entries in PubMed has increased considerably; is cancer truly "The Emperor of All Maladies"? PLoS One 2017; 12:e0173671. [PMID: 28282418 PMCID: PMC5345838 DOI: 10.1371/journal.pone.0173671] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 11/30/2016] [Accepted: 02/26/2017] [Indexed: 02/07/2023] Open
Abstract
In this work, the public database of biomedical literature PubMed was mined using queries with combinations of keywords and year restrictions. It was found that the proportion of Cancer-related entries per year in PubMed has risen from around 6% in 1950 to more than 16% in 2016. This increase is not shared by other conditions such as AIDS, Malaria, Tuberculosis, Diabetes, Cardiovascular, Stroke and Infection some of which have, on the contrary, decreased as a proportion of the total entries per year. Organ-related queries were performed to analyse the variation of some specific cancers. A series of queries related to incidence, funding, and relationship with DNA, Computing and Mathematics, were performed to test correlation between the keywords, with the hope of elucidating the cause behind the rise of Cancer in PubMed. Interestingly, the proportion of Cancer-related entries that contain “DNA”, “Computational” or “Mathematical” have increased, which suggests that the impact of these scientific advances on Cancer has been stronger than in other conditions. It is important to highlight that the results obtained with the data mining approach here presented are limited to the presence or absence of the keywords on a single, yet extensive, database. Therefore, results should be observed with caution. All the data used for this work is publicly available through PubMed and the UK’s Office for National Statistics. All queries and figures were generated with the software platform Matlab and the files are available as supplementary material.
11
Rodriguez RW. Comparison of indexing times among articles from medical, nursing, and pharmacy journals. Am J Health Syst Pharm 2017; 73:569-75. [PMID: 27045069 DOI: 10.2146/ajhp150319] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Indexed: 11/23/2022] Open
Abstract
PURPOSE Results of an analysis of the times to indexing of articles published in medical, nursing, and pharmacy journals are reported. METHODS MEDLINE data were retrieved for articles published in selected general practice medical, nursing, and pharmacy journals and entered into the PubMed system in 2012 and 2013. Collected data included PubMed entry date, date of indexing with Medical Subject Headings (MeSH) terms, and publication characteristics. Survival analysis was performed to assess the primary outcome of time to indexing. Cox proportional hazards models were developed to assess the effect of healthcare discipline and source journal on the primary outcome. RESULTS Data were collected for 19,259 articles, of which 78.7%, 12.6%, and 8.7% originated from medical, nursing, and pharmacy journals, respectively. For medical, pharmacy, and nursing journals, 97.8%, 90.8%, and 50.1% of articles, respectively, were indexed within one year of PubMed entry; the corresponding median (interquartile range) times to indexing were 52 (20-68), 186 (150-246), and 252 (168-301) days. Unadjusted hazard ratios derived from Cox models indicated that indexing within one year was significantly less likely for articles published in pharmacy or nursing journals versus medical journals and for articles from all evaluated journals versus a designated reference publication (New England Journal of Medicine). CONCLUSION Analysis of major medical, nursing, and pharmacy journals found that articles from nursing and pharmacy journals were indexed with MeSH terms more slowly than articles from medical journals. Journal identity was significantly associated with time to indexing.
Affiliation(s)
- Ryan W Rodriguez
  - College of Pharmacy, University of Illinois at Chicago, Chicago, IL
12
Kim S, Yeganova L, Wilbur WJ. Meshable: searching PubMed abstracts by utilizing MeSH and MeSH-derived topical terms. Bioinformatics 2016; 32:3044-6. [PMID: 27288493 PMCID: PMC5039918 DOI: 10.1093/bioinformatics/btw331] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 11/03/2015] [Accepted: 05/22/2016] [Indexed: 11/17/2022] Open
Abstract
Summary: Medical Subject Headings (MeSH®) is a controlled vocabulary for indexing and searching biomedical literature. MeSH terms and subheadings are organized in a hierarchical structure and are used to indicate the topics of an article. Biologists can use either MeSH terms as queries or the MeSH interface provided in PubMed® for searching PubMed abstracts. However, these are rarely used, and there is no convenient way to link standardized MeSH terms to user queries. Here, we introduce a web interface which allows users to enter queries to find MeSH terms closely related to the queries. Our method relies on co-occurrence of text words and MeSH terms to find keywords that are related to each MeSH term. A query is then matched with the keywords for MeSH terms, and candidate MeSH terms are ranked based on their relatedness to the query. The experimental results show that our method achieves the best performance among several term extraction approaches in terms of topic coherence. Moreover, the interface can be effectively used to find full names of abbreviations and to disambiguate user queries. Availability and Implementation:https://www.ncbi.nlm.nih.gov/IRET/MESHABLE/ Contact:sun.kim@nih.gov Supplementary information:Supplementary data are available at Bioinformatics online.
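The general idea of ranking MeSH terms by the co-occurrence of text words and MeSH terms can be sketched as below. This is only a toy illustration under loud assumptions: the scoring (fraction of a term's documents that contain a word) is a simple stand-in and not the relatedness measure used by Meshable, and the names `build_keyword_index` and `rank_mesh_terms` are hypothetical.

```python
from collections import Counter, defaultdict


def build_keyword_index(records):
    """Map each MeSH term to {word: fraction of its documents with that word}.

    `records` is an iterable of (text, mesh_terms) pairs.
    """
    word_counts = defaultdict(Counter)
    docs_per_term = Counter()
    for text, mesh_terms in records:
        words = set(text.lower().split())
        for term in set(mesh_terms):
            docs_per_term[term] += 1
            word_counts[term].update(words)
    return {
        term: {w: c / docs_per_term[term] for w, c in counts.items()}
        for term, counts in word_counts.items()
    }


def rank_mesh_terms(query, index):
    """Rank MeSH terms by the summed weights of the query words (assumed score)."""
    query_words = set(query.lower().split())
    scores = {
        term: sum(weights.get(w, 0.0) for w in query_words)
        for term, weights in index.items()
    }
    matched = [(term, score) for term, score in scores.items() if score > 0]
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(matched, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical records: (abstract text, MeSH terms assigned to the article).
records = [
    ("heart attack risk", ["Myocardial Infarction"]),
    ("attack of the heart muscle", ["Myocardial Infarction"]),
    ("heart failure therapy", ["Heart Failure"]),
]
ranking = rank_mesh_terms("heart attack", build_keyword_index(records))
```

Here both query words appear in every "Myocardial Infarction" document, so that term ranks above "Heart Failure", which matches only "heart".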
Affiliation(s)
- Sun Kim
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Lana Yeganova
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- W John Wilbur
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
13
Yu Z, Bernstam E, Cohen T, Wallace BC, Johnson TR. Improving the utility of MeSH® terms using the TopicalMeSH representation. J Biomed Inform 2016; 61:77-86. [PMID: 27001195 PMCID: PMC4893983 DOI: 10.1016/j.jbi.2016.03.013] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 07/08/2015] [Revised: 03/16/2016] [Accepted: 03/17/2016] [Indexed: 11/30/2022]
Abstract
OBJECTIVE To evaluate whether vector representations encoding latent topic proportions that capture similarities to MeSH terms can improve performance on biomedical document retrieval and classification tasks, compared to using MeSH terms. MATERIALS AND METHODS We developed the TopicalMeSH representation, which exploits the 'correspondence' between topics generated using latent Dirichlet allocation (LDA) and MeSH terms to create new document representations that combine MeSH terms and latent topic vectors. We used 15 systematic drug review corpora to evaluate performance on information retrieval and classification tasks using this TopicalMeSH representation, compared to using standard encodings that rely on either (1) the original MeSH terms, (2) the text, or (3) their combination. For the document retrieval task, we compared the precision and recall achieved by ranking citations using MeSH and TopicalMeSH representations, respectively. For the classification task, we considered three supervised machine learning approaches, Support Vector Machines (SVMs), logistic regression, and decision trees. We used these to classify documents as relevant or irrelevant using (independently) MeSH, TopicalMeSH, Words (i.e., n-grams extracted from citation titles and abstracts, encoded via bag-of-words representation), a combination of MeSH and Words, and a combination of TopicalMeSH and Words. We also used SVM to compare the classification performance of tf-idf weighted MeSH terms, LDA Topics, a combination of Topics and MeSH, and TopicalMeSH to supervised LDA's classification performance. RESULTS For the document retrieval task, using the TopicalMeSH representation resulted in higher precision than MeSH in 11 of 15 corpora while achieving the same recall. For the classification task, use of TopicalMeSH features realized a higher F1 score in 14 of 15 corpora when used by SVMs, 12 of 15 corpora using logistic regression, and 12 of 15 corpora using decision trees. 
TopicalMeSH also had better document classification performance on 12 of 15 corpora when compared to Topics, tf-idf weighted MeSH terms, and a combination of Topics and MeSH using SVMs. Supervised LDA achieved the worst performance in most of the corpora. CONCLUSION The proposed TopicalMeSH representation (which combines MeSH terms with latent topics) consistently improved performance on document retrieval and classification tasks compared with MeSH terms alone, as well as with several standard alternative approaches.
Affiliation(s)
- Zhiguo Yu: School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Elmer Bernstam: School of Biomedical Informatics and Department of Internal Medicine, McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Trevor Cohen: School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Byron C Wallace: School of Information, University of Texas at Austin, Austin, TX, USA
- Todd R Johnson: School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
|
14
|
Two Similarity Metrics for Medical Subject Headings (MeSH): An Aid to Biomedical Text Mining and Author Name Disambiguation. JOURNAL OF BIOMEDICAL DISCOVERY AND COLLABORATION 2016; 7:e1. [PMID: 27213780 PMCID: PMC4845330 DOI: 10.5210/disco.v7i0.6654] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 02/02/2016] [Accepted: 04/01/2016] [Indexed: 11/24/2022]
Abstract
In the present paper, we have created and characterized several similarity metrics for relating any two Medical Subject Headings (MeSH terms) to each other. The article-based metric measures the tendency of two MeSH terms to appear in the MEDLINE record of the same article. The author-based metric measures the tendency of two MeSH terms to appear in the body of articles written by the same individual (using the 2009 Author-ity author name disambiguation dataset as a gold standard). The two metrics are only modestly correlated with each other (r = 0.50), indicating that they capture different aspects of term usage. The article-based metric provides a measure of semantic relatedness, and MeSH term pairs that co-occur more often than expected by chance may reflect relations between the two terms. In contrast, the author metric is indicative of how individuals practice science, and may have value for author name disambiguation and studies of scientific discovery. We have calculated article metrics for all MeSH terms appearing in at least 25 articles in MEDLINE (as of 2014) and author metrics for MeSH terms published as of 2009. The dataset is freely available for download and can be queried at http://arrowsmith.psych.uic.edu/arrowsmith_uic/mesh_pair_metrics.html.
Handling editor: Elizabeth Workman, MLIS, PhD.
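Illustrative note: the idea of term pairs that "co-occur more often than expected by chance" can be sketched as an observed/expected ratio over per-article MeSH term sets. This is only a minimal sketch under an independence assumption, not the metric defined in the paper; the function name `cooccurrence_ratios` and the toy articles are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_ratios(articles):
    """Return the observed/expected co-occurrence ratio for each term pair.

    `articles` is a list of sets of MeSH terms, one set per article.
    The expected count assumes the two terms are assigned independently,
    which is an illustrative simplification.
    """
    n = len(articles)
    term_counts = Counter()
    pair_counts = Counter()
    for terms in articles:
        term_counts.update(terms)
        # Sort so each unordered pair always gets the same key.
        pair_counts.update(combinations(sorted(terms), 2))

    ratios = {}
    for (a, b), observed in pair_counts.items():
        expected = term_counts[a] * term_counts[b] / n
        ratios[(a, b)] = observed / expected
    return ratios


# Hypothetical toy data: four articles with their MeSH terms.
toy_articles = [
    {"Humans", "Neoplasms", "Mutation"},
    {"Humans", "Neoplasms"},
    {"Humans", "Mice"},
    {"Mice", "Mutation"},
]
ratios = cooccurrence_ratios(toy_articles)
```

A ratio above 1 suggests the pair appears together more often than independence would predict; pairs that never co-occur are simply absent from the result.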
|
15
|
Cleverley PH, Burnett S. Retrieving haystacks: a data driven information needs model for faceted search. J Inf Sci 2014. [DOI: 10.1177/0165551514554522] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 11/16/2022]
Abstract
The research aim was to develop an understanding of information need characteristics for word co-occurrence-based search result filters (facets). No prior research has been identified into what enterprise searchers may find useful for exploratory search and why. Various word co-occurrence techniques were applied to results from sample queries performed on industry membership content. The results were used in an international survey of 54 practising petroleum engineers from 32 organizations. Subject familiarity, job role, personality and query specificity are possible causes for survey response variation. An information needs model is presented: Broad, Rich, Intriguing, Descriptive, General, Expert and Situational (BRIDGES). This may help professionals to more effectively meet their information needs and stimulate new needs, improving a system’s ability to facilitate serendipity. This research has implications for faceted search in enterprise search and digital library deployments.
|
16
|
Rodriguez RW. Delay in indexing articles published in major pharmacy practice journals. Am J Health Syst Pharm 2014; 71:321-4. [DOI: 10.2146/ajhp130421] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Indexed: 11/23/2022] Open
|
17
|
Mirel B, Song J, Tonks JS, Meng F, Xuan W, Ameziane R. Studying PubMed usages in the field for complex problem solving: Implications for tool design. JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY : JASIST 2013; 64. [PMID: 24376375 DOI: 10.1002/asi.22796] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/08/2022]
Abstract
Many recent studies on MEDLINE-based information seeking have shed light on scientists' behaviors and associated tool innovations that may improve efficiency and effectiveness. Few if any studies, however, examine scientists' problem-solving uses of PubMed in actual contexts of work and corresponding needs for better tool support. Addressing this gap, we conducted a field study of novice scientists (14 upper level undergraduate majors in molecular biology) as they engaged in a problem solving activity with PubMed in a laboratory setting. Findings reveal many common stages and patterns of information seeking across users as well as variations, especially variations in cognitive search styles. Based on findings, we suggest tool improvements that both confirm and qualify many results found in other recent studies. Our findings highlight the need to use results from context-rich studies to inform decisions in tool design about when to offer improved features to users.
Affiliation(s)
- Barbara Mirel: School of Education, University of Michigan, Ann Arbor, Michigan 48109 734-332-8969
- Jean Song: Health Sciences Library, University of Michigan, Ann Arbor, Michigan 48109 734-936-1401
- Fan Meng: Psychiatry Department, University of Michigan Medical School, Ann Arbor, Michigan 734-615-7099
- Weijian Xuan: Psychiatry Department, University of Michigan Medical School, Ann Arbor, Michigan 734-615-7009
- Rafiqa Ameziane: Molecular, Cellular, and Developmental Biology Department, University of Michigan 48109 734-764-7427
|