Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Zhang Y, Lu J, Liu F, Liu Q, Porter A, Chen H, Zhang G. Does deep learning help topic extraction? A kernel k-means clustering method with word embedding. J Informetr 2018. [DOI: 10.1016/j.joi.2018.09.004] [Citation(s) in RCA: 58] [Impact Index Per Article: 9.7] [Indexed: 11/28/2022]

Cited by Other Article(s)

Porter AL, Zhang Y, Newman NC. Tech mining: a revisit and navigation. Front Res Metr Anal 2024;9:1364053. [PMID: 38741784 PMCID: PMC11089556 DOI: 10.3389/frma.2024.1364053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/01/2024] [Accepted: 04/11/2024] [Indexed: 05/16/2024]

Jiang H, Fan S, Zhang N, Zhu B. Deep learning for predicting patent application outcome: The fusion of text and network embeddings. J Informetr 2023. [DOI: 10.1016/j.joi.2023.101402] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/31/2023]

Chen X, Ye P, Huang L, Wang C, Cai Y, Deng L, Ren H. Exploring science-technology linkages: A deep learning-empowered solution. Inf Process Manag 2023. [DOI: 10.1016/j.ipm.2022.103255] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022]

Rajagopal P, Aghris T, Fettah FE, Ravana SD. Clustering of Relevant Documents Based on Findability Effort in Information Retrieval. International Journal of Information Retrieval Research 2023. [DOI: 10.4018/ijirr.315764] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/07/2023]

Knisely BM, Pavliscsak HH. Research proposal content extraction using natural language processing and semi-supervised clustering: A demonstration and comparative analysis. Scientometrics 2023;128:3197-3224. [PMID: 37101971 PMCID: PMC10083066 DOI: 10.1007/s11192-023-04689-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2022] [Accepted: 03/07/2023] [Indexed: 04/28/2023]

Abstract

Funding institutions often solicit text-based research proposals to evaluate potential recipients. Leveraging the information contained in these documents could help institutions understand the supply of research within their domain. In this work, an end-to-end methodology for semi-supervised document clustering is introduced to partially automate classification of research proposals based on thematic areas of interest. The methodology consists of three stages: (1) manual annotation of a document sample; (2) semi-supervised clustering of documents; (3) evaluation of cluster results using quantitative metrics and qualitative ratings (coherence, relevance, distinctiveness) by experts. The methodology is described in detail to encourage replication and is demonstrated on a real-world data set. This demonstration sought to categorize proposals submitted to the US Army Telemedicine and Advanced Technology Research Center (TATRC) related to technological innovations in military medicine. A comparative analysis of method features was performed, including unsupervised vs. semi-supervised clustering, several document vectorization techniques, and several cluster result selection strategies. Outcomes suggest that pretrained Bidirectional Encoder Representations from Transformers (BERT) embeddings were better suited for the task than older text embedding techniques. When comparing expert ratings between algorithms, semi-supervised clustering produced coherence ratings ~ 25% better on average compared to standard unsupervised clustering with negligible differences in cluster distinctiveness. Last, it was shown that a cluster result selection strategy that balances internal and external validity produced ideal results. With further refinement, this methodological framework shows promise as a useful analytical tool for institutions to unlock hidden insights from untapped archives and similar administrative document repositories.

Supplementary Information

The online version contains supplementary material available at 10.1007/s11192-023-04689-3.

Academic collaborations: a recommender framework spanning research interests and network topology. Scientometrics 2022. [DOI: 10.1007/s11192-022-04555-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/27/2022]

Chen L, Xu S, Zhu L, Zhang J, Yang G, Xu H. A deep learning based method benefiting from characteristics of patents for semantic relation classification. J Informetr 2022. [DOI: 10.1016/j.joi.2022.101312] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]

Huang L, Cai Y, Zhao E, Zhang S, Shu Y, Fan J. Measuring the interdisciplinarity of Information and Library Science interactions using citation analysis and semantic analysis. Scientometrics 2022. [DOI: 10.1007/s11192-022-04401-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/25/2022]

Reviewer recommendation method for scientific research proposals: a case for NSFC. Scientometrics 2022. [DOI: 10.1007/s11192-022-04389-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]

Network dynamics in university-industry collaboration: a collaboration-knowledge dual-layer network perspective. Scientometrics 2022. [DOI: 10.1007/s11192-022-04330-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/26/2022]

A methodology for identifying breakthrough topics using structural entropy. Inf Process Manag 2022. [DOI: 10.1016/j.ipm.2021.102862] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 01/20/2023]

Identification of topic evolution: network analytics with piecewise linear representation and word embedding. Scientometrics 2022. [DOI: 10.1007/s11192-022-04273-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 01/01/2023]

Jin Q, Chen H, Wang X, Ma T, Xiong F. Exploring funding patterns with word embedding-enhanced organization–topic networks: a case study on big data. Scientometrics 2022. [DOI: 10.1007/s11192-021-04253-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/10/2023]

How do people view COVID-19 vaccines- Analyses on tweets about COVID-19 vaccines using Natural Language Processing and Sentiment Analysis. Journal of Global Information Management 2022. [DOI: 10.4018/jgim.300817] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]

Yoon B, Kim S, Kim S, Seol H. Doc2vec-based link prediction approach using SAO structures: application to patent network. Scientometrics 2021. [DOI: 10.1007/s11192-021-04187-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/01/2022]

Zhang Y, Wu M, Miao W, Huang L, Lu J. Bi-layer network analytics: A methodology for characterizing emerging general-purpose technologies. J Informetr 2021. [DOI: 10.1016/j.joi.2021.101202] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 10/20/2022]

Xie Q, Zhang X, Song M. A network embedding-based scholar assessment indicator considering four facets: Research topic, author credit allocation, field-normalized journal impact, and published time. J Informetr 2021. [DOI: 10.1016/j.joi.2021.101201] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 01/06/2023]

Dynamic network analytics for recommending scientific collaborators. Scientometrics 2021. [DOI: 10.1007/s11192-021-04164-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/20/2022]

Topic-level sentiment analysis of social media data using deep learning. Appl Soft Comput 2021. [DOI: 10.1016/j.asoc.2021.107440] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Indexed: 11/17/2022]

Zhao D, Strotmann A. Intellectual structure of information science 2011–2020: an author co-citation analysis. Journal of Documentation 2021. [DOI: 10.1108/jd-06-2021-0119] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/17/2022]

Abstract

Purpose: This study continues a long history of author co-citation analysis of the intellectual structure of information science into the time period of 2011–2020. It also examines changes in this structure from 2006–2010 through 2011–2015 to 2016–2020. Results will contribute to a better understanding of the information science research field.

Design/methodology/approach: The well-established procedures and techniques for author co-citation analysis were followed. Full records of research articles in core information science journals published during 2011–2020 were retrieved and downloaded from the Web of Science database. About 150 of the most highly cited authors in each of the two five-year time periods were selected from this dataset to represent this field, and their co-citation counts were calculated. Each co-citation matrix was input into SPSS for factor analysis, and results were visualized in Pajek. Factors were interpreted as specialties and labeled upon an examination of articles written by authors who load primarily on each factor.

Findings: The two-camp structure of information science continued to be clearly present. Bibliometric indicators for research evaluation dominated the Knowledge Domain Analysis camp during both five-year time periods, whereas interactive information retrieval (IR) dominated the IR camp during 2011–2015 but shared dominance with information behavior during 2016–2020. Bridging between the two camps became increasingly weaker and was only provided by the scholarly communication specialty during 2016–2020. The IR systems specialty drifted further away from the IR camp. The information behavior specialty experienced a deep slump during 2011–2020 in its evolution process. Altmetrics grew to dominate the Webometrics specialty and brought it to a sharp increase during 2016–2020.

Originality/value: Author co-citation analysis (ACA) is effective in revealing intellectual structures of research fields. Most related studies used term-based methods to identify individual research topics but did not examine the interrelationships between these topics or the overall structure of the field. The few studies that did discuss the overall structure paid little attention to the effect of changes to the source journals on the results. The present study does not have these problems and continues the long history of benchmark contributions to a better understanding of the information science field using ACA.
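The co-citation counting step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes that two authors are co-cited once for every citing paper whose reference list contains both of them; the abstract does not state the exact counting convention, and the function name `cocitation_counts` and the toy data are hypothetical. The later steps (factor analysis in SPSS, visualization in Pajek) are not reproduced here.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count how often each pair of authors is cited by the same paper.

    reference_lists: iterable of iterables; each inner iterable holds the
    cited authors found in one citing paper's reference list.
    """
    counts = Counter()
    for cited in reference_lists:
        # Deduplicate within a paper so a pair counts once per citing paper
        # (an assumption), and sort so each pair has one canonical order.
        for pair in combinations(sorted(set(cited)), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: three citing papers and the authors they cite.
papers = [
    ["Author A", "Author B", "Author C"],
    ["Author B", "Author C"],
    ["Author A", "Author C", "Author C"],
]
counts = cocitation_counts(papers)
```

Each key of `counts` is an alphabetically ordered author pair, so the result can be turned into a symmetric co-citation matrix by filling both `(a, b)` and `(b, a)` cells with the same value.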

A Topic Detection Method Based on Word-attention Networks. Journal of Data and Information Science 2021. [DOI: 10.2478/jdis-2021-0032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022]

CSO Classifier 3.0: a scalable unsupervised method for classifying documents in terms of research topics. International Journal on Digital Libraries 2021. [DOI: 10.1007/s00799-021-00305-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]

Embedding-based Detection and Extraction of Research Topics from Academic Documents Using Deep Clustering. Journal of Data and Information Science 2021. [DOI: 10.2478/jdis-2021-0024] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/21/2022]

Abstract

Purpose: Detection of research fields or topics and understanding the dynamics help the scientific community in their decisions regarding the establishment of scientific fields. This also helps in having a better collaboration with governments and businesses. This study aims to investigate the development of research fields over time, translating it into a topic detection problem.

Design/methodology/approach: To achieve the objectives, we propose a modified deep clustering method to detect research trends from the abstracts and titles of academic documents. Document embedding approaches are utilized to transform documents into vector-based representations. The proposed method is evaluated by comparing it with a combination of different embedding and clustering approaches and the classical topic modeling algorithms (i.e. LDA) against a benchmark dataset. A case study is also conducted exploring the evolution of Artificial Intelligence (AI), detecting the research topics or sub-fields in related AI publications.

Findings: Evaluating the performance of the proposed method using clustering performance indicators shows that it outperforms similar approaches on the benchmark dataset. Using the proposed method, we also show how the topics have evolved over the past 30 years, taking advantage of a keyword extraction method for cluster tagging and labeling, demonstrating the context of the topics.

Research limitations: We noticed that it is not possible to generalize one solution for all downstream tasks. Hence, it is required to fine-tune or optimize the solutions for each task and even each dataset. In addition, interpretation of cluster labels can be subjective and vary based on the readers’ opinions. It is also very difficult to evaluate the labeling techniques, rendering the explanation of the clusters further limited.

Practical implications: As demonstrated in the case study, we show how, in a real-world example, the proposed method would enable researchers and reviewers of academic research to detect, summarize, analyze, and visualize research topics from decades of academic documents. This helps the scientific community and all related organizations in fast and effective analysis of the fields, by establishing and explaining the topics.

Originality/value: In this study, we introduce a modified and tuned deep embedding clustering coupled with Doc2Vec representations for topic extraction. We also use a concept extraction method as a labeling approach in this study. The effectiveness of the method has been evaluated in a case study of AI publications, where we analyze the AI topics during the past three decades.

A deep-learning based citation count prediction model with paper metadata semantic features. Scientometrics 2021. [DOI: 10.1007/s11192-021-04033-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/12/2023]

Zhang Y, Wu M, Tian GY, Zhang G, Lu J. Ethics and privacy of artificial intelligence: Understandings from bibliometrics. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2021.106994] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 11/15/2022]

Zhang Y, Wu M, Hu Z, Ward R, Zhang X, Porter A. Profiling and predicting the problem-solving patterns in China’s research systems: A methodology of intelligent bibliometrics and empirical insights. Quantitative Science Studies 2021. [DOI: 10.1162/qss_a_00100] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/04/2022]

Zhang Y, Cai X, Fry CV, Wu M, Wagner CS. Topic evolution, disruption and resilience in early COVID-19 research. Scientometrics 2021;126:4225-4253. [PMID: 33776163 PMCID: PMC7980735 DOI: 10.1007/s11192-021-03946-7] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 09/16/2020] [Accepted: 03/05/2021] [Indexed: 11/25/2022]

Chowdhury K. Functional analysis of generalized linear models under non-linear constraints with applications to identifying highly-cited papers. J Informetr 2021. [DOI: 10.1016/j.joi.2020.101112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]

A deep learning framework to early identify emerging technologies in large-scale outlier patents: an empirical study of CNC machine tool. Scientometrics 2021. [DOI: 10.1007/s11192-020-03797-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 10/22/2022]

Validation of the Astro dataset clustering solutions with external data. Scientometrics 2020. [DOI: 10.1007/s11192-020-03780-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/22/2022]

Huangfu C, Zeng Y, Wang Y. Creating Neuroscientific Knowledge Organization System Based on Word Representation and Agglomerative Clustering Algorithm. Front Neuroinform 2020;14:38. [PMID: 33013345 PMCID: PMC7461893 DOI: 10.3389/fninf.2020.00038] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/30/2019] [Accepted: 07/17/2020] [Indexed: 11/24/2022]

Identification of highly-cited papers using topic-model-based and bibliometric features: the consideration of keyword popularity. J Informetr 2020. [DOI: 10.1016/j.joi.2019.101004] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 11/20/2022]

Zhou Y, Dong F, Liu Y, Li Z, Du J, Zhang L. Forecasting emerging technologies using data augmentation and deep learning. Scientometrics 2020. [DOI: 10.1007/s11192-020-03351-6] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Indexed: 11/30/2022]

Jiang Y. Semantifying formal concept analysis using description logics. Knowl Based Syst 2019. [DOI: 10.1016/j.knosys.2019.104967] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 10/26/2022]

Uncovering diffusion trends in computer science and physics publications. Library Hi Tech 2019. [DOI: 10.1108/lht-07-2018-0097] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 11/17/2022]