1
|
Yeung AWK. A revisit to the specification of sub-datasets and corresponding coverage timespans when using Web of Science Core Collection. Heliyon 2023; 9:e21527. [PMID: 38027607 PMCID: PMC10665658 DOI: 10.1016/j.heliyon.2023.e21527] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Revised: 10/22/2023] [Accepted: 10/23/2023] [Indexed: 12/01/2023] Open
Abstract
Many papers used the Web of Science Core Collection (WOSCC) as the data source. The work by Liu (2019, Scientometrics) revealed that only 48 % of such papers published in information science and library science journals during 2017-2018 specified which sub-datasets they used, and subsequently urged researchers to provide such information together with the corresponding coverage timespans to improve transparency and reproducibility. This work revisited this issue to reveal if the current condition has improved following Liu's recommendations. Using WOSCC, 934 bibliometric open-access papers published during 2020-2022 in non-information science and library science journals were evaluated. Of these 934 papers, 45.0 % specified the sub-datasets of WOSCC they used for data collection, and 4.8 % specified the coverage timespan(s) of corresponding sub-datasets or the overall dataset. There seemed to be no improvement in the specification of data source using WOSCC.
Affiliation(s)
- Andy Wai Kan Yeung
- Oral and Maxillofacial Radiology, Applied Oral Sciences and Community Dental Care, Faculty of Dentistry, University of Hong Kong, Hong Kong, China
|
2
|
Collaboration prediction based on multilayer all-author tripartite citation networks: A case study of gene editing. J Informetr 2023. [DOI: 10.1016/j.joi.2022.101374] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/13/2023]
|
3
|
Orduña-Malea E, Aguillo IF. Can we use link-based indicators to find highly cited publications? The case of the Trust Flow score. J Inf Sci 2022. [DOI: 10.1177/01655515221141032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2022]
Abstract
The Majestic’s Trust Flow (TF) is a link-based score aimed at measuring the influence of online objects (e.g. scientific publications) by considering the weighted number of links received from trusted websites. This study describes the bibliographic characteristics and impact of those publications with the highest TF score. In order to do this, 20,810 URL-based Digital Object Identifiers (DOIs) were identified and analysed. The results show that these DOIs mainly represent recent publications (57.1% of publications were published between 2010 and 2020), journal articles (93.75%) published in the first SCImago Journal Rank (SJR) quartile (81.7%), written with international collaboration (40.4%) and biased towards the field of medicine (36.9%). While the TF score is a discovering tool with the potential to be used in webometric studies to find influential publications, a few technical limitations jeopardise the general applicability of this indicator for research evaluation at the publication level.
Affiliation(s)
- Enrique Orduña-Malea
- Department of Audiovisual Communication, Documentation and History of Art, Universitat Politècnica de València, Spain
- Isidro F Aguillo
- Cybermetrics Lab, Institute of Public Goods and Policies (IPP), Spanish National Research Council (CSIC), Spain
|
4
|
Xu S, Li L, Wang C, An X, Yang G. An improved author-topic (AT) model with authorship credit allocation schemes. J Inf Sci 2022. [DOI: 10.1177/01655515221133530] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
Abstract
Authorship credit allocation schemes have attracted considerable research attention. However, no consensus about which one is the best has been attained until now, and limited evidence from practical tasks has been reported. Therefore, this study uses the author interest discovery task as a real-world task case to provide valuable insights into authorship credit allocation schemes and guidelines for further practical applications. For this purpose, a novel model, ATcredit, is proposed to strengthen the Author-Topic (AT) model with an authorship credit allocation scheme, and collapsed Gibbs sampling is used to approximate the posterior and estimate model parameters. Extensive experiments using the SynBio dataset reveal several interesting findings as follows. (a) Any scheme for allocating unequal authorship credits performs better than its equal-credit counterpart with our ATcredit model in terms of perplexity. (b) The fixed versions of four out of the six schemes work better than their flexible counterparts with our ATcredit model, regardless of the hyper-authorship strategy. (c) The variation coefficient of credit awards can serve as a criterion to decide whether the hyper-authorship strategy should be used. (d) When the number of authors in a scholarly article is less than three, the six authorship credit allocation schemes are similar to each other with our ATcredit model in terms of perplexity. (e) The harmonic counting scheme performs the best, followed by the arithmetic counting scheme, and the network-based counting scheme performs the worst with our ATcredit model in terms of perplexity. (f) The arithmetic counting scheme is similar to the harmonic counting scheme in terms of the normalised mutual information (NMI) of discovered interests, but the geometric counting scheme is different from the axiomatic and network-based counting schemes.
Affiliation(s)
- Shuo Xu
- College of Economics and Management, Beijing University of Technology, P.R. China
- Ling Li
- College of Economics and Management, Beijing University of Technology, P.R. China
- Congcong Wang
- College of Economics and Management, Beijing University of Technology, P.R. China
- Xin An
- School of Economics and Management, Beijing Forestry University, P.R. China
- Guancan Yang
- School of Information Resource Management, Renmin University of China, P.R. China
|
5
|
The quality of the web of science data: a longitudinal study on the completeness of authors-addresses links. Scientometrics 2022. [DOI: 10.1007/s11192-022-04525-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
|
6
|
An X, Zhang M, Xu S. An active learning-based approach for screening scholarly articles about the origins of SARS-CoV-2. PLoS One 2022; 17:e0273725. [PMID: 36112646 PMCID: PMC9480989 DOI: 10.1371/journal.pone.0273725] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2021] [Accepted: 08/13/2022] [Indexed: 11/17/2022] Open
Abstract
To build a full picture of previous studies on the origins of SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2), this paper exploits an active learning-based approach to screen scholarly articles about the origins of SARS-CoV-2 from many scientific publications. In more detail, six seed articles were utilized to manually curate 170 relevant articles and 300 nonrelevant articles. Then, an active learning-based approach with three query strategies and three base classifiers is trained to screen the articles about the origins of SARS-CoV-2. Extensive experimental results show that our active learning-based approach outperforms traditional counterparts, and the uncertain sampling query strategy performs best among the three strategies. By manually checking the top 1,000 articles of each base classifier, we ultimately screened 715 unique scholarly articles to create a publicly available peer-reviewed literature corpus, COVID-Origin. This indicates that our approach for screening articles about the origins of SARS-CoV-2 is feasible.
Affiliation(s)
- Xin An
- School of Economics & Management, Beijing Forestry University, Beijing, P.R. China
- Mengmeng Zhang
- School of Economics & Management, Beijing Forestry University, Beijing, P.R. China
- Shuo Xu
- College of Economics and Management, Beijing University of Technology, Beijing, P.R. China
|
7
|
Xu S, Wang C, An X, Hao L, Yang G. A novel developmental trajectory discovery approach by integrating main path analysis and intermediacy. J Inf Sci 2022. [DOI: 10.1177/01655515221101835] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/17/2022]
Abstract
As a widely used technique for discovering developmental trajectory of a specific field of science and technology, main path analysis armed with global search strategy prefers longer citation paths rather than shorter ones. An obvious feature of longer main paths is that the theme of documents may not be so coherent, though longer paths may provide more details on the development of a field than shorter ones. Thereupon, a new measure, named as intermediacy, was proposed in the literature for recognising important scientific publications. However, the intermediacy is only applicable to the citation network with one single target node and one single source node. For purpose of loosening this limitation of the intermediacy and benefitting from main path analysis and intermediacy, this work raises an alternative approach for discovering developmental trajectory by combining node importance and edge importance via edge and node integrated modes. Extensive experimental results on the weak signals and education fields indicate that similar trajectories can be obtained through these two integrated modes, and richer implications can be encoded in our discovered trajectories than those from main path analysis and intermediacy. In addition, our framework is able to scale very well to a large citation network.
Affiliation(s)
- Shuo Xu
- College of Economics and Management, Beijing University of Technology, P.R. China
- Congcong Wang
- College of Economics and Management, Beijing University of Technology, P.R. China
- Xin An
- School of Economics and Management, Beijing Forestry University, P.R. China
- Liyuan Hao
- College of Economics and Management, Beijing University of Technology, P.R. China
- Guancan Yang
- School of Information Resource Management, Renmin University of China, P.R. China
|
8
|
Cioffi A, Coppini S, Massari A, Moretti A, Peroni S, Santini C, Shahidzadeh Asadi N. Identifying and correcting invalid citations due to DOI errors in Crossref data. Scientometrics 2022. [DOI: 10.1007/s11192-022-04367-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
This work aims to identify classes of DOI mistakes by analysing the open bibliographic metadata available in Crossref, highlighting which publishers were responsible for such mistakes and how many of these incorrect DOIs could be corrected through automatic processes. By using a list of invalid cited DOIs gathered by OpenCitations while processing the OpenCitations Index of Crossref open DOI-to-DOI citations (COCI) in the past two years, we retrieved the citations in the January 2021 Crossref dump to such invalid DOIs. We processed these citations by keeping track of their validity and the publishers responsible for uploading the related citation data in Crossref. Finally, we identified patterns of factual errors in the invalid DOIs and the regular expressions needed to catch and correct them. The outcomes of this research show that only a few publishers were responsible for and/or affected by the majority of invalid citations. We extended the taxonomy of DOI name errors proposed in past studies and defined more elaborate regular expressions that can clean a higher number of mistakes in invalid DOIs than prior approaches. The data gathered in our study can enable investigating possible reasons for DOI mistakes from a qualitative point of view, helping publishers identify the problems underlying their production of invalid citation data. Also, the DOI cleaning mechanism we present could be integrated into the existing process (e.g. in COCI) to add citations by automatically correcting a wrong DOI. This study was run strictly following Open Science principles, and, as such, our research outcomes are fully reproducible.
|
9
|
Purnell PJ. The prevalence and impact of university affiliation discrepancies between four bibliographic databases – Scopus, Web of Science, Dimensions, and Microsoft Academic. Quantitative Science Studies 2022. [DOI: 10.1162/qss_a_00175] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/04/2022] Open
Abstract
Research managers benchmarking universities against international peers face the problem of affiliation disambiguation. Different databases have taken separate approaches to this problem and discrepancies exist between them. Bibliometric data sources typically conduct a disambiguation process that unifies variant institutional names and those of its sub-units so that researchers can then search all records from that institution using a single unified name. This study examined affiliation discrepancies between Scopus, Web of Science, Dimensions, and Microsoft Academic for 18 Arab universities over a five-year period. We confirmed that digital object identifiers (DOIs) are suitable for extracting comparable scholarly material across databases and quantified the affiliation discrepancies between them. A substantial share of records assigned to the selected universities in any one database were not assigned to the same university in another. The share of discrepancy was higher in the larger databases, Dimensions and Microsoft Academic. The smaller, more selective databases, Scopus and especially Web of Science tended to agree to a greater degree with affiliations in the other databases. Manual examination of affiliation discrepancies showed they were caused by a mixture of missing affiliations, unification differences, and assignation of records to the wrong institution.
Peer Review
https://publons.com/publon/10.1162/qss_a_00175
Affiliation(s)
- Philip J. Purnell
- Centre for Science and Technology Studies, Leiden University, P.O. Box 905, 2300 AX Leiden, The Netherlands
- United Arab Emirates University, Al Ain, UAE
|
10
|
Xu S, Li L, An X, Hao L, Yang G. An approach for detecting the commonality and specialty between scientific publications and patents. Scientometrics 2021. [DOI: 10.1007/s11192-021-04085-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/30/2022]
|
11
|
Do scientific publications by editorial board members have shorter publication delays and then higher influence? Scientometrics 2021. [DOI: 10.1007/s11192-021-04067-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/25/2022]
|
12
|
Liu J. Digital Object Identifier (DOI) and DOI Services: An Overview. Libri 2021. [DOI: 10.1515/libri-2020-0018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
On the anniversary of the establishment of the two biggest Digital Object Identifier (DOI) registration agencies in the world, Crossref and DataCite, the paper intends to provide an overview of the development of and approaches to DOI and DOI services, from which scholarly communication has benefited greatly. First, the author explores the initiation of DOI and the differences between DOI and other persistent identifiers. After that, DOIs for different kinds of objects and DOIs’ value in enhancing scholarly communication are discussed. Then, in the second part, DOI services at different levels in a pyramid and those particularly in Germany are described, and the active involvement of the library world is also introduced. Finally, the current situation and prospects, as well as some issues dealing with DOIs and DOI services, are investigated in the last part of the paper.
Affiliation(s)
- Jia Liu
- Department of Research and Development, University and City Library of Cologne, Universitaetsstr. 33, 50931 Cologne, North Rhine-Westphalia, Germany
|
13
|
Same journal but different numbers of published records indexed in Scopus and Web of Science Core Collection: causes, consequences, and solutions. Scientometrics 2021. [DOI: 10.1007/s11192-021-03934-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 02/02/2023]
|
14
|
Web of Science (WoS) and Scopus: The Titans of Bibliographic Information in Today’s Academic World. Publications 2021. [DOI: 10.3390/publications9010012] [Citation(s) in RCA: 133] [Impact Index Per Article: 44.3] [Indexed: 12/22/2022] Open
Abstract
Nowadays, the importance of bibliographic databases (DBs) has increased enormously, as they are the main providers of publication metadata and bibliometric indicators universally used both for research assessment practices and for performing daily tasks. Because the reliability of these tasks firstly depends on the data source, all users of the DBs should be able to choose the most suitable one. Web of Science (WoS) and Scopus are the two main bibliographic DBs. The comprehensive evaluation of the DBs’ coverage is practically impossible without extensive bibliometric analyses or literature reviews, but most DBs users do not have bibliometric competence and/or are not willing to invest additional time for such evaluations. Apart from that, the convenience of the DB’s interface, performance, provided impact indicators and additional tools may also influence the users’ choice. The main goal of this work is to provide all of the potential users with an all-inclusive description of the two main bibliographic DBs by gathering the findings that are presented in the most recent literature and information provided by the owners of the DBs at one place. This overview should aid all stakeholders employing publication and citation data in selecting the most suitable DB.
|
15
|
What academic mobility configurations contribute to high performance: an fsQCA analysis of CSC-funded visiting scholars. Scientometrics 2020. [DOI: 10.1007/s11192-020-03783-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/26/2022]
|
16
|
Pech G, Delgado C. Assessing the publication impact using citation data from both Scopus and WoS databases: an approach validated in 15 research fields. Scientometrics 2020. [DOI: 10.1007/s11192-020-03660-w] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 11/29/2022]
|
17
|
|
18
|
Xu S, Hao L, An X, Yang G, Wang F. Emerging research topics detection with multiple machine learning models. J Informetr 2019. [DOI: 10.1016/j.joi.2019.100983] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 10/25/2022]
|
19
|
Wang F, Jia C, Wang X, Liu J, Xu S, Liu Y, Yang C. Exploring all-author tripartite citation networks: A case study of gene editing. J Informetr 2019. [DOI: 10.1016/j.joi.2019.08.002] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 11/28/2022]
|