1
Lyutov A, Uygun Y, Hütt MT. Machine learning misclassification networks reveal a citation advantage of interdisciplinary publications only in high-impact journals. Sci Rep 2024; 14:21906. [PMID: 39300204 DOI: 10.1038/s41598-024-72364-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2024] [Accepted: 09/06/2024] [Indexed: 09/22/2024] Open
Abstract
Given a large enough volume of data and precise, meaningful categories, training a statistical model to solve a classification problem is straightforward and has become a standard application of machine learning (ML). If the categories are not precise, but rather fuzzy, as in the case of scientific disciplines, the systematic failures of ML classification can be informative about properties of the underlying categories. Here we classify a large volume of academic publications using only the abstract as information. From the publications that are classified differently by journal categories and ML categories (i.e., misclassified publications, when using the journal assignment as ground truth) we construct a network among disciplines. Analysis of these misclassifications provides insight into two topics at the core of the science of science: (1) Mapping out the interplay of disciplines. We show that this misclassification network is informative about the interplay of academic disciplines and that it is similar to, but distinct from, a citation-based map of science, where nodes are scientific disciplines and an edge indicates a strong co-citation count between publications in these disciplines. (2) Analyzing the success of interdisciplinarity. By evaluating the citation patterns of publications, we show that misclassification can be linked to interdisciplinarity and, furthermore, that misclassified articles have different citation frequencies than correctly classified articles: In the highest 10 percent of journals in each discipline, these misclassified articles are on average cited more frequently, while in the rest of the journals they are cited less frequently.
Affiliation(s)
- Alexey Lyutov
- School of Business, Social and Decision Science, Constructor University, 28759, Bremen, Germany
- Yilmaz Uygun
- School of Business, Social and Decision Science, Constructor University, 28759, Bremen, Germany
2
Montesinos-López A, Gutiérrez-Pulido H, Ramos-Pulido S, Montesinos-López JC, Montesinos-López OA, Crossa J. Bayesian discrete lognormal regression model for genomic prediction. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2024; 137:21. [PMID: 38221602 DOI: 10.1007/s00122-023-04526-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2023] [Accepted: 12/11/2023] [Indexed: 01/16/2024]
Abstract
KEY MESSAGE: Genomic prediction models for quantitative traits assume continuous and normally distributed phenotypes. In this research, we propose a novel Bayesian discrete lognormal regression model. Genomic selection is a powerful tool in modern breeding programs that uses genomic information to predict the performance of individuals and select those with desirable traits. It has revolutionized animal and plant breeding, as it allows breeders to identify the best candidates without labor-intensive and time-consuming phenotypic evaluations. While several statistical models have been developed, most of them have been for quantitative continuous traits and only a few for count responses. In this paper, we propose a discrete lognormal regression model in the Bayesian context, with a Gibbs sampler to explore the corresponding posterior distribution and make the predictions. Two datasets on disease resistance in wheat are used to evaluate the model against the traditional Gaussian model and a lognormal model. The results indicate that the proposed model is a competitive and natural model for predicting count genomic traits.
Affiliation(s)
- Abelardo Montesinos-López
- Departamento de Matemáticas, Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, C. P. 44430, Guadalajara, Jalisco, México
- Humberto Gutiérrez-Pulido
- Departamento de Matemáticas, Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, C. P. 44430, Guadalajara, Jalisco, México
- Sofía Ramos-Pulido
- Departamento de Matemáticas, Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, C. P. 44430, Guadalajara, Jalisco, México
- José Crossa
- International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz Km. 45, El Batán, C. P. 56237, Texcoco, Edo. de México, México.
- Colegio de Postgraduados, C. P. 56230, Montecillos, Edo. de México, México.
- Centre for Crop & Food Innovation, Food Futures Institute, Murdoch University, Murdoch, 6150, Australia.
3
Ke Q, Gates AJ, Barabási AL. A network-based normalized impact measure reveals successful periods of scientific discovery across discipline. Proc Natl Acad Sci U S A 2023; 120:e2309378120. [PMID: 37983494 PMCID: PMC10691329 DOI: 10.1073/pnas.2309378120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/04/2023] [Accepted: 10/19/2023] [Indexed: 11/22/2023] Open
Abstract
The impact of a scientific publication is often measured by the number of citations it receives from the scientific community. However, citation count is susceptible to well-documented variations in citation practices across time and discipline, limiting our ability to compare different scientific achievements. Previous efforts to account for citation variations often rely on a priori discipline labels of papers, assuming that all papers in a discipline are identical in their subject matter. Here, we propose a network-based methodology to quantify the impact of an article by comparing it with locally comparable research, thereby eliminating the discipline label requirement. We show that the developed measure is not susceptible to discipline bias and follows a universal distribution for all articles published in different years, offering an unbiased indicator for impact across time and discipline. We then use the indicator to identify science-wide high impact research in the past half century and quantify its temporal production dynamics across disciplines, helping us identify breakthroughs from diverse, smaller disciplines, such as geosciences, radiology, and optics, as opposed to citation-rich biomedical sciences. Our work provides insights into the evolution of science and paves a way for fair comparisons of the impact of diverse contributions across many fields.
Affiliation(s)
- Qing Ke
- School of Data Science, City University of Hong Kong, Hong Kong, China
- Alexander J. Gates
- School of Data Science, University of Virginia, Charlottesville, VA 22904
- Albert-László Barabási
- Network Science Institute, Northeastern University, Boston, MA 02115
- Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115
- Department of Network and Data Science, Central European University, Budapest 1051, Hungary
4
Krauss A, Danús L, Sales-Pardo M. Early-career factors largely determine the future impact of prominent researchers: evidence across eight scientific fields. Sci Rep 2023; 13:18794. [PMID: 37914796 PMCID: PMC10620415 DOI: 10.1038/s41598-023-46050-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2023] [Accepted: 10/26/2023] [Indexed: 11/03/2023] Open
Abstract
Can we help predict the future impact of researchers using early-career factors? We analyze early-career factors of the world's 100 most prominent researchers across 8 scientific fields and identify four key drivers in researchers' initial career: working at a top 25 ranked university, publishing a paper in a top 5 ranked journal, publishing most papers in top quartile (high-impact) journals and co-authoring with other prominent researchers in their field. We find that over 95% of prominent researchers across multiple fields had at least one of these four features in the first 5 years of their career. We find that the most prominent scientists who had an early career advantage in terms of citations and h-index are more likely to have had all four features, and that this advantage persists throughout their career after 10, 15 and 20 years. Our findings show that these few early-career factors help predict researchers' impact later in their careers. Our research thus points to the need to enhance fairness and career mobility among scientists who have not had a jump start early on.
Affiliation(s)
- Alexander Krauss
- London School of Economics, London, UK.
- Institute for Economic Analysis, Spanish National Research Council, Barcelona, Spain.
- Lluís Danús
- Department of Chemical Engineering, Universitat Rovira i Virgili, Tarragona, Spain.
- Marta Sales-Pardo
- Department of Chemical Engineering, Universitat Rovira i Virgili, Tarragona, Spain.
5
Nurrochman A, Junianto E, Korda AA, Prawara B, Basuki EA. Research hotspots and future trends of hot corrosion research: a bibliometric analysis. RSC Adv 2023; 13:29904-29922. [PMID: 37842671 PMCID: PMC10574801 DOI: 10.1039/d3ra04628a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Accepted: 09/25/2023] [Indexed: 10/17/2023] Open
Abstract
Hot corrosion has attracted researchers because the complexity of its mechanisms poses a critical challenge for advancing energy efficiency. The literature on hot corrosion spans a wide range of discussions on materials, including metals and non-metals, and on operating environmental conditions. Hence it has been difficult to gain an overview of the current status and future trends of hot corrosion research. Here we pioneer a bibliometric analysis to identify the research hotspots and the likely future direction of hot corrosion studies. The results show that at least six research hotspots can be derived after carefully classifying the hot corrosion literature by its discussion and key findings. Some hotspots have been inactive in recent years, which complicates the prediction of research directions. Nevertheless, several future trends of hot corrosion research are suggested. This study provides useful ideas to guide the development of hot corrosion research.
Affiliation(s)
- Andrieanto Nurrochman
- Metallurgical Engineering Research Group, Faculty of Mining and Petroleum Engineering, Institut Teknologi Bandung Jl. Ganesha, 10 Bandung 40132 Jawa Barat Indonesia
- Endro Junianto
- Research Center for Advanced Material, National Research and Innovation Agency (BRIN) Serpong 15314 Indonesia
- Akhmad Ardian Korda
- Metallurgical Engineering Research Group, Faculty of Mining and Petroleum Engineering, Institut Teknologi Bandung Jl. Ganesha, 10 Bandung 40132 Jawa Barat Indonesia
- Budi Prawara
- Research Center for Advanced Material, National Research and Innovation Agency (BRIN) Serpong 15314 Indonesia
- Eddy Agus Basuki
- Metallurgical Engineering Research Group, Faculty of Mining and Petroleum Engineering, Institut Teknologi Bandung Jl. Ganesha, 10 Bandung 40132 Jawa Barat Indonesia
6
Abstract
Underwater photosynthesis is the most important metabolic activity for submerged plants since it could utilize carbon fixation to replenish lost carbohydrates and improve internal aeration by producing O2. The present study used bibliometric methods to quantify the annual number of publications related to underwater photosynthesis. CiteSpace, as a visual analytic software for the literature, was employed to analyze the distribution of the subject categories, author collaborations, institution collaborations, international (regional) collaborations, and cocitation and keyword burst. The results show the basic characteristics of the literature, the main intellectual base, and the main research powers of underwater photosynthesis. Meanwhile, this paper revealed the research hotspots and trends of this field. This study provides an objective and comprehensive analysis of underwater photosynthesis from a bibliometric perspective. It is expected to provide reference information for scholars in related fields to refine the research direction, solve specific scientific problems, and assist scholars in seeking/establishing relevant collaborations in their areas of interest.
7
Abstract
The world’s largest community of scientists disintegrated following the dissolution of the Soviet Union. With extremely scarce resources and limited academic freedom as starting points, researchers in this region have been creating new knowledge; they have been building on rich scientific traditions in selected disciplines and, at times, paving new paths in non-traditional disciplines. At present, the cumulative contribution of post-Soviet countries to global research output is only three percent, indicating that these countries are not key players on the global research scene. This study uses bibliometric methods to offer novel empirical insight into the quantity and impact of academic publications; it also looks at the quality of journals in which the output is published. The findings reveal that fifteen post-Soviet countries differ considerably in terms of how much they have prioritised research, as well as the quantity, quality, and impact of their publications. The research productivity across the region has not been high and, taken together, these countries have produced publications of considerably lower quality and lower impact when viewed in the context of global research output. At the same time, researchers from post-Soviet countries tap into international collaborative networks actively, resulting in an exceptionally large proportion of publications from this region being internationally co-authored. In the historical context of Soviet research being known as one of the least collaborative globally, this finding indicates that researchers in the region are attractive to international collaborators and may be seeking such partnerships due to relatively modest research capacity at home.
8
Lyutov A, Uygun Y, Hütt MT. Machine learning misclassification of academic publications reveals non-trivial interdependencies of scientific disciplines. Scientometrics 2020. [DOI: 10.1007/s11192-020-03789-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/30/2022]
Abstract
Exploring the production of knowledge with quantitative methods is the foundation of scientometrics. In an application of machine learning to scientometrics, we here consider the classification problem of mapping academic publications to the subcategories of a multidisciplinary journal, and hence to scientific disciplines, based on the information contained in the abstract. In contrast to standard classification tasks, we are not interested in maximizing the accuracy; rather, we ask whether the failures of an automatic classification are systematic and contain information about the system under investigation. These failures can be represented as a ‘misclassification network’ inter-relating scientific disciplines. Here we show that this misclassification network (1) gives a markedly different pattern of interdependencies among scientific disciplines than common ‘maps of science’, (2) reveals a statistical association between misclassification and citation frequencies, and (3) allows disciplines to be classified as ‘method lenders’ and ‘content explorers’, based on their in-degree/out-degree asymmetry. On a more general level, in a wide range of machine learning applications misclassification networks have the potential to extract systemic information from the failed classifications, thus making it possible to visualize and quantitatively assess those aspects of a complex system that are not machine learnable.
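As a rough illustration of the idea in the abstract above, a misclassification network can be sketched from paired labels. This is a minimal sketch and not the authors' implementation: the function names, the toy labels, the edge direction (journal category to predicted category) and the `min_count` threshold are all assumptions made for illustration.

```python
from collections import Counter


def misclassification_network(true_labels, predicted_labels, min_count=1):
    """Count directed edges (journal category -> predicted category)
    over items whose predicted label differs from the journal label.
    The edge direction and min_count threshold are illustrative assumptions."""
    edges = Counter(
        (t, p) for t, p in zip(true_labels, predicted_labels) if t != p
    )
    return {edge: n for edge, n in edges.items() if n >= min_count}


def degree_asymmetry(edges):
    """Return weighted in-degree minus weighted out-degree per node."""
    in_deg, out_deg = Counter(), Counter()
    for (src, dst), weight in edges.items():
        out_deg[src] += weight
        in_deg[dst] += weight
    nodes = set(in_deg) | set(out_deg)
    return {node: in_deg[node] - out_deg[node] for node in nodes}


# Toy data (hypothetical): journal-assigned vs. classifier-assigned disciplines.
journal = ["physics", "physics", "biology", "biology", "chemistry", "biology"]
predicted = ["physics", "math", "math", "biology", "physics", "chemistry"]

edges = misclassification_network(journal, predicted)
asym = degree_asymmetry(edges)
```

Here `asym` holds the weighted in-degree minus out-degree for each discipline. How such an asymmetry maps onto the ‘method lender’ and ‘content explorer’ roles is defined in the paper, not in this sketch.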
9
Köseoglu MA, Parnell JA, Yick MYY. Identifying influential studies and maturity level in intellectual structure of fields: evidence from strategic management. Scientometrics 2020. [DOI: 10.1007/s11192-020-03776-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 10/22/2022]
10
Abstract
The present study used bibliometric methods to analyze the literature regarding the biochar effects on soil that are included in the Web of Science Core Collection database and quantified the annual number of publications in the field and distribution of publications. Using CiteSpace as a visual analytic software for the literature, the distribution of the subject categories, author collaborations, institution collaborations, international (regional) collaborations, and cocitation and keyword clustering were analyzed. The results showed the basic characteristics of the literature related to the effects of biochar on soil. Furthermore, the main research powers in this field were identified. Then, we recognized the main intellectual base in the domain of biochar effects on soil. Meanwhile, this paper revealed the research hotspots and trends of this field. Furthermore, focuses of future research in this field are discussed. The present study quantitatively and objectively describes the research status and trends of biochar effects on soil from the bibliometric perspective to promote in-depth research in this field and provide reference information for scholars in the relevant fields to refine their research directions, address specific scientific issues, and help scholars to seek/establish relevant collaborations in their fields of interests.
11
Towards a More Realistic Citation Model: The Key Role of Research Team Sizes. ENTROPY 2020; 22:e22080875. [PMID: 33286646 PMCID: PMC7517479 DOI: 10.3390/e22080875] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/09/2020] [Revised: 08/06/2020] [Accepted: 08/08/2020] [Indexed: 11/24/2022]
Abstract
We propose a new citation model which builds on the existing models that explicitly or implicitly include “direct” and “indirect” (learning about a cited paper’s existence from references in another paper) citation mechanisms. Our model departs from the usual, unrealistic assumption of uniform probability of direct citation, in which initial differences in citation arise purely randomly. Instead, we demonstrate that a two-mechanism model in which the probability of direct citation is proportional to the number of authors on a paper (team size) is able to reproduce the empirical citation distributions of articles published in the field of astronomy remarkably well, and at different points in time. Our interpretation of the model is that the intrinsic citation capacity, and hence the initial visibility of a paper, will be enhanced when more people are intimately familiar with some work, favoring papers from larger teams. While the intrinsic citation capacity cannot depend only on the team size, our model demonstrates that it must be to some degree correlated with it, and distributed in a similar way, i.e., having a power-law tail. Consequently, our team-size model qualitatively explains the existence of a correlation between the number of citations and the number of authors on a paper.
12
Silva FN, Tandon A, Amancio DR, Flammini A, Menczer F, Milojević S, Fortunato S. Recency predicts bursts in the evolution of author citations. QUANTITATIVE SCIENCE STUDIES 2020. [DOI: 10.1162/qss_a_00070] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/04/2022] Open
Abstract
The citations process for scientific papers has been studied extensively. But while the citations accrued by authors are the sum of the citations of their papers, translating the dynamics of citation accumulation from the paper to the author level is not trivial. Here we conduct a systematic study of the evolution of author citations, and in particular their bursty dynamics. We find empirical evidence of a correlation between the number of citations most recently accrued by an author and the number of citations they receive in the future. Using a simple model where the probability for an author to receive new citations depends only on the number of citations collected in the previous 12–24 months, we are able to reproduce both the citation and burst size distributions of authors across multiple decades.
Affiliation(s)
- Aditya Tandon
- Center for Complex Networks and Systems Research, Luddy School of Informatics, Computing and Engineering, Indiana University, Bloomington, USA
- Diego Raphael Amancio
- Institute of Mathematics and Computer Science, University of São Paulo, São Carlos, Brazil
- Alessandro Flammini
- Indiana University Network Science Institute, Indiana University, Bloomington, USA
- Center for Complex Networks and Systems Research, Luddy School of Informatics, Computing and Engineering, Indiana University, Bloomington, USA
- Filippo Menczer
- Indiana University Network Science Institute, Indiana University, Bloomington, USA
- Center for Complex Networks and Systems Research, Luddy School of Informatics, Computing and Engineering, Indiana University, Bloomington, USA
- Staša Milojević
- Center for Complex Networks and Systems Research, Luddy School of Informatics, Computing and Engineering, Indiana University, Bloomington, USA
- Santo Fortunato
- Indiana University Network Science Institute, Indiana University, Bloomington, USA
- Center for Complex Networks and Systems Research, Luddy School of Informatics, Computing and Engineering, Indiana University, Bloomington, USA
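The recency mechanism summarized in the abstract of this entry (Silva et al.) can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the authors' model: the function name, the per-step citation count, the `baseline` offset (added so that authors with no recent citations can still be cited), and the treatment of one step as one month are all hypothetical choices.

```python
import random
from collections import deque


def simulate_recency(n_authors=20, n_steps=120, window=18,
                     cites_per_step=3, baseline=1.0, seed=0):
    """Toy recency model: at each step (assumed to be one month), new
    citations go to authors with probability proportional to
    baseline + citations received in the last `window` steps."""
    rng = random.Random(seed)
    # Per-author sliding window of recent citation counts.
    recent = [deque([0] * window, maxlen=window) for _ in range(n_authors)]
    totals = [0] * n_authors
    for _ in range(n_steps):
        weights = [baseline + sum(r) for r in recent]
        winners = rng.choices(range(n_authors), weights=weights,
                              k=cites_per_step)
        gained = [0] * n_authors
        for author in winners:
            gained[author] += 1
        for author in range(n_authors):
            recent[author].append(gained[author])  # oldest step drops out
            totals[author] += gained[author]
    return totals


totals = simulate_recency()
```

The `window` parameter stands in for the 12–24 month range mentioned in the abstract; varying it, or the `baseline`, changes how strongly recent success feeds back into future citations.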
13
Abramo G, D’Angelo CA, Felici G. Informed peer review for publication assessments: Are improved impact measures worth the hassle? QUANTITATIVE SCIENCE STUDIES 2020. [DOI: 10.1162/qss_a_00051] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/12/2023] Open
Abstract
In this work we ask whether and to what extent applying a predictor of a publication’s impact that is better than early citations has an effect on the assessment of the research performance of individual scientists. Specifically, we measure the total impact of Italian professors in the sciences and economics over time, valuing their publications first by early citations and then by a weighted combination of early citations and the impact factor of the hosting journal. As expected, the scores and ranks of the two indicators show a very strong correlation, but significant shifts occur in many fields, mainly in economics and statistics, and mathematics and computer science. The higher the share of uncited professors in a field and the shorter the citation time window, the more recommendable is recourse to the above combination.
Affiliation(s)
- Giovanni Abramo
- Laboratory for Studies in Research Evaluation, Institute for System Analysis and Computer Science (IASI-CNR), National Research Council, Rome, Italy
- Giovanni Felici
- Institute for System Analysis and Computer Science (IASI-CNR), National Research Council, Rome, Italy
14
Abstract
Purpose
Providing an overview of types of citation curves.
Design/methodology/approach
The terms citation curves or citation graphs are made explicit.
Findings
A framework for the study of diachronous (and synchronous) citation curves is proposed.
Research limitations
No new practical applications are given.
Practical implications
This short note about citation curves will help readers to make the optimal choice for their applications.
Originality/value
A new scheme for the meaning of the term “citation curve” is designed.
15
Abstract
Most scientometricians reject the use of the journal impact factor for assessing individual articles and their authors. The well-known San Francisco Declaration on Research Assessment also strongly objects against this way of using the impact factor. Arguments against the use of the impact factor at the level of individual articles are often based on statistical considerations. The skewness of journal citation distributions typically plays a central role in these arguments. We present a theoretical analysis of statistical arguments against the use of the impact factor at the level of individual articles. Our analysis shows that these arguments do not support the conclusion that the impact factor should not be used for assessing individual articles. Using computer simulations, we demonstrate that under certain conditions the number of citations an article has received is a more accurate indicator of the value of the article than the impact factor. However, under other conditions, the impact factor is a more accurate indicator. It is important to critically discuss the dominant role of the impact factor in research evaluations, but the discussion should not be based on misplaced statistical arguments. Instead, the primary focus should be on the socio-technical implications of the use of the impact factor.
Affiliation(s)
- Ludo Waltman
- Centre for Science and Technology Studies, Leiden University, Leiden, The Netherlands
- Vincent A. Traag
- Centre for Science and Technology Studies, Leiden University, Leiden, The Netherlands
16
Waltman L, Traag VA. Use of the journal impact factor for assessing individual articles need not be statistically wrong. F1000Res 2020; 9:366. [PMID: 33796272 PMCID: PMC7974631 DOI: 10.12688/f1000research.23418.1] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Accepted: 04/17/2020] [Indexed: 07/22/2023] Open
Abstract
Most scientometricians reject the use of the journal impact factor for assessing individual articles and their authors. The well-known San Francisco Declaration on Research Assessment also strongly objects against this way of using the impact factor. Arguments against the use of the impact factor at the level of individual articles are often based on statistical considerations. The skewness of journal citation distributions typically plays a central role in these arguments. We present a theoretical analysis of statistical arguments against the use of the impact factor at the level of individual articles. Our analysis shows that these arguments do not support the conclusion that the impact factor should not be used for assessing individual articles. In fact, our computer simulations demonstrate the possibility that the impact factor is a more accurate indicator of the value of an article than the number of citations the article has received. It is important to critically discuss the dominant role of the impact factor in research evaluations, but the discussion should not be based on misplaced statistical arguments. Instead, the primary focus should be on the socio-technical implications of the use of the impact factor.
Affiliation(s)
- Ludo Waltman
- Centre for Science and Technology Studies, Leiden University, Leiden, The Netherlands
- Vincent A. Traag
- Centre for Science and Technology Studies, Leiden University, Leiden, The Netherlands
17
18
Incorporating Word Significance into Aspect-Level Sentiment Analysis. APPLIED SCIENCES-BASEL 2019. [DOI: 10.3390/app9173522] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/16/2022]
Abstract
Aspect-level sentiment analysis has drawn growing attention in recent years, with higher performance achieved through the attention mechanism. Despite this, previous research does not consider some human psychological evidence relating to language interpretation. This results in attention being paid to less significant words, especially when the aspect word is far from the relevant context word or when an important context word is found at the end of a long sentence. We design a novel model using word significance to direct attention towards the most significant words, with novelty decay and incremental interpretation factors working together as an alternative to position-based models. The interpretation factor represents the maximization of the degree to which each newly encountered word contributes to the sentiment polarity, and a counterbalancing stretched exponential novelty decay factor represents decaying human reaction as a sentence gets longer. Our findings support the hypothesis that the attention mechanism needs to be applied to the most significant words for sentiment interpretation and that novelty decay is applicable in aspect-level sentiment analysis with a decay factor β = 0.7.
19
20
Poncela-Casasnovas J, Gerlach M, Aguirre N, Amaral LAN. Large-scale analysis of micro-level citation patterns reveals nuanced selection criteria. Nat Hum Behav 2019; 3:568-575. [PMID: 30988477 DOI: 10.1038/s41562-019-0585-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 09/05/2018] [Accepted: 03/06/2019] [Indexed: 11/09/2022]
Abstract
The analysis of citations to scientific publications has become a tool that is used in the evaluation of a researcher's work, especially in the face of an ever-increasing production volume [1-6]. Despite the acknowledged shortcomings of citation analysis and the ongoing debate on the meaning of citations [7,8], citations are still primarily viewed as endorsements and as indicators of the influence of the cited reference, regardless of the context of the citation. However, only recently has attention [9,10] been given to the connection between contextual information and the success of citing and cited papers, primarily because of the lack of extensive databases that cover both types of metadata. Here we address this issue by studying the usage of citations throughout the full text of 156,558 articles published by the Public Library of Science (PLoS), and by tracing their bibliometric history from among 60 million records obtained from the Web of Science. We find universal patterns of variation in the usage of citations across paper sections [11]. Notably, we find differences in microlevel citation patterns that were dependent on the ultimate impact of the citing paper itself; publications from high-impact groups tend to cite younger references, as well as more very young and better-cited references. Our study provides a quantitative approach to addressing the long-standing issue that not all citations count the same.
Affiliation(s)
- Martin Gerlach
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA
- Nathan Aguirre
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA
- Luís A N Amaral
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA; Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL, USA; Department of Physics and Astronomy, Northwestern University, Evanston, IL, USA.
|
21
|
Abramo G, D’Angelo CA, Felici G. Predicting publication long-term impact through a combination of early citations and journal impact factor. J Informetr 2019. [DOI: 10.1016/j.joi.2018.11.003] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Indexed: 12/15/2022]
|
22
|
|
23
|
Cavinatto L, Bronson MJ, Chen DD, Moucha CS. Robotic-assisted versus standard unicompartmental knee arthroplasty: evaluation of manuscript conflict of interests, funding, scientific quality and bibliometrics. International Orthopaedics 2018; 43:1865-1871. [DOI: 10.1007/s00264-018-4175-5] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 06/15/2018] [Accepted: 09/18/2018] [Indexed: 11/29/2022]
|
24
|
Liu L, Wang Y, Sinatra R, Giles CL, Song C, Wang D. Hot streaks in artistic, cultural, and scientific careers. Nature 2018; 559:396-399. [PMID: 29995850 DOI: 10.1038/s41586-018-0315-8] [Citation(s) in RCA: 80] [Impact Index Per Article: 13.3] [Received: 10/20/2017] [Accepted: 06/04/2018] [Indexed: 11/10/2022]
Abstract
The hot streak, loosely defined as 'winning begets more winnings', highlights a specific period during which an individual's performance is substantially better than his or her typical performance. Although hot streaks have been widely debated in sports, gambling and financial markets over the past several decades, little is known about whether they apply to individual careers. Here, building on rich literature on the lifecycle of creativity, we collected large-scale career histories of individual artists, film directors and scientists, tracing the artworks, films and scientific publications they produced. We find that, across all three domains, hit works within a career show a high degree of temporal regularity, with each career being characterized by bursts of high-impact works occurring in sequence. We demonstrate that these observations can be explained by a simple hot-streak model, allowing us to probe quantitatively the hot streak phenomenon governing individual careers. We find this phenomenon to be remarkably universal across diverse domains: hot streaks are ubiquitous yet usually unique across different careers. The hot streak emerges randomly within an individual's sequence of works, is temporally localized, and is not associated with any detectable change in productivity. We show that, because works produced during hot streaks garner substantially more impact, the uncovered hot streaks fundamentally drive the collective impact of an individual, and ignoring this leads us to systematically overestimate or underestimate the future impact of a career. These results not only deepen our quantitative understanding of patterns that govern individual ingenuity and success, but also may have implications for identifying and nurturing individuals whose work will have lasting impact.
Affiliation(s)
- Lu Liu
- Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL, USA; Kellogg School of Management, Northwestern University, Evanston, IL, USA; College of Information Sciences and Technology, Pennsylvania State University, University Park, PA, USA
- Yang Wang
- Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL, USA; Kellogg School of Management, Northwestern University, Evanston, IL, USA
- Roberta Sinatra
- Department of Network and Data Science, and Department of Mathematics and its Applications, Central European University, Budapest, Hungary; Center for Complex Network Research, Northeastern University, Boston, MA, USA; Complexity Science Hub, Vienna, Austria
- C Lee Giles
- College of Information Sciences and Technology, Pennsylvania State University, University Park, PA, USA; Department of Computer Science and Engineering, Pennsylvania State University, University Park, PA, USA
- Chaoming Song
- Department of Physics, University of Miami, Coral Gables, FL, USA
- Dashun Wang
- Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL, USA; Kellogg School of Management, Northwestern University, Evanston, IL, USA; McCormick School of Engineering, Northwestern University, Evanston, IL, USA.
|
25
|
Abstract
Scientific and scholarly influence is multifaceted, shifts over time, and varies across disciplines. We present a dynamic topic model to credit documents with influence that shapes future discourse based on their content and contextual features. We trace discursive innovation in scholarship and identify the influence of particular articles along with their authors, affiliations, and journals. In collections of science, social science, and humanities research spanning over a century, our measure helps predict citations and reveals signals that recognize authors who make diverse contributions and whose contributions take longer to be appreciated, allowing us to compensate for bias in citation behavior. Assessing scholarly influence is critical for understanding the collective system of scholarship and the history of academic inquiry. Influence is multifaceted, and citations reveal only part of it. Citation counts exhibit preferential attachment and follow a rigid “news cycle” that can miss sustained and indirect forms of influence. Building on dynamic topic models that track distributional shifts in discourse over time, we introduce a variant that incorporates features, such as authorship, affiliation, and publication venue, to assess how these contexts interact with content to shape future scholarship. We perform in-depth analyses on collections of physics research (500,000 abstracts; 102 years) and scholarship generally (JSTOR repository: 2 million full-text articles; 130 years). Our measure of document influence helps predict citations and shows how outcomes, such as winning a Nobel Prize or affiliation with a highly ranked institution, boost influence. 
Analysis of citations alongside discursive influence reveals that citations tend to credit authors who persist in their fields over time and discount credit for works that are influential over many topics or are “ahead of their time.” In this way, our measures provide a way to acknowledge diverse contributions that take longer and travel farther to achieve scholarly appreciation, enabling us to correct citation biases and enhance sensitivity to the full spectrum of scholarly impact.
|
26
|
Fortunato S, Bergstrom CT, Börner K, Evans JA, Helbing D, Milojević S, Petersen AM, Radicchi F, Sinatra R, Uzzi B, Vespignani A, Waltman L, Wang D, Barabási AL. Science of science. Science 2018; 359:eaao0185. [PMID: 29496846 PMCID: PMC5949209 DOI: 10.1126/science.aao0185] [Citation(s) in RCA: 354] [Impact Index Per Article: 59.0] [Indexed: 01/17/2023]
Abstract
Identifying fundamental drivers of science and developing predictive models to capture its evolution are instrumental for the design of policies that can improve the scientific enterprise, for example, through enhanced career paths for scientists, better performance evaluation for organizations hosting research, discovery of novel effective funding vehicles, and even identification of promising regions along the scientific frontier. The science of science uses large-scale data on the production of science to search for universal and domain-specific patterns. Here, we review recent developments in this transdisciplinary field.
Affiliation(s)
- Santo Fortunato
- Center for Complex Networks and Systems Research, School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN 47408, USA.
- Indiana University Network Science Institute, Indiana University, Bloomington, IN 47408, USA
- Carl T Bergstrom
- Department of Biology, University of Washington, Seattle, WA 98195-1800, USA
- Katy Börner
- Indiana University Network Science Institute, Indiana University, Bloomington, IN 47408, USA
- Cyberinfrastructure for Network Science Center, School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN 47408, USA
- James A Evans
- Department of Sociology, University of Chicago, Chicago, IL 60637, USA
- Dirk Helbing
- Computational Social Science, ETH Zurich, Zurich, Switzerland
- Staša Milojević
- Center for Complex Networks and Systems Research, School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN 47408, USA
- Alexander M Petersen
- Ernest and Julio Gallo Management Program, School of Engineering, University of California, Merced, CA 95343, USA
- Filippo Radicchi
- Center for Complex Networks and Systems Research, School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN 47408, USA
- Roberta Sinatra
- Center for Network Science, Central European University, Budapest 1052, Hungary
- Department of Mathematics, Central European University, Budapest 1051, Hungary
- Institute for Network Science, Northeastern University, Boston, MA 02115, USA
- Brian Uzzi
- Kellogg School of Management, Northwestern University, Evanston, IL 60208, USA
- Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL 60208, USA
- Alessandro Vespignani
- Institute for Network Science, Northeastern University, Boston, MA 02115, USA
- Laboratory for the Modeling of Biological and Sociotechnical Systems, Northeastern University, Boston, MA 02115, USA
- ISI Foundation, Turin 10133, Italy
- Ludo Waltman
- Centre for Science and Technology Studies, Leiden University, Leiden, Netherlands
- Dashun Wang
- Kellogg School of Management, Northwestern University, Evanston, IL 60208, USA
- Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL 60208, USA
- Albert-László Barabási
- Center for Network Science, Central European University, Budapest 1052, Hungary.
- Institute for Network Science, Northeastern University, Boston, MA 02115, USA
- Center for Cancer Systems Biology, Dana-Farber Cancer Institute, Boston, MA 02115, USA
|
27
|
Lognormal distribution of citation counts is the reason for the relation between Impact Factors and Citation Success Index. J Informetr 2018. [DOI: 10.1016/j.joi.2017.12.007] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 11/18/2022]
|
28
|
Santangelo GM. Article-level assessment of influence and translation in biomedical research. Mol Biol Cell 2017; 28:1401-1408. [PMID: 28559438 PMCID: PMC5449139 DOI: 10.1091/mbc.e16-01-0037] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 01/13/2017] [Revised: 03/21/2017] [Accepted: 04/05/2017] [Indexed: 01/08/2023] Open
Abstract
Given the vast scale of the modern scientific enterprise, it can be difficult for scientists to make judgments about the work of others through careful analysis of the entirety of the relevant literature. This has led to a reliance on metrics that are mathematically flawed and insufficiently diverse to account for the variety of ways in which investigators contribute to scientific progress. An urgent, critical first step in solving this problem is replacing the Journal Impact Factor with an article-level alternative. The Relative Citation Ratio (RCR), a metric that was designed to serve in that capacity, measures the influence of each publication on its respective area of research. RCR can serve as one component of a multifaceted metric that provides an effective data-driven supplement to expert opinion. Developing validated methods that quantify scientific progress can help to optimize the management of research investments and accelerate the acquisition of knowledge that improves human health.
Affiliation(s)
- George M Santangelo
- Office of Portfolio Analysis, Division of Program Coordination, Planning, and Strategic Initiatives, National Institutes of Health, Bethesda, MD 20892
|
29
|
Abstract
We analyze the time evolution of statistical distributions of citations to scientific papers published in the same year. While these distributions seem to follow the power-law dependence, we find that they are nonstationary and the exponent of the power-law fit decreases with time and does not come to saturation. We attribute the nonstationarity of citation distributions to the different longevity of the low-cited and highly cited papers. By measuring citation trajectories of papers we found that citation careers of the low-cited papers come to saturation after 10-15 years while those of the highly cited papers continue to increase indefinitely: the papers that exceed some citation threshold become runaways. Thus, we show that although a citation distribution can look like a power-law dependence, it is not scale free and there is a hidden dynamic scale associated with the onset of runaways. We compare our measurements to our recently developed model of citation dynamics based on copying-redirection-triadic closure and find explanations for our empirical observations.
Affiliation(s)
- Michael Golosovsky
- The Racah Institute of Physics, The Hebrew University of Jerusalem, 9190401 Jerusalem, Israel
|
30
|
|
31
|
|
32
|
Milojević S, Radicchi F, Bar-Ilan J. Citation success index − An intuitive pair-wise journal comparison metric. J Informetr 2017. [DOI: 10.1016/j.joi.2016.12.006] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Indexed: 10/20/2022]
|
33
|
Golosovsky M, Solomon S. Growing complex network of citations of scientific papers: Modeling and measurements. Phys Rev E 2017; 95:012324. [PMID: 28208427 DOI: 10.1103/physreve.95.012324] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 05/04/2016] [Indexed: 11/07/2022]
Abstract
We consider the network of citations of scientific papers and use a combination of the theoretical and experimental tools to uncover microscopic details of this network growth. Namely, we develop a stochastic model of citation dynamics based on the copying-redirection-triadic closure mechanism. In a complementary and coherent way, the model accounts both for statistics of references of scientific papers and for their citation dynamics. Originating in empirical measurements, the model is cast in such a way that it can be verified quantitatively in every aspect. Such validation is performed by measuring citation dynamics of physics papers. The measurements revealed nonlinear citation dynamics, the nonlinearity being intricately related to network topology. The nonlinearity has far-reaching consequences including nonstationary citation distributions, diverging citation trajectories of similar papers, runaways or "immortal papers" with infinite citation lifetime, etc. Thus nonlinearity in complex network growth is our most important finding. In a more specific context, our results can be a basis for quantitative probabilistic prediction of citation dynamics of individual papers and of the journal impact factor.
Affiliation(s)
- Michael Golosovsky
- The Racah Institute of Physics, The Hebrew University of Jerusalem, 91904 Jerusalem, Israel
- Sorin Solomon
- The Racah Institute of Physics, The Hebrew University of Jerusalem, 91904 Jerusalem, Israel
|
34
|
Relative Citation Ratio (RCR): A New Metric That Uses Citation Rates to Measure Influence at the Article Level. PLoS Biol 2016; 14:e1002541. [PMID: 27599104 PMCID: PMC5012559 DOI: 10.1371/journal.pbio.1002541] [Citation(s) in RCA: 266] [Impact Index Per Article: 33.3] [Received: 12/07/2015] [Accepted: 08/01/2016] [Indexed: 11/19/2022] Open
Abstract
Despite their recognized limitations, bibliometric assessments of scientific productivity have been widely adopted. We describe here an improved method to quantify the influence of a research article by making novel use of its co-citation network to field-normalize the number of citations it has received. Article citation rates are divided by an expected citation rate that is derived from performance of articles in the same field and benchmarked to a peer comparison group. The resulting Relative Citation Ratio is article level and field independent and provides an alternative to the invalid practice of using journal impact factors to identify influential papers. To illustrate one application of our method, we analyzed 88,835 articles published between 2003 and 2010 and found that the National Institutes of Health awardees who authored those papers occupy relatively stable positions of influence across all disciplines. We demonstrate that the values generated by this method strongly correlate with the opinions of subject matter experts in biomedical research and suggest that the same approach should be generally applicable to articles published in all areas of science. A beta version of iCite, our web tool for calculating Relative Citation Ratios of articles listed in PubMed, is available at https://icite.od.nih.gov.
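The core ratio described in this abstract (an article's citation rate divided by an expected citation rate for its field) can be sketched as follows. This is only the final division step: how the expected rate is derived from the co-citation network and benchmarked to a peer group is not specified here, so `expected_citations_per_year` is simply taken as a given input, and all names and numbers are hypothetical.

```python
def article_citation_rate(total_citations: int, years_since_publication: float) -> float:
    """Citations per year for a single article."""
    if years_since_publication <= 0:
        raise ValueError("years_since_publication must be positive")
    return total_citations / years_since_publication


def relative_citation_ratio(article_rate: float, expected_citations_per_year: float) -> float:
    """Article rate divided by a field-level expected rate (assumed to be supplied)."""
    if expected_citations_per_year <= 0:
        raise ValueError("expected_citations_per_year must be positive")
    return article_rate / expected_citations_per_year


# Hypothetical article: 30 citations over 5 years, in a field where 3 per year is expected.
rate = article_citation_rate(30, 5)
ratio = relative_citation_ratio(rate, 3.0)
print(f"citation rate = {rate:.1f}/year, ratio = {ratio:.1f}")
```

A ratio above 1 would mean the article is cited more often than the field benchmark; the quality of the result depends entirely on how the expected rate is estimated.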
|
35
|
Multiple Citation Indicators and Their Composite across Scientific Disciplines. PLoS Biol 2016; 14:e1002501. [PMID: 27367269 PMCID: PMC4930269 DOI: 10.1371/journal.pbio.1002501] [Citation(s) in RCA: 49] [Impact Index Per Article: 6.1] [Received: 02/23/2016] [Accepted: 06/02/2016] [Indexed: 11/19/2022] Open
Abstract
Many fields face an increasing prevalence of multi-authorship, and this poses challenges in assessing citation metrics. Here, we explore multiple citation indicators that address total impact (number of citations, Hirsch H index [H]), co-authorship adjustment (Schreiber Hm index [Hm]), and author order (total citations to papers as single; single or first; or single, first, or last author). We demonstrate the correlation patterns between these indicators across 84,116 scientists (those among the top 30,000 for impact in a single year [2013] in at least one of these indicators) and separately across 12 scientific fields. Correlation patterns vary across these 12 fields. In physics, total citations are highly negatively correlated with indicators of co-authorship adjustment and of author order, while in other sciences the negative correlation is seen only for total citation impact and citations to papers as single author. We propose a composite score that sums standardized values of these six log-transformed indicators. Of the 1,000 top-ranked scientists with the composite score, only 322 are in the top 1,000 based on total citations. Many Nobel laureates and other extremely influential scientists rank among the top-1,000 with the composite indicator, but would rank much lower based on total citations. Conversely, many of the top 1,000 authors on total citations have had no single/first/last-authored cited paper. More Nobel laureates of 2011–2015 are among the top authors when authors are ranked by the composite score than by total citations, H index, or Hm index; 40/47 of these laureates are among the top 30,000 by at least one of the six indicators. We also explore the sensitivity of indicators to self-citation and alphabetic ordering of authors in papers across different scientific fields. 
Multiple indicators and their composite may give a more comprehensive picture of impact, although no citation indicator, single or composite, can be expected to select all the best scientists. Citation indicators addressing total impact, co-authorship, and author positions offer complementary insights about impact. This article shows that a composite score including six citation indicators identifies extremely influential scientists better than single indicators. Multiple citation indicators are used in science and scientific evaluation. With an increasing proportion of papers co-authored by many researchers, it is important to account for the relative contributions of different co-authors. We explored multiple citation indicators that address total impact, co-authorship adjustment, and author order (in particular, single, first, or last position authorships, since these positions suggest pivotal contributions to the work). We evaluated the top 30,000 scientists in 2013 based on each of six citation indicators (84,116 total scientists assessed) and also developed a composite score that combines the six indicators. Different scientists populated the top ranks when different indicators were used. Many Nobel laureates and other influential scientists rank among the top 1,000 with the composite indicator, but rank much lower based on total citations. Conversely, many of the top 1,000 authors on total citations had no single/first/last-authored cited paper. More Nobel laureates are among the top authors when authors are ranked by the composite score than by single indicators. Multiple indicators and their composite give a more comprehensive picture of impact, although no method can pick all the best scientists.
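A composite score that "sums standardized values of six log-transformed indicators" can be sketched as below. The abstract does not state the standardization, so this example assumes one plausible choice: each indicator is transformed as ln(1 + x) and divided by the largest transformed value among the scientists considered. The indicator names and the three scientists' values are invented for illustration.

```python
import math


def composite_scores(indicators: dict[str, list[float]]) -> list[float]:
    """Sum, over indicators, of ln(1 + x) scaled by the maximum of ln(1 + x).

    The scaling-by-maximum step is an assumed standardization, not taken from the abstract.
    """
    n = len(next(iter(indicators.values())))
    totals = [0.0] * n
    for values in indicators.values():
        logged = [math.log(1.0 + v) for v in values]
        top = max(logged)
        if top == 0.0:
            continue  # every scientist has 0 on this indicator; it adds nothing
        for i, x in enumerate(logged):
            totals[i] += x / top
    return totals


# Hypothetical data for three scientists (A, B, C).
indicators = {
    "total_citations": [8000, 20000, 1500],
    "h_index": [45, 60, 20],
    "hm_index": [30, 12, 15],
    "citations_single": [400, 0, 150],
    "citations_single_first": [2500, 0, 600],
    "citations_single_first_last": [6000, 0, 1200],
}
scores = composite_scores(indicators)
for name, s in zip("ABC", scores):
    print(f"scientist {name}: composite {s:.2f}")
```

In this toy data scientist B leads on total citations but has no citations as single, first, or last author, and so ends up with the lowest composite, which mirrors the kind of re-ranking the abstract reports.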
|
36
|
|
37
|
Using Monte Carlo simulations to assess the impact of author name disambiguation quality on different bibliometric analyses. Scientometrics 2016. [DOI: 10.1007/s11192-016-1892-7] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Indexed: 10/22/2022]
|
38
|
Diallo SY, Lynch CJ, Gore R, Padilla JJ. Identifying key papers within a journal via network centrality measures. Scientometrics 2016; 107:1005-1020. [PMID: 32214550 PMCID: PMC7088853 DOI: 10.1007/s11192-016-1891-8] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 06/15/2015] [Indexed: 11/24/2022]
Abstract
This article examines the extent to which existing network centrality measures can be used (1) as filters to identify a set of papers to start reading within a journal and (2) as article-level metrics to identify the relative importance of a paper within a journal. We represent a dataset of published papers in the Public Library of Science (PLOS) via a co-citation network and compute three established centrality metrics for each paper in the network: closeness, betweenness, and eigenvector. Our results show that the network of papers in a journal is scale-free and that eigenvector centrality (1) is an effective filter and article-level metric and (2) correlates well with citation counts within a given journal. However, closeness centrality is a poor filter because articles fit within a small range of citations. We also show that betweenness centrality is a poor filter for journals with a narrow focus and a good filter for multidisciplinary journals where communities of papers can be identified.
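To make the eigenvector-centrality filter in this abstract concrete, here is a minimal power-iteration sketch on a small symmetric co-citation matrix. The matrix values are invented, and the identity shift (iterating with A + I rather than A) is a convergence convenience chosen for this example, not something the abstract describes.

```python
import math


def eigenvector_centrality(adj: list[list[float]], max_iter: int = 500, tol: float = 1e-10) -> list[float]:
    """Power iteration on (A + I); returns a unit-length, non-negative centrality vector."""
    n = len(adj)
    x = [1.0] * n
    for _ in range(max_iter):
        # x + A x: the shift keeps the dominant eigenvector but avoids oscillation
        # on bipartite-like graphs.
        nxt = [x[i] + sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in nxt))
        nxt = [v / norm for v in nxt]
        if max(abs(a - b) for a, b in zip(nxt, x)) < tol:
            return nxt
        x = nxt
    return x


# Hypothetical co-citation counts among four papers (symmetric, zero diagonal).
cocitation = [
    [0, 3, 2, 1],
    [3, 0, 1, 0],
    [2, 1, 0, 0],
    [1, 0, 0, 0],
]
centrality = eigenvector_centrality(cocitation)
ranking = sorted(range(len(centrality)), key=lambda i: centrality[i], reverse=True)
print("papers ordered by eigenvector centrality:", ranking)
```

Sorting papers by this score gives the kind of "where to start reading" ordering the abstract has in mind; on a real journal network a graph library would normally be used instead of a hand-rolled loop.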
Affiliation(s)
- Saikou Y Diallo
- Virginia Modeling Analysis and Simulation Center, Old Dominion University, 1030 University Boulevard, Suffolk, VA 23435 USA
- Christopher J Lynch
- Virginia Modeling Analysis and Simulation Center, Old Dominion University, 1030 University Boulevard, Suffolk, VA 23435 USA
- Ross Gore
- Virginia Modeling Analysis and Simulation Center, Old Dominion University, 1030 University Boulevard, Suffolk, VA 23435 USA
- Jose J Padilla
- Virginia Modeling Analysis and Simulation Center, Old Dominion University, 1030 University Boulevard, Suffolk, VA 23435 USA
|
39
|
|
40
|
Moreira JAG, Zeng XHT, Amaral LAN. The Distribution of the Asymptotic Number of Citations to Sets of Publications by a Researcher or from an Academic Department Are Consistent with a Discrete Lognormal Model. PLoS One 2015; 10:e0143108. [PMID: 26571133 PMCID: PMC4646658 DOI: 10.1371/journal.pone.0143108] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 08/12/2015] [Accepted: 10/30/2015] [Indexed: 11/19/2022] Open
Abstract
How to quantify the impact of a researcher's or an institution's body of work is a matter of increasing importance to scientists, funding agencies, and hiring committees. The use of bibliometric indicators, such as the h-index or the Journal Impact Factor, has become widespread despite their known limitations. We argue that most existing bibliometric indicators are inconsistent, biased, and, worst of all, susceptible to manipulation. Here, we pursue a principled approach to the development of an indicator to quantify the scientific impact of both individual researchers and research institutions grounded on the functional form of the distribution of the asymptotic number of citations. We validate our approach using the publication records of 1,283 researchers from seven scientific and engineering disciplines and the chemistry departments at the 106 U.S. research institutions classified as "very high research activity". Our approach has three distinct advantages. First, it accurately captures the overall scientific impact of researchers at all career stages, as measured by asymptotic citation counts. Second, unlike other measures, our indicator is resistant to manipulation and rewards publication quality over quantity. Third, our approach captures the time-evolution of the scientific impact of research institutions.
Affiliation(s)
- João A. G. Moreira
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, Illinois, United States of America
- Xiao Han T. Zeng
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, Illinois, United States of America
- Luís A. Nunes Amaral
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, Illinois, United States of America
- Department of Physics and Astronomy, Northwestern University, Evanston, Illinois, United States of America
- Northwestern Institute on Complex Systems, Northwestern University, Evanston, Illinois, United States of America
- Howard Hughes Medical Institute, Northwestern University, Evanston, Illinois, United States of America
|
41
|
Bohlin L, Viamontes Esquivel A, Lancichinetti A, Rosvall M. Robustness of journal rankings by network flows with different amounts of memory. J Assoc Inf Sci Technol 2015. [DOI: 10.1002/asi.23582] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 11/06/2022]
Affiliation(s)
- Ludvig Bohlin
- Integrated Science Lab; Department of Physics; Umeå University; Umeå SE-901 87 Sweden
- Andrea Lancichinetti
- Integrated Science Lab; Department of Physics; Umeå University; Umeå SE-901 87 Sweden
- Martin Rosvall
- Integrated Science Lab; Department of Physics; Umeå University; Umeå SE-901 87 Sweden
|
42
|
|
43
|
|
44
|
|
45
|
Assortative mixing, preferential attachment, and triadic closure: A longitudinal study of tie-generative mechanisms in journal citation networks. J Informetr 2015. [DOI: 10.1016/j.joi.2015.02.002] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Indexed: 11/21/2022]
|
46
|
van der Pol CB, McInnes MDF, Petrcich W, Tunis AS, Hanna R. Is quality and completeness of reporting of systematic reviews and meta-analyses published in high impact radiology journals associated with citation rates? PLoS One 2015; 10:e0119892. [PMID: 25775455 PMCID: PMC4361663 DOI: 10.1371/journal.pone.0119892] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Received: 07/15/2014] [Accepted: 01/20/2015] [Indexed: 01/01/2023] Open
Abstract
Purpose: The purpose of this study is to determine whether study quality and completeness of reporting of systematic reviews (SR) and meta-analyses (MA) published in high impact factor (IF) radiology journals are associated with citation rates. Methods: All SR and MA published in English between Jan 2007–Dec 2011, in radiology journals with an IF >2.75, were identified on Ovid MEDLINE. The Assessing the Methodologic Quality of Systematic Reviews (AMSTAR) checklist for study quality, and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) checklist for study completeness, were applied to each SR & MA. Each SR & MA was then searched in Google Scholar to yield a citation rate. Spearman correlation coefficients were used to assess the relationship between AMSTAR and PRISMA results with citation rate. Multivariate analyses were performed to account for the effect of journal IF and journal 5-year IF on correlation with citation rate. Values were reported as medians with interquartile range (IQR) provided. Results: 129 studies from 11 journals were included (50 SR and 79 MA). Median AMSTAR result was 8.0/11 (IQR: 5–9) and median PRISMA result was 23.0/27 (IQR: 21–25). The median citation rate for SR & MA was 0.73 citations/month post-publication (IQR: 0.40–1.17). There was a positive correlation between both AMSTAR and PRISMA results and SR & MA citation rate; ρ=0.323 (P=0.0002) and ρ=0.327 (P=0.0002) respectively. Positive correlation persisted for AMSTAR and PRISMA results after journal IF was partialed out; ρ=0.243 (P=0.006) and ρ=0.256 (P=0.004), and after journal 5-year IF was partialed out; ρ=0.235 (P=0.008) and ρ=0.243 (P=0.006) respectively. Conclusion: There is a positive correlation between the quality and the completeness of a reported SR or MA with citation rate which persists when adjusted for journal IF and journal 5-year IF.
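The two statistics named in this abstract, a Spearman rank correlation and a correlation with a third variable "partialed out", can be sketched in plain Python as below. The first-order partial-correlation formula applied to rank correlations is a standard textbook approach and is an assumption about how such an adjustment could be done, not a statement of the authors' exact procedure; the data are invented.

```python
import math


def ranks(values: list[float]) -> list[float]:
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x: list[float], y: list[float]) -> float:
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x: list[float], y: list[float]) -> float:
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical data: checklist score, citations per month, journal impact factor.
quality = [5, 7, 8, 9, 6, 10]
citation_rate = [0.3, 0.6, 0.9, 0.8, 0.5, 1.4]
impact_factor = [3.0, 3.5, 6.0, 4.0, 2.9, 5.0]

rho = spearman(quality, citation_rate)
rho_adjusted = partial_correlation(
    rho, spearman(quality, impact_factor), spearman(citation_rate, impact_factor)
)
print(f"rho = {rho:.3f}, rho with impact factor partialed out = {rho_adjusted:.3f}")
```

In practice a statistics package would also supply the P values reported in the abstract; this sketch covers only the coefficients.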
Affiliation(s)
- Matthew D. F. McInnes
- Department of Radiology, University of Ottawa, Ottawa, Ontario, Canada
- Clinical Epidemiology Program, Ottawa Hospital Research Institute, Ottawa, Ontario, Canada
- William Petrcich
- Clinical Epidemiology Program, Ottawa Hospital Research Institute, Ottawa, Ontario, Canada
- Adam S. Tunis
- Department of Radiology, University of Ottawa, Ottawa, Ontario, Canada
- Ramez Hanna
- Department of Radiology, University of Ottawa, Ottawa, Ontario, Canada
|
47
|
Abstract
Modeling distributions of citations to scientific papers is crucial for understanding how science develops. However, there is considerable empirical controversy over which statistical model fits citation distributions best. This paper is concerned with rigorous empirical detection of power-law behaviour in the distribution of citations received by the most highly cited scientific papers. We have used a large, novel data set on citations to scientific papers published between 1998 and 2002, drawn from Scopus. The power-law model is compared with a number of alternative models using a likelihood ratio test. We have found that the power-law hypothesis is rejected for around half of the Scopus fields of science. For these fields, the Yule, power-law with exponential cut-off and log-normal distributions seem to fit the data better than the pure power-law model. On the other hand, when the power-law hypothesis is not rejected, it is usually empirically indistinguishable from most of the alternative models. The pure power-law model seems to be the best model only for the most highly cited papers in “Physics and Astronomy”. Overall, our results seem to support theories implying that the most highly cited scientific papers follow the Yule, power-law with exponential cut-off or log-normal distribution. Our findings also suggest that power laws in citation distributions, when present, account for only a very small fraction of the published papers (less than 1% for most fields of science) and that the power-law scaling parameter (exponent) is substantially higher (from around 3.2 to around 4.7) than found in the older literature.
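To make the "power-law scaling parameter (exponent)" concrete, here is a minimal sketch of the standard continuous maximum-likelihood estimator for the exponent of a power-law tail, alpha = 1 + n / sum(ln(x_i / x_min)). Several things are assumptions: the abstract does not specify the estimator, citation counts are discrete so the continuous form is only an approximation, the threshold `x_min` is taken as given rather than estimated, and the function names are hypothetical. The sketch does not implement the likelihood ratio comparison against the Yule, cut-off or log-normal alternatives.

```python
import math


def fit_power_law_exponent(citations, x_min):
    """Continuous MLE of the power-law exponent for values >= x_min.

    Returns (alpha, standard_error, tail_size).
    """
    tail = [c for c in citations if c >= x_min]
    if len(tail) < 2:
        raise ValueError("need at least two observations at or above x_min")
    log_sum = sum(math.log(c / x_min) for c in tail)
    if log_sum == 0:
        raise ValueError("all tail values equal x_min; exponent is undefined")
    alpha = 1.0 + len(tail) / log_sum
    # Asymptotic standard error of the continuous MLE.
    standard_error = (alpha - 1.0) / math.sqrt(len(tail))
    return alpha, standard_error, len(tail)


def synthetic_tail(alpha, x_min, n):
    """Deterministic power-law sample built from evenly spaced quantiles.

    Uses the inverse CDF x = x_min * (1 - u) ** (-1 / (alpha - 1)).
    This is synthetic data for checking the estimator, not citation data.
    """
    return [x_min * (1.0 - (i + 0.5) / n) ** (-1.0 / (alpha - 1.0))
            for i in range(n)]


# Generate a synthetic tail with a known exponent and recover it.
sample = synthetic_tail(alpha=3.2, x_min=100.0, n=5000)
alpha_hat, se, n_tail = fit_power_law_exponent(sample, x_min=100.0)
print(f"estimated exponent: {alpha_hat:.3f} (standard error {se:.3f}, n = {n_tail})")
```

Only observations at or above `x_min` enter the fit, which is why a power law can describe the tail while covering a very small fraction of all papers.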
Affiliation(s)
- Michal Brzezinski
  - Faculty of Economic Sciences, University of Warsaw, Dluga 44/50, 00-241 Warsaw, Poland
48
Stern DI. High-ranked social science journal articles can be identified from early citation information. PLoS One 2014; 9:e112520. [PMID: 25390035 PMCID: PMC4229225 DOI: 10.1371/journal.pone.0112520] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 07/15/2014] [Accepted: 09/08/2014] [Indexed: 11/19/2022]
Abstract
Do citations accumulate too slowly in the social sciences to be used to assess the quality of recent articles? I investigate whether this is the case using citation data for all articles in economics and political science published in 2006 and indexed in the Web of Science. I find that citations in the first two years after publication explain more than half of the variation in cumulative citations received over a longer period. Journal impact factors improve the correlation between the predicted and actual future ranks of journal articles when using citation data from 2006 alone but the effect declines sharply thereafter. Finally, more than half of the papers in the top 20% in 2012 were already in the top 20% in the year of publication (2006).
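The final finding in this abstract, that more than half of the top 20% of papers in 2012 were already in the top 20% in 2006, is a statement about the overlap of two rankings. The sketch below shows one way such an overlap could be computed. The function names, the tie-breaking rule (earlier list position wins) and the citation counts are all assumptions made for illustration; the abstract does not describe the author's procedure.

```python
def top_indices(counts, fraction):
    """Indices of the papers in the top `fraction` by citation count.

    Ties are broken by list position (an arbitrary choice for this sketch).
    """
    k = max(1, round(fraction * len(counts)))
    ordered = sorted(range(len(counts)), key=lambda i: -counts[i])
    return set(ordered[:k])


def top_share_overlap(early_counts, later_counts, fraction=0.2):
    """Share of the later top group that was already in the early top group."""
    early_top = top_indices(early_counts, fraction)
    later_top = top_indices(later_counts, fraction)
    return len(early_top & later_top) / len(later_top)


# Invented citation counts for ten papers: first-year and cumulative.
first_year = [5, 0, 3, 9, 1, 7, 2, 8, 4, 6]
cumulative = [100, 1, 30, 90, 2, 10, 3, 80, 4, 5]

print("overlap of top 20%:", top_share_overlap(first_year, cumulative))
```

An overlap above 0.5 would correspond to the situation the abstract reports for economics and political science.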
Affiliation(s)
- David I. Stern
  - Crawford School of Public Policy, The Australian National University, Acton, Australian Capital Territory, Australia
49
Annalingam A, Damayanthi H, Jayawardena R, Ranasinghe P. Determinants of the citation rate of medical research publications from a developing country. SpringerPlus 2014; 3:140. [PMID: 25674441 PMCID: PMC4320248 DOI: 10.1186/2193-1801-3-140] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Received: 01/14/2014] [Accepted: 03/11/2014] [Indexed: 01/07/2023]
Abstract
Background: The number of citations received by an article is considered an objective marker of the importance and quality of the research work. The present study examines the determinants of citations for research articles published by Sri Lankan authors. Methods: Papers were selectively retrieved from the SciVerse Scopus® (Elsevier Properties S.A, USA) database for the 10 years from 1st January 1997 to 31st December 2006, of which 50% were selected for inclusion by simple random sampling. The primary outcome measure was citation rate (defined as the number of citations during the 2 years after publication). Citation data were collected using the SciVerse Scopus® Citation Analyzer and self-citations were excluded. A linear regression analysis was performed with ‘number of citations’ as the continuous dependent variable and the other factors as independent variables. Results: The number of publications increased steadily during the period of study. Over three quarters of the papers were published in international journals. More than half of the publications were research studies (55.3%), and most of the research studies were descriptive cross-sectional studies (27.1%). The mean number of citations within 2 years of publication was 1.7, and 52.1% of papers were not cited within the first two years of publication. The mean number of citations for collaborative studies (2.74) was significantly higher than that of non-collaborative studies (0.66). The mean number of citations did not change significantly depending on whether the publication had a positive result (2.08) or not (2.92), and was also not influenced by the presence (2.30) or absence (1.99) of the main study conclusion in the title of the article. In the linear regression model, the journal rank, number of authors, conducting the study abroad, being a research study or systematic review/meta-analysis and having regional and/or international collaboration all significantly increased the number of citations.
Conclusion: The journal rank, number of authors, conducting the study abroad, being a research study or systematic review/meta-analysis and having regional and/or international collaboration all significantly increased the number of citations. However, the presence of a positive result in the study did not influence the citation rate.
Affiliation(s)
- Anupama Annalingam
  - Department of Pharmacology, Faculty of Medicine, University of Colombo, Colombo, Sri Lanka
- Hasitha Damayanthi
  - Department of Pharmacology, Faculty of Medicine, University of Colombo, Colombo, Sri Lanka
- Ranil Jayawardena
  - Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane, Queensland, Australia
- Priyanga Ranasinghe
  - Department of Pharmacology, Faculty of Medicine, University of Colombo, Colombo, Sri Lanka
50
A proposal for a novel impact factor as an alternative to the JCR impact factor. Sci Rep 2013; 3:3410. [PMID: 24296521 PMCID: PMC3847704 DOI: 10.1038/srep03410] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 06/24/2013] [Accepted: 11/18/2013] [Indexed: 11/29/2022]
Abstract
One disadvantage of the JCR impact factor, the most commonly used assessment tool for ranking and evaluating scientific journals, is its inability to distinguish among different shapes of citation distribution curves, which leads to unfair evaluation of journals in some cases. This paper puts forward an alternative impact factor (IF′) that can properly reflect citation distributions. The two impact factors are linearly and positively correlated, and have roughly the same order of magnitude. Because IF′ can distinguish among different shapes of citation distribution curves, it may reflect the academic performance of a scientific journal in a way that differs from the JCR impact factor, with some unique features that reward journals with highly cited papers. Therefore, it is suggested that IF′ could be used to complement the JCR impact factor.