1
Perotti JI, Almeira N, Saracco F. Towards a generalization of information theory for hierarchical partitions. Phys Rev E 2020; 101:062148. [PMID: 32688491 DOI: 10.1103/physreve.101.062148] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 02/27/2020] [Accepted: 06/02/2020] [Indexed: 11/07/2022]
Abstract
Complex systems often exhibit multiple levels of organization covering a wide range of physical scales, so the study of the hierarchical decomposition of their structure and function is frequently convenient. To better understand this phenomenon, we introduce a generalization of information theory that works with hierarchical partitions. We begin by revisiting the recently introduced hierarchical mutual information (HMI) and show that it can be written as a level-by-level summation of classical conditional mutual information terms. Then, we prove that the HMI is bounded from above by the corresponding hierarchical joint entropy. In this way, in analogy to the classical case, we derive hierarchical generalizations of many other classical information-theoretic quantities. In particular, we prove that, as opposed to its classical counterpart, the hierarchical generalization of the variation of information is not a metric distance, but it admits a transformation into one. Moreover, focusing on potential applications of the existing developments of the theory, we show how to adjust the HMI for chance. We also corroborate and analyze all the presented theoretical results with exhaustive numerical computations, and include an illustrative application example of the introduced formalism. Finally, we mention some open problems that should eventually be addressed for the proposed generalization of information theory to reach maturity.
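For orientation, the classical (non-hierarchical) quantities that this abstract generalizes can be computed for two flat partitions as sketched below. This covers the classical case only and is not the authors' hierarchical formalism; the function names are illustrative assumptions, partitions are given as lists of cluster labels, and entropies use the natural logarithm.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (in nats) of a partition given as a list of cluster labels."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def joint_entropy(a, b):
    """Joint entropy H(A, B) of two partitions of the same items."""
    return entropy(list(zip(a, b)))


def mutual_information(a, b):
    """Classical mutual information I(A; B) = H(A) + H(B) - H(A, B)."""
    return entropy(a) + entropy(b) - joint_entropy(a, b)


def variation_of_information(a, b):
    """Classical variation of information VI(A, B) = 2 H(A, B) - H(A) - H(B)."""
    return 2 * joint_entropy(a, b) - entropy(a) - entropy(b)


# Two partitions of four items that share no information.
a = [0, 0, 1, 1]
b = [0, 1, 0, 1]
print(mutual_information(a, b))        # approximately 0
print(variation_of_information(a, b))  # approximately 2 * ln(2)
```

In the classical setting the mutual information is bounded above by the joint entropy and the variation of information satisfies the triangle inequality; the abstract states that the first property carries over to the hierarchical case while the second does not without a transformation.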
Affiliation(s)
- Juan Ignacio Perotti
  - Facultad de Matemática, Astronomía, Física y Computación, Universidad Nacional de Córdoba, Ciudad Universitaria, Córdoba, Argentina; Instituto de Física Enrique Gaviola (IFEG-CONICET), Ciudad Universitaria, Córdoba, Argentina
- Nahuel Almeira
  - Facultad de Matemática, Astronomía, Física y Computación, Universidad Nacional de Córdoba, Ciudad Universitaria, Córdoba, Argentina; Instituto de Física Enrique Gaviola (IFEG-CONICET), Ciudad Universitaria, Córdoba, Argentina
- Fabio Saracco
  - IMT School for Advanced Studies Lucca, Piazza San Francesco 19, I-55100, Lucca, Italy
2
Mapping the Landscape and Evolutions of Green Supply Chain Management. Sustainability 2018. [DOI: 10.3390/su10030597] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Indexed: 11/16/2022]
3
Letizia E, Barucca P, Lillo F. Resolution of ranking hierarchies in directed networks. PLoS One 2018; 13:e0191604. [PMID: 29394278 PMCID: PMC5796714 DOI: 10.1371/journal.pone.0191604] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 09/09/2017] [Accepted: 01/08/2018] [Indexed: 11/19/2022] Open
Abstract
Identifying hierarchies and rankings of nodes in directed graphs is fundamental in many applications such as social network analysis, biology, economics, and finance. A recently proposed method identifies the hierarchy by finding the ordered partition of nodes which minimises a score function, termed agony. This function penalises the links violating the hierarchy in a way depending on the strength of the violation. To investigate the resolution of ranking hierarchies we introduce an ensemble of random graphs, the Ranked Stochastic Block Model. We find that agony may fail to identify hierarchies when the structure is not strong enough and the size of the classes is small with respect to the whole network. We analytically characterise the resolution threshold and we show that an iterated version of agony can partly overcome this resolution limit.
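The abstract does not state the agony score explicitly. The sketch below assumes the commonly used definition, in which an edge u → v is expected to point from a lower rank to a strictly higher rank and otherwise incurs a penalty of rank(u) − rank(v) + 1. The function name, the edge-direction convention, and the toy graph are illustrative assumptions rather than details taken from the paper.

```python
def agony(edges, rank):
    """Total agony of a ranking, assuming the penalty max(rank[u] - rank[v] + 1, 0)
    for each directed edge (u, v). Edges that climb the hierarchy cost nothing;
    the further an edge points downwards, the larger its penalty."""
    return sum(max(rank[u] - rank[v] + 1, 0) for u, v in edges)


# A directed 3-cycle: no ranking can make every edge point upwards.
edges = [("a", "b"), ("b", "c"), ("c", "a")]

print(agony(edges, {"a": 0, "b": 1, "c": 2}))  # 3, all from the edge c -> a
print(agony(edges, {"a": 0, "b": 0, "c": 0}))  # 3, one unit per edge
```

Both rankings of the cycle give the same score, which illustrates how a weak hierarchical signal can leave the minimiser unable to distinguish a layered structure from a flat one.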
Affiliation(s)
- Elisa Letizia
  - Scuola Normale Superiore, Piazza dei Cavalieri 7, Pisa, 56126, Italy
- Paolo Barucca
  - University of Zurich, Schönberggasse 1, Zürich, 8001, Switzerland
  - LIMS, 35a South Street, London, W1K 2XF, United Kingdom
- Fabrizio Lillo
  - Department of Mathematics, University of Bologna, Piazza di Porta San Donato 5, Bologna, 40126, Italy
4
5
Abstract
Purpose
We present a systematic review of the literature concerning major aspects of science mapping to serve two primary purposes: First, to demonstrate the use of a science mapping approach to perform the review so that researchers may apply the procedure to the review of a scientific domain of their own interest, and second, to identify major areas of research activities concerning science mapping, intellectual milestones in the development of key specialties, evolutionary stages of major specialties involved, and the dynamics of transitions from one specialty to another.
Design/methodology/approach
We first introduce a theoretical framework of the evolution of a scientific specialty. Then we demonstrate a generic search strategy that can be used to construct a representative dataset of bibliographic records of a domain of research. Next, progressively synthesized co-citation networks are constructed and visualized to aid visual analytic studies of the domain’s structural and dynamic patterns and trends. Finally, trajectories of citations made by particular types of authors and articles are presented to illustrate the predictive potential of the analytic approach.
Findings
The evolution of science mapping research involves the development of a number of interrelated specialties. Four major specialties are discussed in detail in terms of four evolutionary stages: conceptualization, tool construction, application, and codification. Underlying connections between major specialties are also explored. The predictive analysis demonstrates citation trajectories of potentially transformative contributions.
Research limitations
The systematic review is primarily guided by citation patterns in the dataset retrieved from the literature. The scope of the data is limited by the source of the retrieval, i.e. the Web of Science, and the composite query used. An iterative query refinement is possible if one would like to improve the data quality, although the current approach serves our purpose adequately. More in-depth analyses of each specialty would be more revealing by incorporating additional methods such as citation context analysis and studies of other aspects of scholarly publications.
Practical implications
The underlying analytic process of science mapping serves many practical needs, notably bibliometric mapping, knowledge domain visualization, and visualization of scientific literature. In order to master such a complex process of science mapping, researchers often need to develop a diverse set of skills and knowledge that may span multiple disciplines. The approach demonstrated in this article provides a generic method for conducting a systematic review.
Originality/value
Incorporating the evolutionary stages of a specialty into the visual analytic study of a research domain is innovative. It provides a systematic methodology for researchers to achieve a good understanding of how scientific fields evolve, to recognize potentially insightful patterns from visually encoded signs, and to synthesize various information so as to capture the state of the art of the domain.
Affiliation(s)
- Chaomei Chen
  - College of Computing and Informatics, Drexel University, Philadelphia, PA 19104-2875, USA
6
Kitsak M, Papadopoulos F, Krioukov D. Latent geometry of bipartite networks. Phys Rev E 2017; 95:032309. [PMID: 28415237 DOI: 10.1103/physreve.95.032309] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 10/27/2016] [Indexed: 06/07/2023]
Abstract
Despite the abundance of bipartite networked systems, their organizing principles are less studied compared to unipartite networks. Bipartite networks are often analyzed after projecting them onto one of the two sets of nodes. As a result of the projection, nodes of the same set are linked together if they have at least one neighbor in common in the bipartite network. Even though these projections allow one to study bipartite networks using tools developed for unipartite networks, one-mode projections lead to significant loss of information and artificial inflation of the projected network with fully connected subgraphs. Here we pursue a different approach for analyzing bipartite systems that is based on the observation that such systems have a latent metric structure: network nodes are points in a latent metric space, while connections are more likely to form between nodes separated by shorter distances. This approach has been developed for unipartite networks, and relatively little is known about its applicability to bipartite systems. Here, we fully analyze a simple latent-geometric model of bipartite networks and show that this model explains the peculiar structural properties of many real bipartite systems, including the distributions of common neighbors and bipartite clustering. We also analyze the geometric information loss in one-mode projections in this model and propose an efficient method to infer the latent pairwise distances between nodes. Uncovering the latent geometry underlying real bipartite networks can find applications in diverse domains, ranging from constructing efficient recommender systems to understanding cell metabolism.
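The one-mode projection described in this abstract, where two nodes of the same set are linked if they share at least one neighbor, can be sketched as follows. This is a minimal illustration of the projection only, not of the latent-geometric model; the function name, the dictionary representation, and the toy data are assumptions.

```python
from itertools import combinations


def one_mode_projection(neighbors):
    """Project a bipartite network onto one of its node sets.

    `neighbors` maps each node of the chosen set to the set of its neighbors
    in the other set. Two nodes are linked in the projection if they have at
    least one neighbor in common."""
    edges = set()
    for u, v in combinations(sorted(neighbors), 2):
        if neighbors[u] & neighbors[v]:
            edges.add((u, v))
    return edges


# Nodes a, b, c all attach to bottom node 1; b and d share bottom node 2.
bipartite = {"a": {1}, "b": {1, 2}, "c": {1}, "d": {2}}
print(sorted(one_mode_projection(bipartite)))
# [('a', 'b'), ('a', 'c'), ('b', 'c'), ('b', 'd')]
```

The single bottom node 1 produces the fully connected triangle a, b, c in the projection, which is the inflation with fully connected subgraphs that the abstract mentions. The projection also no longer records how many neighbors a pair had in common.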
Affiliation(s)
- Maksim Kitsak
  - Department of Physics, Northeastern University, Boston, Massachusetts 02115, USA
- Fragkiskos Papadopoulos
  - Department of Electrical Engineering, Computer Engineering and Informatics, Cyprus University of Technology, 33 Saripolou Street, 3036 Limassol, Cyprus
- Dmitri Krioukov
  - Department of Physics, Department of Mathematics, Department of Electrical and Computer Engineering, Northeastern University, Boston, Massachusetts 02115, USA
7
Comparing the Hierarchy of Keywords in On-Line News Portals. PLoS One 2016; 11:e0165728. [PMID: 27802319 PMCID: PMC5089747 DOI: 10.1371/journal.pone.0165728] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 08/16/2016] [Accepted: 10/17/2016] [Indexed: 11/19/2022] Open
Abstract
Hierarchical organization is prevalent in networks representing a wide range of systems in nature and society. An important example is given by the tag hierarchies extracted from large on-line data repositories such as scientific publication archives, file sharing portals, blogs, on-line news portals, etc. The tagging of the stored objects with informative keywords in such repositories has become very common, and in most cases the tags on a given item are free words chosen by the authors independently. Therefore, the relations among keywords appearing in an on-line data repository are unknown in general. However, in most cases the topics and concepts described by these keywords are forming a latent hierarchy, with the more general topics and categories at the top, and more specialized ones at the bottom. There are several algorithms available for deducing this hierarchy from the statistical features of the keywords. In the present work we apply a recent, co-occurrence-based tag hierarchy extraction method to sets of keywords obtained from four different on-line news portals. The resulting hierarchies show substantial differences not just in the topics rendered as important (being at the top of the hierarchy) or of less interest (categorized low in the hierarchy), but also in the underlying network structure. This reveals discrepancies between the plausible keyword association frameworks in the studied news portals.
8
Popović M, Štefančić H, Sluban B, Kralj Novak P, Grčar M, Mozetič I, Puliga M, Zlatić V. Extraction of temporal networks from term co-occurrences in online textual sources. PLoS One 2014; 9:e99515. [PMID: 25470498 PMCID: PMC4254290 DOI: 10.1371/journal.pone.0099515] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 02/04/2014] [Accepted: 05/15/2014] [Indexed: 01/29/2023] Open
Abstract
A stream of unstructured news can be a valuable source of hidden relations between different entities, such as financial institutions, countries, or persons. We present an approach to continuously collect online news, recognize relevant entities in them, and extract time-varying networks. The nodes of the network are the entities, and the links are their co-occurrences. We present a method to estimate the significance of co-occurrences, and a benchmark model against which their robustness is evaluated. The approach is applied to a large set of financial news, collected over a period of two years. The entities we consider are 50 countries which issue sovereign bonds, and which are insured by Credit Default Swaps (CDS) in turn. We compare the country co-occurrence networks to the CDS networks constructed from the correlations between the CDS. The results show relatively small, but significant overlap between the networks extracted from the news and those from the CDS correlations.
Affiliation(s)
- Marko Popović
  - Theoretical Physics Division, Rudjer Bošković Institute, P.O. Box 180, HR-10002, Zagreb, Croatia
- Hrvoje Štefančić
  - Theoretical Physics Division, Rudjer Bošković Institute, P.O. Box 180, HR-10002, Zagreb, Croatia
  - Catholic University of Croatia, Zagreb, Croatia
- Miha Grčar
  - Jožef Stefan Institute, Ljubljana, Slovenia
- Vinko Zlatić
  - Theoretical Physics Division, Rudjer Bošković Institute, P.O. Box 180, HR-10002, Zagreb, Croatia