1
Intelligent Classification Method of Archive Data Based on Multigranular Semantics. Computational Intelligence and Neuroscience 2022; 2022:7559523. [PMID: 35607460 PMCID: PMC9124107 DOI: 10.1155/2022/7559523] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/17/2022] [Revised: 04/09/2022] [Accepted: 04/15/2022] [Indexed: 11/17/2022]
Abstract
With the rapid development of information technology, the volume of data held in digital archives has grown explosively. Mining and analyzing archive data effectively, and improving the intelligent management of newly added archives, has become an urgent problem. Existing archive classification is manual and driven by management needs; it is inefficient and ignores the content of the archives themselves. Discovering and using archive information also requires further analysis of the correlations between the contents of archive records. To meet the needs of intelligent archive management, this paper further analyzes manually classified archives from the perspective of their text content and proposes an intelligent classification method for archive data based on multigranular semantics. First, a semantic-label multigranular attention model is constructed: the outputs of the stacked expanded convolutional coding module and the label graph attention module are jointly fed into the multigranular attention mechanism network; the weighted labels output by that network serve as the input to a fully connected layer, which maps them to predicted labels; and the output of the fully connected layer is passed through a Sigmoid layer to obtain the predicted probability of each label. Second, the model is trained: the constructed model is trained on a multilabel data set, and its parameters are adjusted until it converges. Finally, given the multilabel data set to be classified as input, the trained semantic-label multigranular attention model outputs the classification result.
2
Garbulowski M, Diamanti K, Smolińska K, Baltzer N, Stoll P, Bornelöv S, Øhrn A, Feuk L, Komorowski J. R.ROSETTA: an interpretable machine learning framework. BMC Bioinformatics 2021; 22:110. [PMID: 33676405 PMCID: PMC7937228 DOI: 10.1186/s12859-021-04049-z] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 11/19/2020] [Accepted: 02/24/2021] [Indexed: 12/12/2022]
Abstract
BACKGROUND Machine learning involves strategies and algorithms that may assist bioinformatics analyses in terms of data mining and knowledge discovery. In several applications, viz. in Life Sciences, it is often more important to understand how a prediction was obtained than to know what prediction was made. To this end, so-called interpretable machine learning has recently been advocated. In this study, we implemented an interpretable machine learning package based on rough set theory. An important aim of our work was to provide statistical properties of the models and their components.
RESULTS We present the R.ROSETTA package, which is an R wrapper of the ROSETTA framework. The original ROSETTA functions have been improved and adapted to the R programming environment. The package allows for building and analyzing non-linear interpretable machine learning models. R.ROSETTA gathers combinatorial statistics via rule-based modelling for accessible and transparent results, well-suited for adoption within the greater scientific community. The package also provides statistics and visualization tools that facilitate minimization of analysis bias and noise. The R.ROSETTA package is freely available at https://github.com/komorowskilab/R.ROSETTA . To illustrate the usage of the package, we applied it to a transcriptome dataset from an autism case-control study. Our tool provided hypotheses for potential co-predictive mechanisms among features that discerned phenotype classes. These co-predictors represented neurodevelopmental and autism-related genes.
CONCLUSIONS R.ROSETTA provides new insights for interpretable machine learning analyses and knowledge-based systems. We demonstrated that our package facilitated detection of dependencies for autism-related genes. Although the sample application of R.ROSETTA illustrates transcriptome data analysis, the package can be used to analyze any data organized in decision tables.
Affiliation(s)
- Mateusz Garbulowski
  - Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
- Klev Diamanti
  - Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
  - Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden
- Karolina Smolińska
  - Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
- Nicholas Baltzer
  - Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
  - Department of Research, Cancer Registry of Norway, Oslo, Norway
- Patricia Stoll
  - Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
  - Department of Biosystems Science and Engineering, ETH Zurich, Zurich, Switzerland
- Susanne Bornelöv
  - Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
  - Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK
- Lars Feuk
  - Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden
- Jan Komorowski
  - Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
  - Swedish Collegium for Advanced Study, Uppsala, Sweden
  - Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland
  - Washington National Primate Research Center, Seattle, WA, USA
3
Gou H, Zhang X. Compromised multi-granulation rough sets based on an attribute-extension chain. Journal of Intelligent & Fuzzy Systems 2021. [DOI: 10.3233/jifs-200708] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/15/2022]
Abstract
The multi-granulation rough sets serve as important hierarchical models for intelligent systems. However, their mainstream optimistic and pessimistic models are respectively too loose and strict, and this defect becomes especially serious in hierarchical processing on an attribute-expansion sequence. Aiming at the attribute-addition chain, compromised multi-granulation rough set models are proposed to systematically complement and balance the optimistic and pessimistic models. According to the knowledge refinement and measure order induced by the attribute-enlargement sequence, the basic measurement positioning and corresponding pointer labeling based on equilibrium statistics are used, and thus we construct four types of compromised models at three levels of knowledge, approximation, and accuracy. At the knowledge level, the median positioning of ordered granulations derives Compromised-Model 1; at the approximation level, the average positioning of approximation cardinalities is performed, and thus the separation and integration of dual approximations respectively generate Compromised-Models 2 and 3; at the accuracy level, the average positioning of applied accuracies yields Compromised-Model 4. Compromised-Models 1–4 adopt distinctive cognitive levels and statistical perspectives to improve and perfect the multi-granulation rough sets, and their properties and effectiveness are finally verified by information systems and data experiments.
Affiliation(s)
- Hongyuan Gou
  - School of Mathematical Sciences, Sichuan Normal University, Chengdu, China
  - Institute of Intelligent Information and Quantum Information, Sichuan Normal University, Chengdu, China
- Xianyong Zhang
  - School of Mathematical Sciences, Sichuan Normal University, Chengdu, China
  - Institute of Intelligent Information and Quantum Information, Sichuan Normal University, Chengdu, China
5
EF_Unique: An Improved Version of Unsupervised Equal Frequency Discretization Method. Arabian Journal for Science and Engineering 2018. [DOI: 10.1007/s13369-018-3144-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 11/30/2022]