1. Explaining Aha! moments in artificial agents through IKE-XAI: Implicit Knowledge Extraction for eXplainable AI. Neural Netw 2022; 155:95-118. [DOI: 10.1016/j.neunet.2022.08.002]
2. Extracting automata from recurrent neural networks using queries and counterexamples (extended version). Mach Learn 2022. [DOI: 10.1007/s10994-022-06163-2]
3. Kaadoud IC, Rougier NP, Alexandre F. Knowledge extraction from the learning of sequences in a long short term memory (LSTM) architecture. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2021.107657]
4. Oliva C, Lago-Fernández LF. Stability of internal states in recurrent neural networks trained on regular languages. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.04.058]
5. Jun SY, Shin HS. Analysis and Prediction of Surface Condition of Artificial Skin Based on CNN and ConvLSTM. Biotechnol Bioproc E 2021. [DOI: 10.1007/s12257-020-0253-9]
6. Distillation of weighted automata from recurrent neural networks using a spectral approach. Mach Learn 2021. [DOI: 10.1007/s10994-021-05948-1]
7. Xu Z, Wen C, Qin S, He M. Extracting automata from neural networks using active learning. PeerJ Comput Sci 2021; 7:e436. [PMID: 33977128] [PMCID: PMC8064235] [DOI: 10.7717/peerj-cs.436]
Abstract
Deep learning is one of the most advanced forms of machine learning. Most modern deep learning models are based on artificial neural networks, and benchmarking studies reveal that neural networks have produced results comparable to, and in some cases superior to, those of human experts. However, the resulting neural networks are typically regarded as incomprehensible black-box models, which not only limits their applications but also hinders testing and verification. In this paper, we present an active learning framework to extract automata from neural network classifiers, which can help users understand the classifiers. In more detail, we use Angluin's L* algorithm as a learner and the neural network under learning as an oracle, employing an abstract interpretation of the neural network to answer membership and equivalence queries. Our abstraction consists of value, symbol and word abstractions. The factors that may affect the abstraction are also discussed. We have implemented our approach in a prototype. To evaluate it, we ran the prototype on an MNIST classifier and identified that the abstraction with interval number 2 and block size 1 × 28 offers the best performance in terms of F1 score. We have also compared our extracted DFA against DFAs learned via the passive learning algorithms provided in LearnLib; the experimental results show that our DFA performs better on the MNIST dataset.
Affiliation(s)
- Zhiwu Xu: College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
- Cheng Wen: College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
- Shengchao Qin: College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China; School of Computing, Engineering and Digital Technologies, Teesside University, Middlesbrough, United Kingdom
- Mengda He: School of Computing, Engineering and Digital Technologies, Teesside University, Middlesbrough, United Kingdom
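The active-learning setup described in entry 7 (L* as learner, the network as membership oracle, equivalence queries answered approximately) can be sketched in miniature. This is a textbook L* skeleton under my own assumptions, not the authors' prototype: the oracle here is any black-box boolean function over strings, and the equivalence query is approximated by exhaustive testing up to a length bound, as is usual when the teacher is a neural network.

```python
# Minimal Angluin-style L* sketch (assumed names; not the paper's code).
from itertools import product

def lstar(member, alphabet, max_test_len=8):
    S, E = [""], [""]                      # access prefixes and test suffixes

    def row(s):
        return tuple(member(s + e) for e in E)

    def build():
        # Close the observation table: every one-letter extension of a known
        # prefix must land on a row we already represent.
        while True:
            rows = {row(s) for s in S}
            new = [s + a for s in S for a in alphabet if row(s + a) not in rows]
            if not new:
                break
            S.append(new[0])
        rep = {row(s): s for s in S}       # one representative prefix per state
        trans = {(r, a): row(rep[r] + a) for r in rep for a in alphabet}
        q0 = row("")

        def accepts(w):
            q = q0
            for a in w:
                q = trans[(q, a)]
            return q[0]                    # E[0] == "", so q[0] is acceptance
        return accepts

    while True:
        accepts = build()
        # Approximate equivalence query: test all words up to max_test_len.
        cex = next((w for n in range(max_test_len + 1)
                    for w in map("".join, product(alphabet, repeat=n))
                    if accepts(w) != member(w)), None)
        if cex is None:
            return accepts                 # hypothesis agrees on all test words
        for i in range(len(cex)):          # add all suffixes of the
            if cex[i:] not in E:           # counterexample as new tests
                E.append(cex[i:])
```

For example, learning from a membership oracle for "even number of a's" recovers that language's two-state DFA.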
8. Synthesizing Context-free Grammars from Recurrent Neural Networks. Tools and Algorithms for the Construction and Analysis of Systems (TACAS) 2021. [PMCID: PMC7979173] [DOI: 10.1007/978-3-030-72016-2_19]
Abstract
We present an algorithm for extracting a subclass of the context-free grammars (CFGs) from a trained recurrent neural network (RNN). We develop a new framework, pattern rule sets (PRSs), which describe sequences of deterministic finite automata (DFAs) that approximate a non-regular language. We present an algorithm for recovering the PRS behind a sequence of such automata, and apply it to the sequences of automata extracted from trained RNNs using the L* algorithm. We then show how the PRS may be converted into a CFG, enabling a familiar and useful presentation of the learned language. Extracting the learned language of an RNN is important to facilitate understanding of the RNN and to verify its correctness. Furthermore, the extracted CFG can augment the RNN in classifying correct sentences, as the RNN's predictive accuracy decreases when the recursion depth and the distance between matching delimiters of its input sequences increase.
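Entry 8's premise, that a non-regular language can only be approximated by a sequence of ever-larger DFAs, is easy to see concretely. The toy below is my own illustration (not the paper's PRS algorithm): a saturating-counter DFA recognizes balanced parentheses only up to its nesting cap, and fails beyond it, which is exactly why a single finite automaton cannot capture the CFG.

```python
# Illustration (assumed toy, not the paper's method): a finite-state
# approximation of the Dyck language of balanced parentheses.

def dyck_member(w):
    """True membership in the Dyck language of balanced parentheses."""
    depth = 0
    for c in w:
        depth += 1 if c == "(" else -1
        if depth < 0:
            return False
    return depth == 0

def capped_dfa(cap):
    """A DFA with states 0..cap+1: a counter that saturates at `cap`.
    Nesting deeper than `cap` falls into a sink state, so the DFA errs there."""
    def accepts(w):
        depth = 0
        for c in w:
            if c == "(":
                depth = min(depth + 1, cap + 1)   # cap+1 acts as a reject sink
            else:
                if depth == 0 or depth == cap + 1:
                    return False
                depth -= 1
        return depth == 0
    return accepts
```

A cap-2 automaton misclassifies "((()))" (depth 3) while a cap-3 automaton gets it right; raising the cap yields the increasingly refined DFA sequence that a PRS-style analysis would summarize.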
9. Learning behavioral models by recurrent neural networks with discrete latent representations with application to a flexible industrial conveyor. Comput Ind 2020. [DOI: 10.1016/j.compind.2020.103263]
10. Townsend J, Chaton T, Monteiro JM. Extracting Relational Explanations From Deep Neural Networks: A Survey From a Neural-Symbolic Perspective. IEEE Trans Neural Netw Learn Syst 2020; 31:3456-3470. [PMID: 31689216] [DOI: 10.1109/tnnls.2019.2944672]
Abstract
The term "explainable AI" refers to the goal of producing artificially intelligent agents that are capable of providing explanations for their decisions. Some models (e.g., rule-based systems) are designed to be explainable, while others are less explicit "black boxes" whose reasoning remains a mystery. One example of the latter is the neural network, and over the past few decades, researchers in the field of neural-symbolic integration (NSI) have sought to extract relational knowledge from such networks. Extraction from deep neural networks, however, remained a challenge until recent years, in which many methods of extracting distinct, salient features from the input or hidden feature spaces of deep neural networks have been proposed. Furthermore, methods of identifying relationships between these features have also emerged. This article presents examples of old and new developments in extracting relational explanations in order to argue that the latter have analogies in the former and, as such, can be described in terms of long-established taxonomies and frameworks presented in the early neural-symbolic literature. We also outline potential future research directions that come to light from this refreshed perspective.
11. Mantas CJ. Interpretation of first-order recurrent neural networks by means of fuzzy rules. J Intell Fuzzy Syst 2019. [DOI: 10.3233/jifs-190215]
Affiliation(s)
- C.J. Mantas: Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain
12. Wang Q, Zhang K, Ororbia AG II, Xing X, Liu X, Giles CL. An Empirical Evaluation of Rule Extraction from Recurrent Neural Networks. Neural Comput 2018; 30:2568-2591. [PMID: 30021081] [DOI: 10.1162/neco_a_01111]
Abstract
Rule extraction from black box models is critical in domains that require model validation before implementation, as can be the case in credit scoring and medical diagnosis. Though already a challenging problem in statistical learning in general, the difficulty is even greater when highly nonlinear, recursive models, such as recurrent neural networks (RNNs), are fit to data. Here, we study the extraction of rules from second-order RNNs trained to recognize the Tomita grammars. We show that production rules can be stably extracted from trained RNNs and that in certain cases, the rules outperform the trained RNNs.
Affiliation(s)
- Kaixuan Zhang: Pennsylvania State University, State College, PA 16801, U.S.A.
- Xinyu Xing: Pennsylvania State University, State College, PA 16801, U.S.A.
- Xue Liu: McGill University, Montreal, Quebec H3A 0G4, Canada
- C. Lee Giles: Pennsylvania State University, State College, PA 16801, U.S.A.
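The rule-extraction family surveyed in entry 12 typically quantizes the RNN's continuous hidden state and reads transitions off the quantized dynamics. A minimal sketch of that mechanics, under loud assumptions: the "network" below is a hand-built one-unit dynamical system that flips its state's sign on symbol '1' (so it computes the parity of '1's), and the partition of the state space is simply by sign; a trained second-order RNN and a learned clustering would replace both in practice.

```python
# Toy quantization-based FSM extraction (assumed stand-in, not the paper's code).
from itertools import product

def step(h, sym):
    return -h if sym == "1" else h        # continuous state update

def output(h):
    return h > 0                          # accept while the state is positive

def extract_fsm(step, output, h0, alphabet, max_len=5):
    """Drive the system with all strings up to max_len, quantize each visited
    state, and record the induced transitions and acceptance labels."""
    quantize = lambda h: 0 if h > 0 else 1   # the state-space partition
    trans, accept = {}, {}
    for n in range(max_len + 1):
        for w in product(alphabet, repeat=n):
            h = h0
            for sym in w:
                nh = step(h, sym)
                trans[(quantize(h), sym)] = quantize(nh)
                h = nh
            accept[quantize(h)] = output(h)
    return trans, accept

def run_fsm(trans, accept, q0, w):
    """Execute the extracted finite-state machine on a word."""
    q = q0
    for sym in w:
        q = trans[(q, sym)]
    return accept[q]
```

Here the extracted two-state machine reproduces the dynamical system exactly; with a real RNN the interesting question, which the paper evaluates empirically, is how faithful such an extraction remains.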
13. Kim ZM, Oh H, Kim HG, Lim CG, Oh KJ, Choi HJ. Modeling long-term human activeness using recurrent neural networks for biometric data. BMC Med Inform Decis Mak 2017; 17:57. [PMID: 28539116] [PMCID: PMC5444042] [DOI: 10.1186/s12911-017-0453-1]
Abstract
Background: With the invention of fitness trackers, it has become possible to continuously monitor a user's biometric data such as heart rate, number of footsteps taken, and amount of calories burned. This paper names the time series of these three types of biometric data the user's "activeness", and investigates the feasibility of modeling and predicting the long-term activeness of the user.
Methods: The dataset used in this study consisted of several months of biometric time-series data gathered by seven users independently. Four recurrent neural network (RNN) architectures, as well as a deep neural network and a simple regression model, were proposed to investigate the performance of predicting the activeness of the user under various length-related hyper-parameter settings. In addition, the learned model was tested to predict the time period when the user's activeness falls below a certain threshold.
Results: A preliminary experimental result shows that each type of activeness data exhibited a short-term autocorrelation, and among the three types of data, the consumed calories and the number of footsteps were positively correlated, while the heart rate data showed almost no correlation with either of them. It is probably due to this characteristic of the dataset that, although the RNN models produced the best results on modeling the user's activeness, the difference was marginal, and other baseline models, especially the linear regression model, performed quite admirably as well. Further experimental results show that it is feasible to predict a user's future activeness with precision; for example, a trained RNN model could predict, with a precision of 84%, when the user would be less active within the next hour given the latest 15 min of activeness data.
Conclusions: This paper defines and investigates the notion of a user's "activeness", and shows that forecasting the long-term activeness of the user is indeed possible. Such information can be utilized by a health-related application to proactively recommend suitable events or services to the user.
Affiliation(s)
- Zae Myung Kim: School of Computing, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon, 34141, South Korea
- Hyungrai Oh: Samsung Seoul R&D Campus, Samsung Electronics, 33 Seongchon-gil, Seocho-gu, Seoul, 06765, South Korea
- Han-Gyu Kim: School of Computing, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon, 34141, South Korea
- Chae-Gyun Lim: School of Computing, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon, 34141, South Korea
- Kyo-Joong Oh: School of Computing, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon, 34141, South Korea
- Ho-Jin Choi: School of Computing, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon, 34141, South Korea
14. Whiteside D, Reid M. Spatial characteristics of professional tennis serves with implications for serving aces: A machine learning approach. J Sports Sci 2016; 35:648-654. [PMID: 27189847] [DOI: 10.1080/02640414.2016.1183805]
Abstract
This study sought to determine the features of an ideal serve in men's professional tennis. A total of 25,680 first serves executed by 151 male tennis players during Australian Open competition were classified as either aces or returned into play. Spatiotemporal (impact location, speed, projection angles, landing location and relative player locations) and contextual (score) features of each serve were extracted from Hawk-Eye data and used to construct a classification tree model (with decision rules) that predicted serve outcome. k-means clustering was applied to the landing locations to quantify optimal landing locations for aces. The classification tree revealed that (1) serve directionality, relative to the returner; (2) the ball's landing proximity to the nearest service box line and (3) serve speed classified aces with an accuracy of 87.02%. Hitting aces appeared more contingent on accuracy than speed, with serves directed >5.88° from the returner and landing <15.27 cm from a service box line most indicative of an ace. k-means clustering revealed four distinct locations (≈0.73 m wide × 2.35 m deep) in the corners of the service box that corresponded to aces. These landing locations provide empirically derived target locations for players to adhere to during practice and competition.
Affiliation(s)
- David Whiteside: Game Insight Group, Tennis Australia, Melbourne, Australia; Institute of Sport, Exercise and Active Living, Victoria University, Melbourne, Australia
- Machar Reid: Game Insight Group, Tennis Australia, Melbourne, Australia; School of Sport Science, Exercise and Health, University of Western Australia, Crawley, Australia
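The two strongest splits reported in entry 14's abstract translate directly into an explicit decision rule. The sketch below is an illustrative simplification on my part: the abstract states only these two thresholds, so the tree's full structure and its serve-speed split are omitted here.

```python
# Hedged sketch of the abstract's headline decision rule (not the fitted tree).
def ace_likely(angle_from_returner_deg, dist_to_line_cm):
    """Per the abstract, serves directed >5.88 degrees from the returner AND
    landing <15.27 cm from a service-box line were most indicative of an ace."""
    return angle_from_returner_deg > 5.88 and dist_to_line_cm < 15.27
```

This readability is the point of rule extraction: a coach can act on two thresholds, whereas the underlying Hawk-Eye feature space cannot be inspected directly.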
15.
16.
17. Robnik-Šikonja M, Kononenko I, Štrumbelj E. Quality of classification explanations with PRBF. Neurocomputing 2012. [DOI: 10.1016/j.neucom.2011.10.038]
18. Huynh TQ, Reggia JA. Symbolic representation of recurrent neural network dynamics. IEEE Trans Neural Netw Learn Syst 2012; 23:1649-1658. [PMID: 24808009] [DOI: 10.1109/tnnls.2012.2210242]
Abstract
Simple recurrent error backpropagation networks have been widely used to learn temporal sequence data, including regular and context-free languages. However, the production of relatively large and opaque weight matrices during learning has inspired substantial research on how to extract symbolic, human-readable interpretations from trained networks. Unlike feedforward networks, where research has focused mainly on rule extraction, most past work with recurrent networks has viewed them as dynamical systems that can be approximated symbolically by finite-state machines (FSMs). With this approach, the network's hidden layer activation space is typically divided into a finite number of regions. Past research has mainly focused on better techniques for dividing up this activation space. In contrast, very little work has tried to influence the network training process to produce a better representation in hidden layer activation space, and that which has been done has had only limited success. Here we propose a powerful general technique to bias the error backpropagation training process so that it learns an activation space representation from which it is easier to extract FSMs. Using four publicly available data sets based on regular and context-free languages, we show via computational experiments that the modified learning method helps to extract FSMs with substantially fewer states and less variance than unmodified backpropagation learning, without decreasing the neural networks' accuracy. We conclude that modifying error backpropagation so that it more effectively separates learned pattern encodings in the hidden layer is an effective way to improve contemporary FSM extraction methods.
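One common way to bias backpropagation toward the well-separated hidden states that entry 18 calls for is to add a penalty that pushes sigmoid activations away from 0.5 and toward the saturated values {0, 1}. The term below is my own illustration of that general idea, not the paper's exact modification.

```python
# Hedged sketch of a hidden-state separation penalty (assumed form).
def separation_penalty(hidden):
    """Sum of h*(1-h) over sigmoid activations: maximal at h=0.5,
    zero when every activation is saturated at 0 or 1."""
    return sum(h * (1.0 - h) for h in hidden)

def separation_grad(hidden):
    """d/dh of the penalty; added (scaled) to the usual error gradient."""
    return [1.0 - 2.0 * h for h in hidden]
```

Driving this term down clusters hidden activations near hypercube corners, so a subsequent state-space partition yields fewer, cleaner FSM states.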
19. Liu S, Patel RY, Daga PR, Liu H, Fu G, Doerksen RJ, Chen Y, Wilkins DE. Combined rule extraction and feature elimination in supervised classification. IEEE Trans Nanobioscience 2012; 11:228-36. [PMID: 22987128] [PMCID: PMC6295448] [DOI: 10.1109/tnb.2012.2213264]
Abstract
There are a vast number of biology-related research problems that involve combining multiple sources of data to achieve a better understanding of the underlying problems, and it is important to select and interpret the most important information from these sources. It is therefore beneficial to have an algorithm that simultaneously extracts rules and selects features, for better interpretation of the predictive model. We propose an efficient algorithm, Combined Rule Extraction and Feature Elimination (CRF), based on 1-norm regularized random forests. CRF simultaneously extracts a small number of rules generated by random forests and selects important features. We applied CRF to several drug activity prediction and microarray data sets. CRF is capable of producing performance comparable with state-of-the-art prediction algorithms using a small number of decision rules. Some of the decision rules are biologically significant.
Affiliation(s)
- Sheng Liu: Department of Computer and Information Science, University of Mississippi, University, MS 38677, USA
20.
21. Augasta MG, Kathirvalavakumar T. Reverse Engineering the Neural Networks for Rule Extraction in Classification Problems. Neural Process Lett 2011. [DOI: 10.1007/s11063-011-9207-8]
22. Al-Shyoukh I, Yu F, Feng J, Yan K, Dubinett S, Ho CM, Shamma JS, Sun R. Systematic quantitative characterization of cellular responses induced by multiple signals. BMC Syst Biol 2011; 5:88. [PMID: 21624115] [PMCID: PMC3138445] [DOI: 10.1186/1752-0509-5-88]
Abstract
Background: Cells constantly sense many internal and environmental signals and respond through their complex signaling network, leading to particular biological outcomes. However, a systematic characterization and optimization of multi-signal responses remains a pressing challenge for traditional experimental approaches, due to the complexity that arises as the number of signals and their intensities increase.
Results: We established and validated a data-driven mathematical approach to systematically characterize signal-response relationships. Our results demonstrate how mathematical learning algorithms can enable systematic characterization of multi-signal induced biological activities. The proposed approach enables identification of input combinations that can result in desired biological responses. In retrospect, the results show that, unlike a single drug, a properly chosen combination of drugs can lead to a significant difference in the responses of different cell types, increasing the differential targeting of certain combinations. The successful validation of identified combinations demonstrates the power of this approach. Moreover, the approach enables examining the efficacy of all lower-order mixtures of the tested signals, as well as identification of system-level signaling interactions between the applied signals. Many of the signaling interactions identified were consistent with the literature, and other previously unknown interactions emerged.
Conclusions: This approach can facilitate the development of systems biology and of optimal drug combination therapies for cancer and other diseases, and can aid understanding of key interactions within the cellular network upon treatment with multiple signals.
Affiliation(s)
- Ibrahim Al-Shyoukh: Department of Molecular and Medical Pharmacology, University of California at Los Angeles, Los Angeles, CA 90095, USA
23. A new data mining scheme using artificial neural networks. Sensors 2011; 11:4622-47. [PMID: 22163866] [PMCID: PMC3231400] [DOI: 10.3390/s110504622]
Abstract
Classification is one of the data mining problems receiving enormous attention in the database community. Although artificial neural networks (ANNs) have been successfully applied in a wide range of machine learning applications, they are often regarded as black boxes, i.e., their predictions cannot be explained. Indeed, ANN methods have not been effectively utilized for data mining tasks because how their classifications are made is not explicitly stated as symbolic rules suitable for verification or interpretation by human experts. To enhance the explainability of ANNs, this paper proposes a novel algorithm to extract symbolic rules from trained ANNs. With the proposed approach, concise, easily explainable symbolic rules with high accuracy can be extracted. The extracted rules are comparable with those of other methods in terms of the number of rules, the average number of conditions per rule, and accuracy. The effectiveness of the proposed approach is clearly demonstrated by experimental results on a set of benchmark data mining classification problems.
24. Ao S, Palade V. Ensemble of Elman neural networks and support vector machines for reverse engineering of gene regulatory networks. Appl Soft Comput 2011. [DOI: 10.1016/j.asoc.2010.05.014]
25. Huynh TQ, Reggia JA. Guiding hidden layer representations for improved rule extraction from neural networks. IEEE Trans Neural Netw 2010; 22:264-75. [PMID: 21138801] [DOI: 10.1109/tnn.2010.2094205]
Abstract
The production of relatively large and opaque weight matrices by error backpropagation learning has inspired substantial research on how to extract symbolic human-readable rules from trained networks. While considerable progress has been made, the results at present are still relatively limited, in part due to the large numbers of symbolic rules that can be generated. Most past work to address this issue has focused on progressively more powerful methods for rule extraction (RE) that try to minimize the number of weights and/or improve rule expressiveness. In contrast, here we take a different approach in which we modify the error backpropagation training process so that it learns a different hidden layer representation of input patterns than would normally occur. Using five publicly available datasets, we show via computational experiments that the modified learning method helps to extract fewer rules without increasing individual rule complexity and without decreasing classification accuracy. We conclude that modifying error backpropagation so that it more effectively separates learned pattern encodings in the hidden layer is an effective way to improve contemporary RE methods.
Affiliation(s)
- Thuan Q. Huynh: Department of Computer Science, University of Maryland, College Park, MD 20742, USA
26. Freitas AA, Wieser DC, Apweiler R. On the importance of comprehensible classification models for protein function prediction. IEEE/ACM Trans Comput Biol Bioinform 2010; 7:172-182. [PMID: 20150679] [DOI: 10.1109/tcbb.2008.47]
Abstract
The literature on protein function prediction is currently dominated by works aimed at maximizing predictive accuracy, ignoring the important issues of validation and interpretation of discovered knowledge, which can lead to new insights and hypotheses that are biologically meaningful and advance the understanding of protein functions by biologists. The overall goal of this paper is to critically evaluate this approach, offering a refreshing new perspective on this issue, focusing not only on predictive accuracy but also on the comprehensibility of the induced protein function prediction models. More specifically, this paper aims to offer two main contributions to the area of protein function prediction. First, it presents the case for discovering comprehensible protein function prediction models from data, discussing in detail the advantages of such models, namely, increasing the confidence of the biologist in the system's predictions, leading to new insights about the data and the formulation of new biological hypotheses, and detecting errors in the data. Second, it presents a critical review of the pros and cons of several different knowledge representations that can be used in order to support the discovery of comprehensible protein function prediction models.
Affiliation(s)
- Alex A. Freitas: Computing Laboratory, University of Kent, Canterbury, UK
27. Hengjie S, Chunyan M, Zhiqi S, Yuan M, Lee BS. A fuzzy neural network with fuzzy impact grades. Neurocomputing 2009. [DOI: 10.1016/j.neucom.2009.03.009]
28. Starzyk J, He H. Spatio-Temporal Memories for Machine Learning: A Long-Term Memory Organization. IEEE Trans Neural Netw 2009; 20:768-80. [DOI: 10.1109/tnn.2009.2012854]
29. Evolving rule induction algorithms with multi-objective grammar-based genetic programming. Knowl Inf Syst 2008. [DOI: 10.1007/s10115-008-0171-1]
30.
31. Kolman E, Margaliot M. A new approach to knowledge-based design of recurrent neural networks. IEEE Trans Neural Netw 2008; 19:1389-401. [PMID: 18701369] [DOI: 10.1109/tnn.2008.2000393]
Abstract
A major drawback of artificial neural networks (ANNs) is their black-box character. This is especially true for recurrent neural networks (RNNs) because of their intricate feedback connections. In particular, given a problem and some initial information concerning its solution, it is not at all obvious how to design an RNN that is suitable for solving this problem. In this paper, we consider a fuzzy rule base with a special structure, referred to as the fuzzy all-permutations rule base (FARB). Inferring the FARB yields an input-output (IO) mapping that is mathematically equivalent to that of an RNN. We use this equivalence to develop two new knowledge-based design methods for RNNs. The first method, referred to as the direct approach, is based on stating the desired functioning of the RNN in terms of several sets of symbolic rules, each one corresponding to a subnetwork. Each set is then transformed into a suitable FARB. The second method is based on first using the direct approach to design a library of simple modules, such as counters or comparators, and realize them using RNNs. Once designed, the correctness of each RNN can be verified. Then, the initial design problem is solved by using these basic modules as building blocks. This yields a modular and systematic approach for knowledge-based design of RNNs. We demonstrate the efficiency of these approaches by designing RNNs that recognize both regular and nonregular formal languages.
Affiliation(s)
- Eyal Kolman: School of Electrical Engineering-Systems, Tel Aviv University, Tel Aviv 69978, Israel
32. Cartling B. On the implicit acquisition of a context-free grammar by a simple recurrent neural network. Neurocomputing 2008. [DOI: 10.1016/j.neucom.2007.05.006]
33. A generic fuzzy aggregation operator: rules extraction from and insertion into artificial neural networks. Soft Comput 2007. [DOI: 10.1007/s00500-007-0221-8]
34.
35.
Abstract
This letter presents an algorithm, CrySSMEx, for extracting minimal finite state machine descriptions of dynamic systems such as recurrent neural networks. Unlike previous algorithms, CrySSMEx is parameter free and deterministic, and it efficiently generates a series of increasingly refined models. A novel finite stochastic model of dynamic systems and a novel vector quantization function have been developed to take into account the state-space dynamics of the system. The experiments show that (1) extraction from systems that can be described as regular grammars is trivial, (2) extraction from high-dimensional systems is feasible, and (3) extraction of approximative models from chaotic systems is possible. The results are promising, and an analysis of shortcomings suggests some possible further improvements. Some largely overlooked connections between the field of rule extraction from recurrent neural networks and other fields are also identified.
36.
Abstract
We investigate possibilities of inducing temporal structures without fading memory in recurrent networks of spiking neurons strictly operating in the pulse-coding regime. We extend the existing gradient-based algorithm for training feedforward spiking neuron networks, SpikeProp (Bohte, Kok, & La Poutré, 2002), to recurrent network topologies, so that temporal dependencies in the input stream are taken into account. It is shown that temporal structures with unbounded input memory specified by simple Moore machines (MM) can be induced by recurrent spiking neuron networks (RSNN). The networks are able to discover pulse-coded representations of abstract information processing states coding potentially unbounded histories of processed inputs. We show that it is often possible to extract from trained RSNN the target MM by grouping together similar spike trains appearing in the recurrent layer. Even when the target MM was not perfectly induced in a RSNN, the extraction procedure was able to reveal weaknesses of the induced mechanism and the extent to which the target machine had been learned.
Affiliation(s)
- Ashely J. S. Mills: School of Computer Science, University of Birmingham, Birmingham B15 2TT, U.K.