201
Neural correlates of variations in event processing during learning in basolateral amygdala. J Neurosci 2010; 30:2464-71. [PMID: 20164330] [DOI: 10.1523/jneurosci.5781-09.2010]
Abstract
The discovery that dopamine neurons signal errors in reward prediction has demonstrated that concepts empirically derived from the study of animal behavior can be used to understand the neural implementation of reward learning. Yet the learning theory models linked to phasic dopamine activity treat attention to events such as cues and rewards as static quantities; other models, such as Pearce-Hall, propose that learning might be influenced by variations in processing of these events. A key feature of these accounts is that event processing is modulated by unsigned rather than signed reward prediction errors. Here we tested whether neural activity in rat basolateral amygdala conforms to this pattern by recording single units in a behavioral task in which rewards were unexpectedly delivered or omitted. We report that neural activity at the time of reward provides an unsigned error signal with characteristics consistent with those postulated by these models. This neural signal increased immediately after a change in reward, and stronger firing was evident whether the value of the reward increased or decreased. Further, as predicted by these models, the change in firing developed over several trials as expectations for reward were repeatedly violated. This neural signal was correlated with faster orienting to predictive cues after changes in reward, and abolition of the signal by inactivation of basolateral amygdala disrupted this change in orienting and retarded learning in response to changes in reward. These results suggest that basolateral amygdala serves a critical function in attention for learning.
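The Pearce-Hall account invoked in this abstract can be sketched in a few lines of Python: an unsigned prediction error updates a cue's associability, and associability in turn scales value learning. This is a minimal illustrative sketch with assumed parameter values, not the authors' analysis code.

```python
def pearce_hall_update(V, alpha, reward, gamma=0.3, lr=0.5):
    """One Pearce-Hall trial: the unsigned prediction error |r - V|
    drives associability (alpha), which scales value learning."""
    delta = reward - V                                 # signed prediction error
    alpha = (1 - gamma) * alpha + gamma * abs(delta)   # unsigned error updates associability
    V = V + lr * alpha * delta                         # value change scaled by associability
    return V, alpha

# Associability rises after an unexpected change in reward, whether the
# reward is better or worse than predicted, mirroring the unsigned signal
# reported in basolateral amygdala.
V, alpha = 1.0, 0.1
for _ in range(5):                                     # expected reward suddenly omitted
    V, alpha = pearce_hall_update(V, alpha, reward=0.0)
```

Note that associability climbs over several trials of violated expectations, matching the gradual emergence of the neural signal described above.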
202
Matzel LD, Kolata S. Selective attention, working memory, and animal intelligence. Neurosci Biobehav Rev 2010; 34:23-30. [PMID: 19607858] [PMCID: PMC2784289] [DOI: 10.1016/j.neubiorev.2009.07.002]
Abstract
Accumulating evidence indicates that the storage and processing capabilities of the human working memory system co-vary with individuals' performance on a wide range of cognitive tasks. The ubiquitous nature of this relationship suggests that variations in these processes may underlie individual differences in intelligence. Here we briefly review relevant data which supports this view. Furthermore, we emphasize an emerging literature describing a trait in genetically heterogeneous mice that is quantitatively and qualitatively analogous to general intelligence (g) in humans. As in humans, this animal analog of g co-varies with individual differences in both storage and processing components of the working memory system. Absent some of the complications associated with work with human subjects (e.g., phonological processing), this work with laboratory animals has provided an opportunity to assess otherwise intractable hypotheses. For instance, it has been possible in animals to manipulate individual aspects of the working memory system (e.g., selective attention), and to observe causal relationships between these variables and the expression of general cognitive abilities. This work with laboratory animals has coincided with human imaging studies (briefly reviewed here) which suggest that common brain structures (e.g., prefrontal cortex) mediate the efficacy of selective attention and the performance of individuals on intelligence test batteries. In total, this evidence suggests an evolutionary conservation of the processes that co-vary with and/or regulate "intelligence" and provides a framework for promoting these abilities in both young and old animals.
Affiliation(s)
- Louis D Matzel
- Department of Psychology, Program in Behavioral Neuroscience, Rutgers University, Piscataway, NJ 08854, USA.
203
Shibata K, Yamagishi N, Ishii S, Kawato M. Boosting perceptual learning by fake feedback. Vision Res 2009; 49:2574-85. [DOI: 10.1016/j.visres.2009.06.009]
204
Abstract
Perceptual decisions require the brain to weigh noisy evidence from sensory neurons to form categorical judgments that guide behavior. Here we review behavioral and neurophysiological findings suggesting that at least some forms of perceptual learning do not appear to affect the response properties of neurons that represent the sensory evidence. Instead, improved perceptual performance results from changes in how the sensory evidence is selected and weighed to form the decision. We discuss the implications of this idea for possible sites and mechanisms of training-induced improvements in perceptual processing in the brain.
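The claim that improvement comes from reweighting a fixed sensory representation can be illustrated with a toy simulation: the neurons' tuning never changes, and a delta rule adjusts only the readout weights. Everything here (population size, noise level, the delta-rule readout) is an assumption for illustration, not the model reviewed in the article.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 50
sensitivity = np.linspace(-1.0, 1.0, N)    # fixed tuning; never changes

def responses(s):
    """Noisy population response to stimulus category s in {-1, +1}."""
    return sensitivity * s + rng.normal(0.0, 1.0, N)

def accuracy(w, trials=500):
    """Fraction of correct categorical judgments under readout w."""
    hits = 0
    for _ in range(trials):
        s = rng.choice([-1.0, 1.0])
        hits += float(np.sign(w @ responses(s)) == s)
    return hits / trials

w = rng.normal(0.0, 0.1, N)                # random initial readout weights
acc_before = accuracy(w)
for _ in range(2000):                      # "perceptual learning" = reweighting
    s = rng.choice([-1.0, 1.0])
    r = responses(s)
    w += 0.05 * (s - np.tanh(w @ r)) * r / N
acc_after = accuracy(w)
```

Performance improves even though `responses` (the sensory representation) is untouched; all of the learning is in how the evidence is weighed.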
Affiliation(s)
- Chi-Tat Law
- Department of Neuroscience, University of Pennsylvania
205
Abstract
Humans display more conditioned fear when the conditioned stimulus in a fear conditioning paradigm is a picture of an individual from another race than when it is a picture of an individual from their own race (Olsson, Ebert, Banaji, & Phelps, 2005). These results have been interpreted in terms of a genetic "preparedness" to learn to fear individuals from different social groups (Ohman, 2005; Olsson et al., 2005). However, the associability of conditioned stimuli is strongly influenced by prior exposure to those or similar stimuli. Using the Kalman filter, a normative statistical model, this article shows that superior fear conditioning to individuals from other groups is precisely what one would expect if participants perform optimal, Bayesian inference that takes their prior exposures to the different groups into account. There is therefore no need to postulate a genetic preparedness to learn to fear individuals from other races or social groups.
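The Kalman-filter argument can be made concrete in a few lines: the learning rate is the Kalman gain, which shrinks as accumulated exposure reduces uncertainty about a stimulus. The sketch below is a minimal illustration with assumed prior variances standing in for unequal pre-experimental exposure to own-group and other-group faces; it is not the article's fitted model.

```python
def kalman_trial(mean, var, outcome, noise_var=1.0):
    """One Kalman-filter update of a cue's associative weight.
    The gain (effective learning rate) shrinks as the posterior
    variance falls with accumulated experience."""
    gain = var / (var + noise_var)        # Kalman gain
    mean = mean + gain * (outcome - mean)
    var = (1.0 - gain) * var
    return mean, var

# A heavily pre-exposed (low-uncertainty) cue conditions more slowly than
# a relatively novel (high-uncertainty) cue, with no need for a prepared-
# fear module.
familiar = (0.0, 0.2)   # (mean, variance) after extensive prior exposure
novel = (0.0, 2.0)      # little prior exposure
for _ in range(3):      # three shock pairings (outcome = 1)
    familiar = kalman_trial(*familiar, outcome=1.0)
    novel = kalman_trial(*novel, outcome=1.0)
```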
206
Yu AJ, Dayan P, Cohen JD. Dynamics of attentional selection under conflict: toward a rational Bayesian account. J Exp Psychol Hum Percept Perform 2009; 35:700-17. [PMID: 19485686] [DOI: 10.1037/a0013553]
Abstract
The brain exhibits remarkable facility in exerting attentional control in most circumstances, but it also suffers apparent limitations in others. The authors' goal is to construct a rational account for why attentional control appears suboptimal under conditions of conflict and what this implies about the underlying computational principles. The formal framework used is based on Bayesian probability theory, which provides a convenient language for delineating the rationale and dynamics of attentional selection. The authors illustrate these issues with the Eriksen flanker task, a classical paradigm that explores the effects of competing sensory inputs on response tendencies. The authors show how 2 distinctly formulated models, based on compatibility bias and spatial uncertainty principles, can account for the behavioral data. They also suggest novel experiments that may differentiate these models. In addition, they elaborate a simplified model that approximates optimal computation and may map more directly onto the underlying neural machinery. This approximate model uses conflict monitoring, putatively mediated by the anterior cingulate cortex, as a proxy for compatibility representation. The authors also consider how this conflict information might be disseminated and used to control processing.
Affiliation(s)
- Angela J Yu
- Center for the Study of Brain, Mind, and Behavior, Princeton University, USA.
207
Osorio D, Ham AD, Gonda Z, Andrew RJ. Sensory generalization and learning about novel colours by poultry chicks. Q J Exp Psychol (Hove) 2009; 62:1249-56. [PMID: 19235098] [DOI: 10.1080/17470210802671305]
Abstract
In nature animals constantly encounter novel stimuli and need to generalize from known stimuli; they may then learn about the novel stimulus. Hull (1947) proposed that, as they learn, animals distinguish knowledge based on direct experience from inference by generalization, and in support of this view he suggested that if a subject is directly trained to a stimulus, subsequent extinction of responses is slower than when the response is based on generalization. Such an effect is also predicted by Bayesian models that relate the rate of learning to uncertainty in the estimate of stimulus value. We find support for this prediction when chicks learn about a novel colour (orange) whose initial evaluation is based on similarity to known colours (red, yellow). Specifically, if an expected food reward is absent, the rate of extinction of the response to the novel stimulus exceeds that for the familiar colours. Interestingly, the change in relative preference for novel and familiar stimuli occurs after a delay of an hour. This type of delay has not, to our knowledge, been reported in previous studies of single-trial learning, but given the importance of generalization in natural behaviour, this type of learning may have wide relevance.
Affiliation(s)
- Daniel Osorio
- School of Life Sciences, University of Sussex, Brighton, UK.
208
Chapter 5 Highlighting. Psychology of Learning and Motivation 2009. [DOI: 10.1016/s0079-7421(09)51005-5]
209
Behrens TEJ, Hunt LT, Woolrich MW, Rushworth MFS. Associative learning of social value. Nature 2008; 456:245-9. [PMID: 19005555] [PMCID: PMC2605577] [DOI: 10.1038/nature07538]
Abstract
Our decisions are guided by information learnt from our environment. This information may come via personal experiences of reward, but also from the behaviour of social partners. Social learning is widely held to be distinct from other forms of learning in its mechanism and neural implementation; it is often assumed to compete with simpler mechanisms, such as reward-based associative learning, to drive behaviour. Recently, however, neural signals have been observed during social exchange reminiscent of signals seen in associative paradigms. Here, we demonstrate that social information may be acquired using the same associative processes assumed to underlie reward-based learning. We find that key computational variables for learning in the social and reward domains are processed in a similar fashion, but in parallel neural processing streams. Two neighbouring divisions of the anterior cingulate cortex were central to learning about social and reward-based information, and for determining the extent to which each source of information guides behaviour. When making a decision, however, the information learnt using these parallel streams was combined within ventromedial prefrontal cortex. These findings suggest that human social valuation can be realised via the same associative processes previously established for learning other, simpler, features of the environment.
Affiliation(s)
- Timothy E J Behrens
- FMRIB Centre, University of Oxford, John Radcliffe Hospital, Oxford OX3 9DU, UK.
210
Hogarth L, Dickinson A, Janowski M, Nikitina A, Duka T. The role of attentional bias in mediating human drug-seeking behaviour. Psychopharmacology (Berl) 2008; 201:29-41. [PMID: 18679657] [DOI: 10.1007/s00213-008-1244-2]
Abstract
RATIONALE: The attentional bias for drug cues is believed to be a causal cognitive process mediating human drug seeking and relapse.
OBJECTIVES, METHODS AND RESULTS: To test this claim, we trained smokers on a tobacco conditioning procedure in which the conditioned stimulus (or S+) acquired parallel control of an attentional bias (measured with an eye tracker), tobacco expectancy and instrumental tobacco-seeking behaviour. Although this correlation between measures may be regarded as consistent with the claim that the attentional bias for the S+ mediated tobacco seeking, when a secondary task was added in the test phase, the attentional bias for the S+ was abolished, yet the control of tobacco expectancy and tobacco seeking remained intact.
CONCLUSIONS: This dissociation suggests that the attentional bias for drug cues is not necessary for the control that drug cues exert over drug-seeking behaviour. The question raised by these data is what function the attentional bias serves if it does not mediate drug seeking.
Affiliation(s)
- Lee Hogarth
- School of Psychology, University of Nottingham, University Park, Nottingham, UK.
211
Hogarth L, Dickinson A, Austin A, Brown C, Duka T. Attention and expectation in human predictive learning: The role of uncertainty. Q J Exp Psychol (Hove) 2008; 61:1658-68. [DOI: 10.1080/17470210701643439]
Abstract
Three localized, visual pattern stimuli were trained as predictive signals of auditory outcomes. One signal partially predicted an aversive noise in Experiment 1 and a neutral tone in Experiment 2, whereas the other signals consistently predicted either the occurrence or absence of the noise. The expectation of the noise was measured during each signal presentation, and only participants for whom this expectation demonstrated contingency knowledge showed differential attention to the signals. Importantly, when attention was measured by visual fixations, the contingency-aware group attended more to the partially predictive signal than to the consistent predictors in both experiments. This profile of visual attention supports the Pearce and Hall (1980) theory of the role of attention in associative learning.
Affiliation(s)
- Alison Austin
- School of Life Sciences, University of Sussex, Brighton, UK
- Craig Brown
- School of Life Sciences, University of Sussex, Brighton, UK
- Theodora Duka
- School of Life Sciences, University of Sussex, Brighton, UK
212
Neural correlates, computation and behavioural impact of decision confidence. Nature 2008; 455:227-31. [PMID: 18690210] [DOI: 10.1038/nature07200]
213
Redish AD, Jensen S, Johnson A. A unified framework for addiction: vulnerabilities in the decision process. Behav Brain Sci 2008; 31:415-37; discussion 437-87. [PMID: 18662461] [PMCID: PMC3774323] [DOI: 10.1017/s0140525x0800472x]
Abstract
The understanding of decision-making systems has come together in recent years to form a unified theory of decision-making in the mammalian brain as arising from multiple, interacting systems (a planning system, a habit system, and a situation-recognition system). This unified decision-making system has multiple potential access points through which it can be driven to make maladaptive choices, particularly choices that entail seeking of certain drugs or behaviors. We identify 10 key vulnerabilities in the system: (1) moving away from homeostasis, (2) changing allostatic set points, (3) euphorigenic "reward-like" signals, (4) overvaluation in the planning system, (5) incorrect search of situation-action-outcome relationships, (6) misclassification of situations, (7) overvaluation in the habit system, (8) a mismatch in the balance of the two decision systems, (9) over-fast discounting processes, and (10) changed learning rates. These vulnerabilities provide a taxonomy of potential problems with decision-making systems. Although each vulnerability can drive an agent to return to the addictive choice, each vulnerability also implies a characteristic symptomology. Different drugs, different behaviors, and different individuals are likely to access different vulnerabilities. This has implications for an individual's susceptibility to addiction and the transition to addiction, for the potential for relapse, and for the potential for treatment.
Affiliation(s)
- A. David Redish
- Department of Neuroscience, University of Minnesota, Minneapolis, MN 55455. http://umn.edu/~redish/
- Steve Jensen
- Graduate Program in Computer Science, University of Minnesota, Minneapolis, MN 55455
- Adam Johnson
- Graduate Program in Neuroscience and Center for Cognitive Sciences, University of Minnesota, Minneapolis, MN 55455
214
Brown CA, Seymour B, Boyle Y, El-Deredy W, Jones AKP. Modulation of pain ratings by expectation and uncertainty: Behavioral characteristics and anticipatory neural correlates. Pain 2008; 135:240-250. [PMID: 17614199] [DOI: 10.1016/j.pain.2007.05.022]
Abstract
Expectations about the magnitude of impending pain exert a substantial effect on subsequent perception. However, the neural mechanisms that underlie the predictive processes that modulate pain are poorly understood. In a combined behavioral and high-density electrophysiological study we measured anticipatory neural responses to heat stimuli to determine how predictions of pain intensity, and certainty about those predictions, modulate brain activity and subjective pain ratings. Prior to receiving randomized laser heat stimuli at different intensities (low, medium or high) subjects (n=15) viewed cues that either accurately informed them of forthcoming intensity (certain expectation) or not (uncertain expectation). Pain ratings were biased towards prior expectations of either high or low intensity. Anticipatory neural responses increased with expectations of painful vs. non-painful heat intensity, suggesting the presence of neural responses that represent predicted heat stimulus intensity. These anticipatory responses also correlated with the amplitude of the Laser-Evoked Potential (LEP) response to painful stimuli when the intensity was predictable. Source analysis (LORETA) revealed that uncertainty about expected heat intensity involves an anticipatory cortical network commonly associated with attention (left dorsolateral prefrontal, posterior cingulate and bilateral inferior parietal cortices). Relative certainty, however, involves cortical areas previously associated with semantic and prospective memory (left inferior frontal and inferior temporal cortex, and right anterior prefrontal cortex). This suggests that biasing of pain reports and LEPs by expectation involves temporally precise activity in specific cortical networks.
Affiliation(s)
- Christopher A Brown
- Human Pain Research Group, Clinical Sciences Building, Hope Hospital, Salford M6 8HD, United Kingdom; Wellcome Department of Imaging Neuroscience, Functional Imaging Laboratory, 12 Queen Square, London WC1N 3BG, United Kingdom; School of Psychological Sciences, University of Manchester, Zochonis Building, Oxford Road, Manchester M13 9PL, United Kingdom
215
Rushworth MFS, Behrens TEJ. Choice, uncertainty and value in prefrontal and cingulate cortex. Nat Neurosci 2008; 11:389-97. [PMID: 18368045] [DOI: 10.1038/nn2066]
Abstract
Reinforcement learning models that focus on the striatum and dopamine can predict the choices of animals and people. Representations of reward expectation and of reward prediction errors that are pertinent to decision making, however, are not confined to these regions but are also found in prefrontal and cingulate cortex. Moreover, decisions are not guided solely by the magnitude of the reward that is expected. Uncertainty in the estimate of the reward expectation, the value of information that might be gained by taking a course of action and the cost of an action all influence the manner in which decisions are made through prefrontal and cingulate cortex.
Affiliation(s)
- Matthew F S Rushworth
- Department of Experimental Psychology and Centre for Functional MRI of the Brain, John Radcliffe Hospital, University of Oxford, South Parks Road, Oxford OX1 3UD, UK.
216
Temporal association between food distribution and human caregiver presence and the development of affinity to humans in lambs. Dev Psychobiol 2008; 50:147-59. [DOI: 10.1002/dev.20254]
217
Hester R, Barre N, Murphy K, Silk TJ, Mattingley JB. Human Medial Frontal Cortex Activity Predicts Learning from Errors. Cereb Cortex 2007; 18:1933-40. [DOI: 10.1093/cercor/bhm219]
218
219
Behrens TEJ, Woolrich MW, Walton ME, Rushworth MFS. Learning the value of information in an uncertain world. Nat Neurosci 2007; 10:1214-21. [PMID: 17676057] [DOI: 10.1038/nn1954]
Abstract
Our decisions are guided by outcomes that are associated with decisions made in the past. However, the amount of influence each past outcome has on our next decision remains unclear. To ensure optimal decision-making, the weight given to decision outcomes should reflect their salience in predicting future outcomes, and this salience should be modulated by the volatility of the reward environment. We show that human subjects assess volatility in an optimal manner and adjust decision-making accordingly. This optimal estimate of volatility is reflected in the fMRI signal in the anterior cingulate cortex (ACC) when each trial outcome is observed. When a new piece of information is witnessed, activity levels reflect its salience for predicting future outcomes. Furthermore, variations in this ACC signal across the population predict variations in subject learning rates. Our results provide a formal account of how we weigh our different experiences in guiding our future actions.
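The paper's key computational point, that the learning rate should track environmental volatility, can be caricatured in a few lines. The sketch below is not the authors' hierarchical Bayesian learner; it is a crude assumed stand-in that keeps a running estimate of surprise and lets that estimate set the learning rate, reproducing only the qualitative behaviour the fMRI results support. All parameter values are invented for illustration.

```python
def adaptive_learner(outcomes, k=0.1, lr_min=0.05, lr_max=0.8):
    """Track reward probability p with a learning rate that rises
    when recent outcomes are surprising (volatile environment) and
    falls when they are well predicted (stable environment)."""
    p, volatility, rates = 0.5, 0.0, []
    for o in outcomes:
        surprise = abs(o - p)
        volatility += k * (surprise - volatility)     # running surprise estimate
        lr = lr_min + (lr_max - lr_min) * volatility  # volatility sets learning rate
        p += lr * (o - p)
        rates.append(lr)
    return p, rates

# Stable block (reward always delivered) followed by a volatile block
# (reward alternates): the effective learning rate should be higher in
# the volatile block, as observed in subjects' behaviour.
stable = [1.0] * 40
volatile = [1.0, 0.0] * 20
_, rates = adaptive_learner(stable + volatile)
```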
Affiliation(s)
- Timothy E J Behrens
- FMRIB Centre, University of Oxford, John Radcliffe Hospital, Oxford OX3 9DU, UK.
220
Pauli WM, O'Reilly RC. Attentional control of associative learning: a possible role of the central cholinergic system. Brain Res 2007; 1202:43-53. [PMID: 17870060] [PMCID: PMC3010195] [DOI: 10.1016/j.brainres.2007.06.097]
Abstract
How does attention interact with learning? Kruschke [Kruschke, J.K. (2001). Toward a unified Model of Attention in Associative Learning. J. Math. Psychol. 45, 812-863.] proposed a model (EXIT) that captures Mackintosh's [Mackintosh, N.J. (1975). A theory of attention: Variations in the associability of stimuli with reinforcement. Psychological Review, 82(4), 276-298.] framework for attentional modulation of associative learning. We developed a computational model that showed analogous interactions between selective attention and associative learning, but is significantly simplified and, in contrast to EXIT, is motivated by neurophysiological findings. Competition among input representations in the internal representation layer, which increases the contrast between stimuli, is critical for simulating these interactions in human behavior. Furthermore, this competition is modulated in a way that might be consistent with the phasic activation of the central cholinergic system, which modulates activity in sensory cortices. Specifically, phasic increases in acetylcholine can cause increased excitability of both pyramidal excitatory neurons in cortical layers II/III and cortical GABAergic inhibitory interneurons targeting the same pyramidal neurons. These effects result in increased attentional contrast in our model. This model thus represents an initial attempt to link human attentional learning data with underlying neural substrates.
Affiliation(s)
- Wolfgang M Pauli
- Department of Psychology, Muenzinger Psychology Building, University of Colorado Boulder, 345 UCB, Boulder, CO 80309-0345, USA.
221
Jaramillo S, Pearlmutter BA. Optimal coding predicts attentional modulation of activity in neural systems. Neural Comput 2007; 19:1295-312. [PMID: 17381267] [DOI: 10.1162/neco.2007.19.5.1295]
Abstract
Neuronal activity in response to a fixed stimulus has been shown to change as a function of attentional state, implying that the neural code also changes with attention. We propose an information-theoretic account of such modulation: that the nervous system adapts to optimally encode sensory stimuli while taking into account the changing relevance of different features. We show using computer simulation that such modulation emerges in a coding system informed about the uneven relevance of the input features. We present a simple feedforward model that learns a covert attention mechanism, given input patterns and coding fidelity requirements. After optimization, the system gains the ability to reorganize its computational resources (and coding strategy) depending on the incoming attentional signal, without the need of multiplicative interaction or explicit gating mechanisms between units. The modulation of activity for different attentional states matches that observed in a variety of selective attention experiments. This model predicts that the shape of the attentional modulation function can be strongly stimulus dependent. The general principle presented here accounts for attentional modulation of neural activity without relying on special-purpose architectural mechanisms dedicated to attention. This principle applies to different attentional goals, and its implications are relevant for all modalities in which attentional phenomena are observed.
222
Corlett PR, Honey GD, Fletcher PC. From prediction error to psychosis: ketamine as a pharmacological model of delusions. J Psychopharmacol 2007; 21:238-52. [PMID: 17591652] [DOI: 10.1177/0269881107077716]
Abstract
Recent cognitive neuropsychiatric models of psychosis emphasize the role of attentional disturbances and inappropriate incentive learning in the development of delusions. These models highlight a pre-psychotic period in which the patient experiences perceptual and attentional disruptions. Irrelevant details and numerous associations between stimuli, thoughts and percepts are imbued with inappropriate significance and the attempt to rationalize and account for these bizarre experiences results in the formation of delusions. The present paper discusses delusion formation in terms of basic associative learning processes. Such processes are driven by prediction error signals. Prediction error refers to mismatches between an organism's expectation in a given environment and what actually happens and it is signalled by both dopaminergic and glutamatergic mechanisms. Disruption of these neurobiological systems may underlie delusion formation. We review similarities between acute psychosis and the psychotic state induced by the NMDA receptor antagonist drug ketamine, which impacts upon both dopaminergic and glutamatergic function. We conclude by suggesting that ketamine may provide an appropriate model to investigate the formative stages of symptom evolution in schizophrenia, and thereby provide a window into the earliest and otherwise inaccessible aspects of the disease process.
Affiliation(s)
- P R Corlett
- Brain Mapping Unit, Department of Psychiatry, University of Cambridge, School of Clinical Medicine, Addenbrooke's Hospital, Hills Road, Cambridge, UK
223
Lapish CC, Kroener S, Durstewitz D, Lavin A, Seamans JK. The ability of the mesocortical dopamine system to operate in distinct temporal modes. Psychopharmacology (Berl) 2007; 191:609-25. [PMID: 17086392] [PMCID: PMC5509053] [DOI: 10.1007/s00213-006-0527-8]
Abstract
BACKGROUND: This review discusses evidence that cells in the mesocortical dopamine (DA) system influence information processing in target areas across three distinct temporal domains.
DISCUSSION: Phasic bursting of midbrain DA neurons may provide temporally precise information about the mismatch between expected and actual rewards (prediction errors) that has been hypothesized to serve as a learning signal in efferent regions. However, because DA acts as a relatively slow modulator of cortical neurotransmission, it is unclear whether DA can indeed act to precisely transmit prediction errors to prefrontal cortex (PFC). In light of recent physiological and anatomical evidence, we propose that corelease of glutamate from DA and/or non-DA neurons in the VTA could serve to transmit this temporally precise signal. In contrast, DA acts in a protracted manner to provide spatially and temporally diffuse modulation of PFC pyramidal neurons and interneurons. This modulation occurs first via a relatively rapid depolarization of fast-spiking interneurons that acts on the order of seconds. This is followed by a more protracted modulation of a variety of other ionic currents on timescales of minutes to hours, which may bias the manner in which cortical networks process information. However, the prolonged actions of DA may be curtailed by counteracting influences, which likely include opposing actions at D1- and D2-like receptors that have been shown to be time- and concentration-dependent. In this way, the mesocortical DA system optimizes the characteristics of glutamate, GABA, and DA neurotransmission both within the midbrain and cortex to communicate temporally precise information and to modulate network activity patterns on prolonged timescales.
Affiliation(s)
- Christopher C Lapish
- Department of Neurosciences, Medical University of South Carolina, Suite 430 BSB 173 Ashley, Charleston, SC, USA.
224
Bogacz R, McClure SM, Li J, Cohen JD, Montague PR. Short-term memory traces for action bias in human reinforcement learning. Brain Res 2007; 1153:111-21. [PMID: 17459346] [DOI: 10.1016/j.brainres.2007.03.057]
Abstract
Recent experimental and theoretical work on reinforcement learning has shed light on the neural bases of learning from rewards and punishments. One fundamental problem in reinforcement learning is the credit assignment problem, or how to properly assign credit to actions that lead to reward or punishment following a delay. Temporal difference learning solves this problem, but its efficiency can be significantly improved by the addition of eligibility traces (ET). In essence, ETs function as decaying memories of previous choices that are used to scale synaptic weight changes. It has been shown in theoretical studies that ETs spanning a number of actions may improve the performance of reinforcement learning. However, it remains an open question whether including ETs that persist over sequences of actions allows reinforcement learning models to better fit empirical data regarding the behaviors of humans and other animals. Here, we report an experiment in which human subjects performed a sequential economic decision game in which the long-term optimal strategy differed from the strategy that leads to the greatest short-term return. We demonstrate that human subjects' performance in the task is significantly affected by the time between choices in a surprising and seemingly counterintuitive way. However, this behavior is naturally explained by a temporal difference learning model which includes ETs persisting across actions. Furthermore, we review recent findings that suggest that short-term synaptic plasticity in dopamine neurons may provide a realistic biophysical mechanism for producing ETs that persist on a timescale consistent with behavioral observations.
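The credit-assignment role of eligibility traces described above can be sketched in a few lines. This is a generic TD(λ) value update, not the authors' fitted model, and the parameter values are illustrative:

```python
import numpy as np

def td_lambda_step(V, z, s, s_next, r, alpha=0.1, gamma=0.95, lam=0.8):
    """One TD(lambda) update. The eligibility trace z is a decaying memory of
    recently visited states; scaling the weight change by z lets a reward
    assign credit to choices made several steps earlier."""
    delta = r + gamma * V[s_next] - V[s]   # TD prediction error
    z = gamma * lam * z                    # all traces decay each step...
    z[s] += 1.0                            # ...and the current state's is refreshed
    V = V + alpha * delta * z              # every recently visited state shares credit
    return V, z
```

In a two-step chain ending in reward, the trace lets the value of the first state rise on the very trial the reward arrives, rather than only after the intermediate state's value has been learned.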
Affiliation(s)
- Rafal Bogacz
- Center for the Study of Brain, Mind and Behavior, Princeton University, Princeton, NJ 08544, USA.
225
Kolata S, Light K, Grossman HC, Hale G, Matzel LD. Selective attention is a primary determinant of the relationship between working memory and general learning ability in outbred mice. Learn Mem 2007; 14:22-8. [PMID: 17272650] [PMCID: PMC1838542] [DOI: 10.1101/lm.408507]
Abstract
A single factor (i.e., general intelligence) can account for much of an individual's performance across a wide variety of cognitive tests. However, despite this factor's robustness, the underlying process is still a matter of debate. To address this question, we developed a novel battery of learning tasks to assess the general learning abilities (GLAs) of mice. Using this battery, we previously reported a strong relationship between GLA and a task designed to tax working memory capacity (i.e., resistance to competing demands). Here we further explored this relationship by investigating which aspects of working memory (storage or processing) best predict GLA in mice. We found that one component of working memory, selective attention, correlated with GLA as strongly as working memory capacity did. However, this relationship was not found for two other components of working memory, short-term memory capacity and duration. These results provide further evidence that variations in aspects of working memory and executive functions covary with general cognitive abilities.
Affiliation(s)
- Stefan Kolata
- Department of Psychology, Rutgers University, Piscataway, New Jersey 08854, USA
- Kenneth Light
- Department of Psychology, Rutgers University, Piscataway, New Jersey 08854, USA
- Henya C. Grossman
- Department of Psychology, Rutgers University, Piscataway, New Jersey 08854, USA
- Gregory Hale
- Department of Psychology, Rutgers University, Piscataway, New Jersey 08854, USA
- Louis D. Matzel
- Department of Psychology, Rutgers University, Piscataway, New Jersey 08854, USA
- Corresponding author. E-mail ; fax (732) 445-2263
226
Smith AC, Wirth S, Suzuki WA, Brown EN. Bayesian analysis of interleaved learning and response bias in behavioral experiments. J Neurophysiol 2006; 97:2516-24. [PMID: 17182907] [DOI: 10.1152/jn.00946.2006]
Abstract
Accurate characterizations of behavior during learning experiments are essential for understanding the neural bases of learning. Although learning experiments often give subjects multiple tasks to learn simultaneously, most analyses treat subject performance on each task separately. This strategy ignores the true interleaved presentation order of the tasks and cannot distinguish learning behavior from response preferences that may represent a subject's biases or strategies. We present a Bayesian analysis of a state-space model for characterizing simultaneous learning of multiple tasks and for assessing behavioral biases in learning experiments with interleaved task presentations. Under the Bayesian analysis the posterior probability densities of the model parameters and the learning state are computed using Markov chain Monte Carlo methods. Measures of learning, including the learning curve, the ideal observer curve, and the learning trial, translate directly from our previous likelihood-based state-space model analyses. We compare the Bayesian and current likelihood-based approaches in the analysis of a simulated conditioned T-maze task and of an actual object-place association task. Modeling the interleaved learning feature of the experiments along with the animal's response sequences allows us to disambiguate actual learning from response biases. Implementing the Bayesian analysis in the WinBUGS software provides an efficient way to test different models without developing a new algorithm for each model. The new state-space model and the Bayesian estimation procedure suggest an improved, computationally efficient approach for accurately characterizing learning in behavioral experiments.
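The class of model analyzed above can be made concrete with a small generative sketch. This is a generic Gaussian random-walk learning state with Bernoulli outcomes, assumed here for illustration only; the authors' actual model and its WinBUGS implementation differ in detail:

```python
import numpy as np

def simulate_learning(n_trials=300, sigma=0.08, x0=-2.0, seed=0):
    """Generative state-space learning model: a latent learning state x follows
    a Gaussian random walk, and each trial's correct/incorrect outcome is
    Bernoulli with p = logistic(x). Estimating x from the outcomes y (e.g. by
    MCMC) yields the learning curve."""
    rng = np.random.default_rng(seed)
    x = x0 + np.cumsum(rng.normal(0.0, sigma, n_trials))  # latent learning state
    p = 1.0 / (1.0 + np.exp(-x))                          # P(correct) on each trial
    y = (rng.random(n_trials) < p).astype(int)            # observed responses
    return x, p, y
```

Interleaving several such state processes, one per task, while sharing a response-bias term is the feature that lets the analysis separate learning from bias.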
Affiliation(s)
- Anne C Smith
- Department of Anesthesiology and Pain Medicine, TB-170, One Shields Ave., University of California, Davis, CA 95616, USA.
227
Boudreau CE, Williford TH, Maunsell JHR. Effects of task difficulty and target likelihood in area V4 of macaque monkeys. J Neurophysiol 2006; 96:2377-87. [PMID: 16855106] [DOI: 10.1152/jn.01072.2005]
Abstract
Spatial attention improves performance at attended locations and correspondingly modulates firing rates of cortical neurons. The size of these behavioral and neuronal effects depends on the difficulty of the task performed at the attended location. Psychological theorists have attributed this to a tighter focus of a fixed amount of processing resource at the attended location, but the effects of task difficulty on the distribution of neuronal effects of attention across the visual field have not been fully explored. We trained rhesus monkeys to do a detection task in which difficulty and spatial attention were manipulated independently. Probe stimuli were used to measure behavioral performance in different conditions of attention and difficulty. Animals performed better at attended locations and this advantage increased with difficulty, consistent with data from human psychophysics. Neuronal modulation by spatial attention was larger with greater difficulty. In two animals, increasing difficulty caused a modest increase in neuronal responses to visual stimuli regardless of the locus of spatial attention. In a third animal, which was previously trained to ignore multiple distracting stimuli, increasing task difficulty increased responses at the focus of attention and suppressed responses away from the focus of attention. The results show that difficulty can modulate effects of spatial attention in V4; it can alter the distribution of sensory responses across the visual scene in ways that may depend on the subject's behavioral strategy.
Affiliation(s)
- C Elizabeth Boudreau
- Howard Hughes Medical Institute, Baylor College of Medicine, Houston, Texas, USA
228
Abstract
To make a decision, a system must assign value to each of its available choices. In the human brain, one approach to studying valuation has used rewarding stimuli to map out brain responses by varying the dimension or importance of the rewards. However, theoretical models have taught us that value computations are complex, and so reward probes alone can give only partial information about neural responses related to valuation. In recent years, computationally principled models of value learning have been used in conjunction with noninvasive neuroimaging to tease out neural valuation responses related to reward-learning and decision-making. We restrict our review to the role of these models in a new generation of experiments that seeks to build on a now-large body of diverse reward-related brain responses. We show that the models and the measurements based on them point the way forward in two important directions: the valuation of time and the valuation of fictive experience.
Affiliation(s)
- P Read Montague
- Department of Neuroscience, Baylor College of Medicine, Houston, Texas 77030, USA
229
Dayan P, Niv Y, Seymour B, Daw ND. The misbehavior of value and the discipline of the will. Neural Netw 2006; 19:1153-60. [PMID: 16938432] [DOI: 10.1016/j.neunet.2006.03.002]
Abstract
Most reinforcement learning models of animal conditioning operate under the convenient, though fictive, assumption that Pavlovian conditioning concerns prediction learning whereas instrumental conditioning concerns action learning. However, it is only through Pavlovian responses that Pavlovian prediction learning is evident, and these responses can act against the instrumental interests of the subjects. This can be seen in both experimental and natural circumstances. In this paper we study the consequences of importing this competition into a reinforcement learning context, and demonstrate the resulting effects in an omission schedule and a maze navigation task. The misbehavior created by Pavlovian values can be quite debilitating; we discuss how it may be disciplined.
Affiliation(s)
- Peter Dayan
- Gatsby Computational Neuroscience Unit, UCL, 17 Queen Square, London, UK.
230
Doeller CF, Opitz B, Krick CM, Mecklinger A, Reith W. Differential hippocampal and prefrontal-striatal contributions to instance-based and rule-based learning. Neuroimage 2006; 31:1802-16. [PMID: 16563803] [DOI: 10.1016/j.neuroimage.2006.02.006]
Abstract
It is a topic of current interest whether learning in humans relies on the acquisition of abstract rule knowledge (rule-based learning) or whether it depends on superficial item-specific information (instance-based learning). Here, we identified brain regions that mediate either of the two learning mechanisms by combining fMRI with an experimental protocol shown to be able to dissociate the two. Subjects had to learn object-position conjunctions over several trials and blocks. In a learning condition, either objects (Experiment 1) or positions (Experiment 2) were held constant within blocks. In contrast to a control condition in which object-position conjunctions were trial-unique, a performance increase within and across blocks was observed in the learning condition of both experiments. We hypothesized that within-block learning mainly relies on instance-based processes, whereas across-block learning might depend on rule-based mechanisms. A within-block parametric fMRI analysis revealed a learning-related increase of lateral prefrontal and striatal activity and a learning-related decrease of hippocampal activity in both experiments. By contrast, across-block learning was associated with an activation modulation in distinct prefrontal-striatal brain regions, but not in the hippocampus. These data indicate that hippocampal and prefrontal-striatal brain regions differentially contribute to instance-based and rule-based learning.
231
Chater N, Tenenbaum JB, Yuille A. Probabilistic models of cognition: conceptual foundations. Trends Cogn Sci 2006; 10:287-91. [PMID: 16807064] [DOI: 10.1016/j.tics.2006.05.007]
Abstract
Remarkable progress in the mathematics and computer science of probability has led to a revolution in the scope of probabilistic models. In particular, 'sophisticated' probabilistic methods apply to structured relational systems such as graphs and grammars, of immediate relevance to the cognitive sciences. This Special Issue outlines progress in this rapidly developing field, which provides a potentially unifying perspective across a wide range of domains and levels of explanation. Here, we introduce the historical and conceptual foundations of the approach, explore how the approach relates to studies of explicit probabilistic reasoning, and give a brief overview of the field as it stands today.
232
Courville AC, Daw ND, Touretzky DS. Bayesian theories of conditioning in a changing world. Trends Cogn Sci 2006; 10:294-300. [PMID: 16793323] [DOI: 10.1016/j.tics.2006.05.004]
Abstract
The recent flowering of Bayesian approaches invites the re-examination of classic issues in behavior, even in areas as venerable as Pavlovian conditioning. A statistical account can offer a new, principled interpretation of behavior, and previous experiments and theories can inform many unexplored aspects of the Bayesian enterprise. Here we consider one such issue: the finding that surprising events provoke animals to learn faster. We suggest that, in a statistical account of conditioning, surprise signals change and therefore uncertainty and the need for new learning. We discuss inference in a world that changes and show how experimental results involving surprise can be interpreted from this perspective, and also how, thus understood, these phenomena help constrain statistical theories of animal and human learning.
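The classic mechanism being reinterpreted here, surprise speeding subsequent learning, is the Pearce-Hall associability rule. A minimal sketch, with illustrative parameter values, not the authors' Bayesian formulation:

```python
def pearce_hall_step(V, alpha, r, eta=0.3, kappa=0.5):
    """One Pearce-Hall-style trial: the associability alpha (attention to the
    cue) tracks the recent unsigned prediction error, so a surprising outcome
    raises the effective learning rate on the trials that follow."""
    delta = r - V                                  # signed prediction error
    V = V + kappa * alpha * delta                  # learning scaled by associability
    alpha = eta * abs(delta) + (1 - eta) * alpha   # unsigned surprise drives attention
    return V, alpha
```

On the Bayesian reading sketched in the abstract, the rise in alpha after surprise plays the role of increased posterior uncertainty about a world that has changed.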
Affiliation(s)
- Aaron C Courville
- Département d'Informatique et de Recherche Opérationnelle, Université de Montréal, Montréal, QC H3C 3J7, Canada.
233
Hogarth L, Dickinson A, Hutton SB, Elbers N, Duka T. Drug expectancy is necessary for stimulus control of human attention, instrumental drug-seeking behaviour and subjective pleasure. Psychopharmacology (Berl) 2006; 185:495-504. [PMID: 16547713] [DOI: 10.1007/s00213-005-0287-x]
Abstract
BACKGROUND It has been suggested that drug-paired stimuli (S+) control addictive behaviour by eliciting an explicit mental representation or expectation of drug availability. AIMS The aim of the present study was to test this hypothesis by determining whether the behavioural control exerted by a tobacco-paired S+ in human smokers depends upon the S+ eliciting an explicit expectation of tobacco. DESIGN On each trial, human smokers (n=16) were presented with stimuli, and attention to these stimuli was measured with an eye tracker. Participants then reported their cigarette reward expectancy before performing, or withholding, an instrumental tobacco-seeking response that was rewarded with cigarette gains if the S+ had been presented and punished with cigarette losses if the S- had been presented. Following training, participants rated the pleasantness of the stimuli. RESULTS The S+ brought about conditioned behaviour only in an aware group (those who expected the cigarette reward outcome when presented with the S+). This aware group allocated attention to the S+, performed the instrumental tobacco-seeking response selectively in the presence of the S+ and rated the S+ as pleasant. No conditioned behaviour was seen in the unaware group (those who did not expect the cigarette reward outcome in the presence of the S+). CONCLUSIONS Drug-paired stimuli control selective attention, instrumental drug-seeking behaviour and positive emotional state by eliciting an explicit expectation of drug availability.
Affiliation(s)
- Lee Hogarth
- Laboratory of Experimental Psychology, School of Life Sciences, University of Sussex, Falmer, Brighton, BN1 9QG, UK
234
Suzuki WA, Brown EN. Behavioral and neurophysiological analyses of dynamic learning processes. Behav Cogn Neurosci Rev 2006; 4:67-95. [PMID: 16251726] [DOI: 10.1177/1534582305280030]
Abstract
In this article, the authors address two topics relevant to the study of the brain basis of associative learning. In Part 1, they compare and contrast the patterns and time course of dynamic learning-related neural activity that have been reported in the medial temporal lobe, premotor cortex, prefrontal cortex, and striatum during various associative learning tasks. In Part 2, they examine the statistical methodologies that have been used to analyze both behavioral learning and learning-related neural activity. They describe a state-space model of behavioral learning that provides accurate estimates of dynamic learning processes and a point-process filter algorithm that tracks the dynamic changes in neural activity on a millisecond time scale. Future challenges for these statistical methodologies and their application to the study of the brain basis of associative learning are discussed.
235
Hogarth L, Duka T. Human nicotine conditioning requires explicit contingency knowledge: is addictive behaviour cognitively mediated? Psychopharmacology (Berl) 2006; 184:553-66. [PMID: 16175406] [DOI: 10.1007/s00213-005-0150-0]
Abstract
RATIONALE Two seemingly contrary theories describe the learning mechanisms that mediate human addictive behaviour. According to the classical incentive theories of addiction, addictive behaviour is motivated by a Pavlovian conditioned appetitive emotional response elicited by drug-paired stimuli. Expectancy theory, on the other hand, argues that addictive behaviour is mediated by an expectancy of the drug imparted by cognitive knowledge of the Pavlovian (predictive) contingency between stimuli (S+) and the drug and of the instrumental (causal) contingency between instrumental behaviour and the drug. AIMS AND METHOD The present paper reviewed human-nicotine-conditioning studies to assess the role of appetitive emotional conditioning and explicit contingency knowledge in mediating addictive behaviour. RESULTS The studies reviewed here provided evidence for both the emotional conditioning and the expectancy accounts. The first source of evidence is that nicotine-paired S+ elicit an appetitive emotional conditioned response (CR), albeit only in participants who expect nicotine. Furthermore, the magnitude of this emotional state is modulated by nicotine deprivation/satiation. However, the causal status of the emotional response in driving other forms of conditioned behaviour remains undemonstrated. The second source of evidence is that other nicotine CRs, including physiological responses, self-administration, attentional bias and subjective craving, are also dependent on participants possessing explicit knowledge of the Pavlovian contingencies arranged in the experiment. In addition, several of the nicotine CRs can be brought about or modified by instructed contingency knowledge, demonstrating the causal status of this knowledge. CONCLUSIONS Collectively, these data suggest that human nicotine conditioned effects are mediated by an explicit expectancy of the drug coupled with an appetitive emotional response that reflects the positive biological value of the drug. The implication of this conclusion is that treatments designed to modify the expected value of the drug may prove effective.
Affiliation(s)
- Lee Hogarth
- Laboratory of Experimental Psychology, School of Life Sciences, University of Sussex, Falmer, Brighton, BN1 9QG, UK
236
Haruno M, Kawato M. Different neural correlates of reward expectation and reward expectation error in the putamen and caudate nucleus during stimulus-action-reward association learning. J Neurophysiol 2006; 95:948-59. [PMID: 16192338] [DOI: 10.1152/jn.00382.2005]
Abstract
To select appropriate behaviors leading to rewards, the brain needs to learn associations among sensory stimuli, selected behaviors, and rewards. Recent imaging and neural-recording studies have revealed that the dorsal striatum plays an important role in learning such stimulus-action-reward associations. However, the putamen and caudate nucleus are embedded in distinct cortico-striatal loop circuits, predominantly connected to motor-related cerebral cortical areas and frontal association areas, respectively. This difference in their cortical connections suggests that the putamen and caudate nucleus are engaged in different functional aspects of stimulus-action-reward association learning. To determine whether this is the case, we conducted an event-related and computational model–based functional MRI (fMRI) study with a stochastic decision-making task in which a stimulus-action-reward association must be learned. A simple reinforcement learning model not only reproduced the subject's action selections reasonably well but also allowed us to quantitatively estimate each subject's temporal profiles of stimulus-action-reward association and reward-prediction error during learning trials. These two internal representations were used in the fMRI correlation analysis. The results revealed that neural correlates of the stimulus-action-reward association reside in the putamen, whereas a correlation with reward-prediction error was found largely in the caudate nucleus and ventral striatum. These nonuniform spatiotemporal distributions of neural correlates within the dorsal striatum were maintained consistently at various levels of task difficulty, suggesting a functional difference in the dorsal striatum between the putamen and caudate nucleus during stimulus-action-reward association learning.
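A minimal sketch of how such trial-by-trial fMRI regressors are obtained from a simple reinforcement-learning model. This is a generic Q-learning update with an illustrative learning rate, not the authors' fitted model:

```python
import numpy as np

def model_regressors(stimuli, actions, rewards, n_stim=3, n_act=2, alpha=0.3):
    """Replay a subject's choices through a Q-learning model and record, for
    each trial, the learned stimulus-action value (association strength) and
    the reward-prediction error -- the two internal variables correlated here
    with putamen and caudate activity, respectively."""
    Q = np.zeros((n_stim, n_act))
    values, rpes = [], []
    for s, a, r in zip(stimuli, actions, rewards):
        values.append(Q[s, a])       # expected reward before the outcome
        delta = r - Q[s, a]          # reward-prediction error
        rpes.append(delta)
        Q[s, a] += alpha * delta     # incremental association update
    return np.array(values), np.array(rpes)
```

The two returned series, convolved with a hemodynamic response function, serve as parametric regressors in the correlation analysis.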
Affiliation(s)
- Masahiko Haruno
- Department of Cognitive Neuroscience Computational Neuroscience Labs, Advanced Telecommunication Research Institute, Sorakugun, Kyoto 619-0288, Japan.
237
Abstract
In multiple-cue learning (also known as probabilistic category learning) people acquire information about cue-outcome relations and combine these into predictions or judgments. Previous researchers claimed that people can achieve high levels of performance without explicit knowledge of the task structure or insight into their own judgment policies. It has also been argued that people use a variety of suboptimal strategies to solve such tasks. In three experiments the authors reexamined these conclusions by introducing novel measures of task knowledge and self-insight and using "rolling regression" methods to analyze individual learning. Participants successfully learned a four-cue probabilistic environment and showed accurate knowledge of both the task structure and their own judgment processes. Learning analyses suggested that the apparent use of suboptimal strategies emerges from the incremental tracking of statistical contingencies in the environment.
Affiliation(s)
- David A Lagnado
- Department of Psychology, University College London, London, United Kingdom.
238
Lee KW, Buxton H, Feng J. Cue-guided search: a computational model of selective attention. IEEE Trans Neural Netw 2005; 16:910-24. [PMID: 16121732] [DOI: 10.1109/tnn.2005.851787]
Abstract
Selective visual attention in a natural environment can be seen as the interaction between the external visual stimulus and task-specific knowledge of the required behavior. This interaction between the bottom-up stimulus and the top-down, task-related knowledge is crucial for what is selected in space and time within the scene. In this paper, we propose a computational model of selective attention for a visual search task. We go beyond simple saliency-based attention models to model selective attention guided by top-down visual cues, which are dynamically integrated with the bottom-up information. In this way, selection of a location is accomplished by interaction between bottom-up and top-down information. First, the general structure of our model is briefly introduced, followed by a description of the top-down processing of task-relevant cues. This is then followed by a description of the processing of the external images to produce three feature maps that are combined into an overall bottom-up map. Second, the formalism for our novel interactive spiking neural network (ISNN) is developed, with the interactive activation rule that calculates the integration map. The learning rules for both bottom-up and top-down weight parameters are given, together with some further analysis of the properties of the resulting ISNN. Third, the model is applied to a face detection task to search for the location of a specific face that is cued. The results show that the trajectories of attention are dramatically changed by the interaction of information and variations of cues, giving an appropriate, task-relevant search pattern. Finally, we discuss ways in which these results can be seen as compatible with existing psychological evidence.
Affiliation(s)
- Kang Woo Lee
- Department of Informatics, Sussex University, Brighton BN1 9QH, UK
239
Wiech K, Seymour B, Kalisch R, Stephan KE, Koltzenburg M, Driver J, Dolan RJ. Modulation of pain processing in hyperalgesia by cognitive demand. Neuroimage 2005; 27:59-69. [PMID: 15978845] [DOI: 10.1016/j.neuroimage.2005.03.044]
Abstract
The relationship between pain and cognitive function is of theoretical and clinical interest, exemplified by observations that attention-demanding activities reduce pain in chronically afflicted patients. Previous studies have concentrated on phasic pain, which bears little correspondence to clinical pain conditions. Indeed, phasic pain is often associated with differential or opposing effects to tonic pain in behavioral, lesion, and pharmacological studies. To address how cognitive engagement interacts with tonic pain, we assessed the influence of an attention-demanding cognitive task on pain-evoked neural responses in an experimental model of chronic pain, the capsaicin-induced heat hyperalgesia model. Using functional magnetic resonance imaging (fMRI), we show that activity in the orbitofrontal and medial prefrontal cortices, insula, and cerebellum correlates with the intensity of tonic pain. This pain-related activity in medial prefrontal cortex and cerebellum was modulated by the demand level of the cognitive task. Our findings highlight a role for these structures in the integration of motivational and cognitive functions associated with a physiological state of injury. Within the limitations of an experimental model of pain, we suggest that the findings are relevant to understanding both the neurobiology and pathophysiology of chronic pain and its amelioration by cognitive strategies.
Affiliation(s)
- Katja Wiech
- Wellcome Department of Imaging Neuroscience, Institute of Neurology, UCL, 12 Queen Square, London WC1N 3BG, UK.
240
Yu AJ, Dayan P. Uncertainty, neuromodulation, and attention. Neuron 2005; 46:681-92. [PMID: 15944135] [DOI: 10.1016/j.neuron.2005.04.026]
Abstract
Uncertainty in various forms plagues our interactions with the environment. In a Bayesian statistical framework, optimal inference and prediction, based on unreliable observations in changing contexts, require the representation and manipulation of different forms of uncertainty. We propose that the neuromodulators acetylcholine and norepinephrine play a major role in the brain's implementation of these uncertainty computations. Acetylcholine signals expected uncertainty, coming from known unreliability of predictive cues within a context. Norepinephrine signals unexpected uncertainty, as when unsignaled context switches produce strongly unexpected observations. These uncertainty signals interact to enable optimal inference and learning in noisy and changeable environments. This formulation is consistent with a wealth of physiological, pharmacological, and behavioral data implicating acetylcholine and norepinephrine in specific aspects of a range of cognitive processes. Moreover, the model suggests a class of attentional cueing tasks that involve both neuromodulators and shows how their interactions may be part-antagonistic, part-synergistic.
Affiliation(s)
- Angela J Yu
- Gatsby Computational Neuroscience Unit, London, United Kingdom.
241
Niv Y, Duff MO, Dayan P. Dopamine, uncertainty and TD learning. Behav Brain Funct 2005; 1:6. [PMID: 15953384 PMCID: PMC1171969 DOI: 10.1186/1744-9081-1-6] [Citation(s) in RCA: 95] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/12/2005] [Accepted: 05/04/2005] [Indexed: 11/11/2022] Open
Abstract
Substantial evidence suggests that the phasic activities of dopaminergic neurons in the primate midbrain represent a temporal difference (TD) error in predictions of future reward, with increases above and decreases below baseline consequent on positive and negative prediction errors, respectively. However, dopamine cells have very low baseline activity, which implies that the representation of these two sorts of error is asymmetric. We explore the implications of this seemingly innocuous asymmetry for the interpretation of dopaminergic firing patterns in experiments with probabilistic rewards, which bring about persistent prediction errors. In particular, we show that when the non-stationary prediction errors are averaged across trials, a ramping in the activity of the dopamine neurons should be apparent, whose magnitude depends on the learning rate. Exactly this phenomenon was observed in a recent experiment, although it was interpreted there in antipodal terms, as a within-trial encoding of uncertainty.
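The averaging artifact the abstract describes can be reproduced in simulation. The following is a toy sketch, not the authors' code: TD(0) over a stimulus-reward interval with a probabilistic reward at the end, where negative prediction errors are represented at a reduced gain (`scale_neg`, standing in for the low dopamine baseline); the represented errors are then averaged across trials after learning has stabilized. All parameter values are illustrative assumptions.

```python
import random

def asymmetric_td_ramp(n_trials=5000, T=10, p_reward=0.5,
                       alpha=0.3, scale_neg=1/6, seed=1):
    """TD(0) with a probabilistic reward at the final timestep.  Negative
    errors are represented at reduced gain; averaging the represented
    errors across trials yields a positive ramp toward reward time."""
    rng = random.Random(seed)
    V = [0.0] * (T + 1)              # V[T] stays 0 (end of trial)
    sums = [0.0] * T
    counted = 0
    for trial in range(n_trials):
        r = 1.0 if rng.random() < p_reward else 0.0
        for t in range(T):
            reward = r if t == T - 1 else 0.0
            delta = reward + V[t + 1] - V[t]
            V[t] += alpha * delta
            represented = delta if delta > 0 else scale_neg * delta
            if trial >= 1000:        # average only after initial learning
                sums[t] += represented
        if trial >= 1000:
            counted += 1
    return [s / counted for s in sums]
```

Because the persistent fluctuations in the values (and hence in the errors) grow with the learning rate, a larger `alpha` produces a larger ramp, matching the abstract's claim that the ramp magnitude depends on the learning rate.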
Affiliation(s)
- Yael Niv
- Interdisciplinary Center for Neural Computation, Hebrew University, Jerusalem, Israel
- Gatsby Computational Neuroscience Unit, University College London, London, UK
- Michael O Duff
- Gatsby Computational Neuroscience Unit, University College London, London, UK
- Peter Dayan
- Gatsby Computational Neuroscience Unit, University College London, London, UK
242
Steele JD, Meyer M, Ebmeier KP. Neural predictive error signal correlates with depressive illness severity in a game paradigm. Neuroimage 2004; 23:269-80. [PMID: 15325374 DOI: 10.1016/j.neuroimage.2004.04.023] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/11/2004] [Revised: 03/28/2004] [Accepted: 04/21/2004] [Indexed: 11/20/2022] Open
Abstract
Considerable experimental evidence supports the existence of predictive error signals in various brain regions during associative learning in animals and humans. These regions include the prefrontal cortex, temporal lobe, cerebellum and monoamine systems. Various quantitative theories have been developed to describe behaviour during learning, including Rescorla-Wagner, Temporal Difference and Kalman filter models. These theories may also account for neural error signals. Reviews of imaging studies of depressive illness have consistently implicated the prefrontal and temporal lobes as having abnormal function, and sometimes structure, whilst the monoamine systems are directly influenced by antidepressant medication. It was hypothesised that such abnormalities may be associated with a dysfunction of associative learning that would be reflected by different predictive error signals in depressed patients when compared with healthy controls. This was tested with 30 subjects, 15 with a major depressive illness, using a gambling paradigm and fMRI. Consistent with the hypothesis, depressed patients differed from controls in having an increased error signal. Additionally, for some brain regions, the magnitude of the error signal correlated with Hamilton depression rating of illness severity. Structural equation modelling was used to investigate hypothesised change in effective connectivity between prespecified regions of interest in the limbic and paralimbic system. Again, differences were found that in some cases correlated with illness severity. These results are discussed in the context of quantitative theories of brain function, clinical features of depressive illness and treatments.
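Of the quantitative theories the abstract names, Rescorla-Wagner is the simplest: the expectation moves toward each outcome in proportion to the prediction error, which is the quantity the imaging analyses regress against. A minimal sketch of that delta-rule update (function and parameter names are illustrative):

```python
def rescorla_wagner(outcomes, alpha=0.2, v0=0.0):
    """Rescorla-Wagner / delta-rule learning: the expectation V moves toward
    each outcome in proportion to the prediction error (outcome - V)."""
    V, errors = v0, []
    for r in outcomes:
        delta = r - V            # prediction error on this trial
        errors.append(delta)
        V += alpha * delta       # update expectation by a fraction alpha
    return V, errors
```

With a constant rewarded outcome, the error starts large and decays geometrically as the expectation converges, which is the trial-by-trial signal profile such studies fit to BOLD responses.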
Affiliation(s)
- J D Steele
- Department of Mental Health, University of Aberdeen, Aberdeen, AB25 2ZH, UK.
243
Yamada H, Matsumoto N, Kimura M. Tonically active neurons in the primate caudate nucleus and putamen differentially encode instructed motivational outcomes of action. J Neurosci 2004; 24:3500-10. [PMID: 15071097 PMCID: PMC6729748 DOI: 10.1523/jneurosci.0068-04.2004] [Citation(s) in RCA: 92] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022] Open
Abstract
To achieve a goal, animals procure immediately available rewards, escape from aversive events, or endure the absence of rewards. The neuronal substrates for these goal-directed actions include the limbic system and the basal ganglia. In the striatum, tonically active neurons (TANs), presumed cholinergic interneurons, were originally shown to respond to reward-associated stimuli and to evolve their activity through learning. Subsequent studies revealed that they also respond to aversive event-associated stimuli such as an airpuff on the face and that they are less selective to whether the stimuli instruct reward or no reward. To address this paradox, we designed a set of experiments in which macaque monkeys performed a set of visual reaction time tasks while expecting a reward, during escape from an aversive event, and in the absence of a reward. We found that TANs respond to instruction stimuli associated with motivational outcomes (312 of 390; 80%) but not to unassociated ones (51 of 390; 13%), and that they mostly differentiate associated instructions (217 of 312; 70%). We also found that a higher percentage of TANs in the caudate nucleus respond to stimuli associated with motivational outcomes (118 of 128; 92%) than in the putamen (194 of 262; 74%), whereas a higher percentage of TANs in the putamen respond to go signals for the lever release (112 of 262; 43%) than in the caudate nucleus (27 of 128; 21%), especially for an action expecting a reward. These findings suggest a distinct, pivotal role of TANs in the caudate nucleus and putamen in encoding instructed motivational contexts for goal-directed action planning and learning.
Affiliation(s)
- Hiroshi Yamada
- Department of Physiology, Kyoto Prefectural University of Medicine, Kawaramachi-Hirokoji, Kamigyo-ku, Kyoto 602-8566, Japan
244
Aron AR, Shohamy D, Clark J, Myers C, Gluck MA, Poldrack RA. Human Midbrain Sensitivity to Cognitive Feedback and Uncertainty During Classification Learning. J Neurophysiol 2004; 92:1144-52. [PMID: 15014103 DOI: 10.1152/jn.01209.2003] [Citation(s) in RCA: 202] [Impact Index Per Article: 10.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
Mesencephalic dopaminergic system (MDS) neurons may participate in learning by providing a prediction error signal to their targets, which include ventral striatal, orbital, and medial frontal regions, as well as by showing sensitivity to the degree of uncertainty associated with individual stimuli. We investigated the mechanisms of probabilistic classification learning in humans using functional magnetic resonance imaging to examine the effects of feedback and uncertainty. The design was optimized for separating neural responses to stimulus, delay, and negative and positive feedback components. Compared with fixation, stimulus and feedback activated brain regions consistent with the MDS, whereas the delay period did not. Midbrain activity was significantly different for negative versus positive feedback (consistent with coding of the “prediction error”) and was reliably correlated with the degree of uncertainty as well as with activity in MDS target regions. Purely cognitive feedback apparently engages the same regions as rewarding stimuli, consistent with a broader characterization of this network.
Affiliation(s)
- A R Aron
- Dept. of Psychology and Brain Research Institute, University of California-Los Angeles, CA 90065, USA
245
Abstract
We recorded the activity of midbrain dopamine neurons in an instrumental conditioning task in which monkeys made a series of behavioral decisions on the basis of distinct reward expectations. Dopamine neurons responded to the first visual cue that appeared in each trial [conditioned stimulus (CS)], through which monkeys initiated a trial for decision while expecting trial-specific reward probability and volume. The magnitude of neuronal responses to the CS was approximately proportional to reward expectations, but with considerable discrepancy. In contrast, CS responses appear to represent motivational properties, because their magnitude on trials with identical reward expectation had a significant negative correlation with the animal's reaction times after the CS. Dopamine neurons also responded to reinforcers that occurred after behavioral decisions, and the responses precisely encoded positive and negative reward expectation errors (REEs). The gain with which spike frequency coded REEs increased as act-outcome contingencies were learned over a few months of task training, whereas the coding of motivational properties remained consistent throughout learning. We found that the magnitude of CS responses was positively correlated with that of responses to reinforcers, suggesting a modulation of the effectiveness of REEs as a teaching signal by motivation. For instance, the rate of learning could be faster when animals are motivated and slower when they are less motivated, even at identical REEs. Therefore, the dual correlated coding of motivation and REEs suggests that the dopamine system is involved both in reinforcement, in more elaborate ways than currently proposed, and in motivational function in reward-based decision-making and learning.
246
O'Doherty JP, Dayan P, Friston K, Critchley H, Dolan RJ. Temporal difference models and reward-related learning in the human brain. Neuron 2003; 38:329-37. [PMID: 12718865 DOI: 10.1016/s0896-6273(03)00169-7] [Citation(s) in RCA: 987] [Impact Index Per Article: 47.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
Abstract
Temporal difference learning has been proposed as a model for Pavlovian conditioning, in which an animal learns to predict delivery of reward following presentation of a conditioned stimulus (CS). A key component of this model is a prediction error signal, which, before learning, responds at the time of presentation of reward but, after learning, shifts its response to the time of onset of the CS. In order to test for regions manifesting this signal profile, subjects were scanned using event-related fMRI while undergoing appetitive conditioning with a pleasant taste reward. Regression analyses revealed that responses in ventral striatum and orbitofrontal cortex were significantly correlated with this error signal, suggesting that, during appetitive conditioning, computations described by temporal difference learning are expressed in the human brain.
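The signal profile being tested, an error that occurs at reward delivery before learning and migrates to CS onset after learning, falls out of a small TD(0) simulation. The sketch below makes simplifying assumptions (a tapped delay line of states from CS onset to reward, no discounting, and a CS whose onset is itself unpredicted, so the jump from the zero pre-CS value to V[0] is reported as an error); names and parameters are illustrative, not the authors' code.

```python
def td_conditioning(n_trials=500, T=8, alpha=0.2):
    """TD(0) sketch of Pavlovian conditioning.  States run from CS onset
    (t = 0) to reward delivery (t = T - 1); V[T] = 0 marks the trial's end.
    Returns the per-timestep errors on the first and last trials."""
    V = [0.0] * (T + 1)
    first = None
    deltas = []
    for trial in range(n_trials):
        deltas = []
        for t in range(T):
            r = 1.0 if t == T - 1 else 0.0
            td_err = r + V[t + 1] - V[t]
            # at t = 0 the CS arrives unpredicted, so the reported error
            # includes the jump from the zero pre-CS value up to V[0]
            deltas.append(td_err + (V[t] if t == 0 else 0.0))
            V[t] += alpha * td_err
        if trial == 0:
            first = deltas[:]
    return first, deltas
```

On the first trial the error sits entirely at the reward; after learning it sits at CS onset, which is the regression profile the study searches for in the fMRI data.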
Affiliation(s)
- John P O'Doherty
- Wellcome Department of Imaging Neuroscience, Institute of Neurology, University College London, WC1N 3BG, London, United Kingdom.
247
Abstract
In this article, we present an isotropic unsupervised algorithm for temporal sequence learning. No special reward signal is used, so all inputs are completely isotropic. All input signals are bandpass filtered before converging onto a linear output neuron. All synaptic weights change according to the correlation of the bandpass-filtered inputs with the derivative of the output. We investigate the algorithm in an open- and a closed-loop condition, the latter being defined by embedding the learning system into a behavioral feedback loop. In the open-loop condition, we find that the linear structure of the algorithm allows us to calculate analytically the shape of the weight change, which is strictly heterosynaptic and follows the shape of the weight-change curves found in spike-time-dependent plasticity. Furthermore, we show that synaptic weights stabilize automatically when no more temporal differences exist between the inputs, without additional normalizing measures. In the second part of this study, the algorithm is placed in an environment that leads to a closed sensor-motor loop. To this end, a robot is programmed with a prewired retraction reflex in response to collisions. Through isotropic sequence order (ISO) learning, the robot achieves collision avoidance by learning the correlation between its early range-finder signals and the later-occurring collision signal. Synaptic weights stabilize at the end of learning, as theoretically predicted. Finally, we discuss the relation of ISO learning to other drive-reinforcement models and to the commonly used temporal difference learning algorithm. This study is followed up by a mathematical analysis of the closed-loop situation in the companion article in this issue, "ISO Learning Approximates a Solution to the Inverse-Controller Problem in an Unsupervised Behavioral Paradigm" (pp. 865-884).
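The learning rule described, weights changing with the correlation of a bandpass-filtered input and the derivative of the output, can be sketched in a few lines. This is an illustrative reconstruction, not the authors' implementation: the damped-oscillator "bandpass" response, the pulse timings, and all parameter values are assumptions chosen so that the predictor input precedes the reflex input, as in the robot example.

```python
import math

def bandpass_trace(pulse_t, T, f=0.01, q=0.6):
    """Response of a simple damped oscillator (a 'bandpass' unit) to a
    single input pulse at time pulse_t, sampled at T timesteps."""
    a = -math.pi * f / q
    b = 2.0 * math.pi * f * math.sqrt(1.0 - 1.0 / (4.0 * q * q))
    return [0.0 if t < pulse_t else
            math.exp(a * (t - pulse_t)) * math.sin(b * (t - pulse_t))
            for t in range(T)]

def iso_learning(trials=100, T=600, mu=0.002):
    """ISO rule: each plastic weight changes in proportion to its own
    filtered input times the temporal derivative of the output."""
    w0, w1 = 1.0, 0.0              # reflex weight fixed; predictor learned
    u1 = bandpass_trace(100, T)    # early predictor input (range finder)
    u0 = bandpass_trace(150, T)    # later reflex input (collision)
    for _ in range(trials):
        v_prev = 0.0
        for t in range(T):
            v = w0 * u0[t] + w1 * u1[t]
            w1 += mu * u1[t] * (v - v_prev)   # dw/dt proportional to u * dv/dt
            v_prev = v
    return w1
```

Because the predictor's filtered trace is still positive when the later reflex response rises, its correlation with the output derivative is positive and the predictor's weight grows, the heterosynaptic, timing-dependent behavior the abstract describes.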
Affiliation(s)
- Bernd Porr
- Department of Psychology, University of Stirling, Scotland.
248
Affiliation(s)
- Peter Shizgal
- Center for Studies in Behavioral Neurobiology, Concordia University, Montréal, Quebec, H3G 1M8, Canada.
249
Fiorillo CD, Tobler PN, Schultz W. Discrete coding of reward probability and uncertainty by dopamine neurons. Science 2003; 299:1898-902. [PMID: 12649484 DOI: 10.1126/science.1077349] [Citation(s) in RCA: 1210] [Impact Index Per Article: 57.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/03/2022]
Abstract
Uncertainty is critical in the measure of information and in assessing the accuracy of predictions. It is determined by probability P, being maximal at P = 0.5 and decreasing at higher and lower probabilities. Using distinct stimuli to indicate the probability of reward, we found that the phasic activation of dopamine neurons varied monotonically across the full range of probabilities, supporting past claims that this response codes the discrepancy between predicted and actual reward. In contrast, a previously unobserved response covaried with uncertainty and consisted of a gradual increase in activity until the potential time of reward. The coding of uncertainty suggests a possible role for dopamine signals in attention-based learning and risk-taking behavior.
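The dissociation in the abstract is just the difference between the mean and the spread of a Bernoulli reward: expected reward is monotonic in P, while uncertainty peaks at P = 0.5. A trivial sketch, with variance as our stand-in for uncertainty (the paper itself does not prescribe this particular measure):

```python
def reward_statistics(p, magnitude=1.0):
    """For a Bernoulli reward of given magnitude delivered with probability p:
    expected reward is linear in p (the phasic, prediction-error-like code),
    while the variance p(1-p) peaks at p = 0.5 (the uncertainty code)."""
    expected = p * magnitude
    variance = p * (1.0 - p) * magnitude ** 2
    return expected, variance
```

Any probability pair symmetric about 0.5 yields equal uncertainty but different expected reward, which is why the two dopamine responses can be read as carrying independent signals.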
250
Abstract
A recent flurry of neuroimaging and decision-making experiments in humans, when combined with single-unit data from orbitofrontal cortex, suggests major additions to current models of reward processing. We review these data and models and use them to develop a specific computational relationship between the value of a predictor and the future rewards or punishments that it promises. The resulting computational model, the predictor-valuation model (PVM), is shown to anticipate a class of single-unit neural responses in orbitofrontal and striatal neurons. The model also suggests how neural responses in the orbitofrontal-striatal circuit may support the conversion of disparate types of future rewards into a kind of internal currency, that is, a common scale used to compare the valuation of future behavioral acts or stimuli.
Affiliation(s)
- P Read Montague
- Center for Theoretical Neuroscience, Human NeuroImaging Lab, Division of Neuroscience, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA.