601
Gallistel CR, King AP, Gottlieb D, Balci F, Papachristos EB, Szalecki M, Carbone KS. Is matching innate? J Exp Anal Behav 2007; 87:161-99. [PMID: 17465311] [PMCID: PMC1832166] [DOI: 10.1901/jeab.2007.92-05]
Abstract
Experimentally naive mice matched the proportions of their temporal investments (visit durations) in two feeding hoppers to the proportions of the food income (pellets per unit session time) derived from them in three experiments that varied the coupling between behavioral investment and food income, from no coupling to strict coupling. Matching was observed from the outset; it did not improve with training. When the numbers of pellets received were proportional to time invested, investment was unstable, swinging abruptly from sustained, almost complete investment in one hopper to sustained, almost complete investment in the other, in the absence of appropriate local fluctuations in returns (pellets obtained per time invested). The abruptness of the swings strongly constrains possible models. We suggest that matching reflects an innate (unconditioned) program that matches the ratio of expected visit durations to the ratio of the current estimates of expected incomes. A model that processes the income stream looking for changes in income, and generates discontinuous income estimates when a change is detected, is shown to account for salient features of the data.
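The matching program proposed in this abstract, a ratio of expected visit durations set equal to the ratio of current income estimates, can be sketched in a few lines. All function names and numbers below are illustrative assumptions, not the authors' model:

```python
# Sketch (invented names and numbers) of the innate matching program described
# above: visit time is allocated in proportion to current income estimates.

def income_estimates(pellet_counts, session_time):
    """Income = pellets obtained per unit session time, per hopper."""
    return [n / session_time for n in pellet_counts]

def matched_allocation(incomes):
    """Matching: the fraction of time invested in each hopper equals that
    hopper's fraction of the total estimated income."""
    total = sum(incomes)
    return [r / total for r in incomes]

# Example: hopper A yielded 30 pellets, hopper B 10, in a 600 s session.
incomes = income_estimates([30, 10], session_time=600.0)
allocation = matched_allocation(incomes)   # approximately [0.75, 0.25]
```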

602
Steenbeek HW, van Geert PL. A theory and dynamic model of dyadic interaction: Concerns, appraisals, and contagiousness in a developmental context. Developmental Review 2007. [DOI: 10.1016/j.dr.2006.06.002]

603
Bogacz R. Optimal decision-making theories: linking neurobiology with behaviour. Trends Cogn Sci 2007; 11:118-25. [PMID: 17276130] [DOI: 10.1016/j.tics.2006.12.006]
Abstract
This article reviews recently proposed theories postulating that, during simple choices, the brain performs statistically optimal decision making. These theories are ecologically motivated by evolutionary pressures to optimize the speed and accuracy of decisions and to maximize the rate of receiving rewards for correct choices. This article suggests that the models of decision making that are proposed on different levels of abstraction can be linked by virtue of the same optimal computation. Also reviewed here are recent observations that many aspects of the circuit that involves the cortex and basal ganglia are the same as those that are required to perform statistically optimal choice. This review illustrates how optimal-decision theories elucidate current data and provide experimental predictions that concern both neurobiology and behaviour.
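The statistically optimal models reviewed here are typically variants of the drift-diffusion model. A minimal simulation, with all parameter values invented for illustration, shows the basic speed-accuracy machinery:

```python
import random

def ddm_trial(drift=0.1, noise=1.0, threshold=1.0, dt=0.01, rng=random):
    """One drift-diffusion trial: accumulate noisy evidence until either
    bound is crossed; returns (correct, decision_time)."""
    x, t = 0.0, 0.0
    while abs(x) < threshold:
        x += drift * dt + noise * (dt ** 0.5) * rng.gauss(0.0, 1.0)
        t += dt
    return (x > 0), t   # upper bound is the correct choice when drift > 0

rng = random.Random(0)
trials = [ddm_trial(rng=rng) for _ in range(2000)]
accuracy = sum(c for c, _ in trials) / len(trials)
mean_dt = sum(t for _, t in trials) / len(trials)
```

With this weak drift, accuracy sits modestly above chance; raising the threshold trades longer decision times for higher accuracy, which is the quantity the optimality analyses balance against reward rate.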
Affiliation(s)
- Rafal Bogacz
- Department of Computer Science, University of Bristol, Bristol BS8 1UB, UK.

604
Abstract
Optimal behavior in a competitive world requires the flexibility to adapt decision strategies based on recent outcomes. In the present study, we tested the hypothesis that this flexibility emerges through a reinforcement learning process, in which reward prediction errors are used dynamically to adjust representations of decision options. We recorded event-related brain potentials (ERPs) while subjects played a strategic economic game against a computer opponent to evaluate how neural responses to outcomes related to subsequent decision-making. Analyses of ERP data focused on the feedback-related negativity (FRN), an outcome-locked potential thought to reflect a neural prediction error signal. Consistent with predictions of a computational reinforcement learning model, we found that the magnitude of ERPs after losing to the computer opponent predicted whether subjects would change decision behavior on the subsequent trial. Furthermore, FRNs to decision outcomes were disproportionately larger over the motor cortex contralateral to the response hand that was used to make the decision. These findings provide novel evidence that humans engage a reinforcement learning process to adjust representations of competing decision options.
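The reinforcement-learning account described above can be sketched with two pieces: a Rescorla-Wagner value update whose prediction error is the quantity the FRN is thought to index, and a mapping from that error to the probability of switching on the next trial. The specific functional form and parameters here are assumptions for illustration, not the authors' fitted model:

```python
import math

def update_value(values, choice, reward, alpha=0.3):
    """Rescorla-Wagner update; delta is the reward prediction error that
    the feedback-related negativity (FRN) is thought to index."""
    delta = reward - values[choice]
    values[choice] += alpha * delta
    return delta

def p_switch(delta, slope=4.0):
    """Larger losses relative to expectation (more negative delta) make a
    switch of decision option on the next trial more likely."""
    return 1.0 / (1.0 + math.exp(slope * delta))

values = [0.0, 0.0]
delta = update_value(values, choice=0, reward=-1.0)   # loss to the opponent
```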
Affiliation(s)
- Michael X Cohen
- Department of Epileptology and Center for Mind and Brain, University of Bonn, 53105 Bonn, Germany.

605
Stoet G, Snyder LH. Correlates of Stimulus-Response Congruence in the Posterior Parietal Cortex. J Cogn Neurosci 2007; 19:194-203. [PMID: 17280509] [DOI: 10.1162/jocn.2007.19.2.194]
Abstract
Primate behavior is flexible: The response to a stimulus often depends on the task in which it occurs. Here we study how single neurons in the posterior parietal cortex (PPC) respond to stimuli which are associated with different responses in different tasks. Two rhesus monkeys performed a task-switching paradigm. Each trial started with a task cue instructing which of two tasks to perform, followed by a stimulus requiring a left or right button press. For half the stimuli, the associated responses were different in the two tasks, meaning that the task context was necessary to disambiguate the incongruent stimuli. The other half of stimuli required the same response irrespective of task context (congruent). Using this paradigm, we previously showed that behavioral responses to incongruent stimuli are significantly slower than to congruent stimuli. We now demonstrate a neural correlate in the PPC of the additional processing time required for incongruent stimuli. Furthermore, we previously found that 29% of parietal neurons encode the task being performed (task-selective cells). We now report differences in neuronal timing related to congruency in task-selective versus task nonselective cells. These differences in timing suggest that the activity in task nonselective cells reflects a motor command, whereas activity in task-selective cells reflects a decision process.
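The congruency manipulation in this design is easy to make concrete: two tasks map the same stimuli onto left/right responses, and a stimulus is congruent when both mappings agree. The stimulus names below are invented placeholders:

```python
# Two task rules over the same four stimuli; a stimulus is congruent when
# both tasks require the same response, so no task cue is needed to answer.
task_a = {"s1": "left", "s2": "right", "s3": "left", "s4": "right"}
task_b = {"s1": "left", "s2": "right", "s3": "right", "s4": "left"}

def congruent(stim):
    return task_a[stim] == task_b[stim]

congruent_stims = [s for s in task_a if congruent(s)]   # ["s1", "s2"]
```

For the incongruent stimuli ("s3", "s4" here), the task cue must be consulted to disambiguate the response, which is the extra processing whose neural correlate the study reports.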

606
Hampton AN, O'Doherty JP. Decoding the neural substrates of reward-related decision making with functional MRI. Proc Natl Acad Sci U S A 2007; 104:1377-82. [PMID: 17227855] [PMCID: PMC1783089] [DOI: 10.1073/pnas.0606297104]
Abstract
Although previous studies have implicated a diverse set of brain regions in reward-related decision making, it is not yet known which of these regions contain information that directly reflects a decision. Here, we measured brain activity using functional MRI in a group of subjects while they performed a simple reward-based decision-making task: probabilistic reversal-learning. We recorded brain activity from nine distinct regions of interest previously implicated in decision making and separated out local spatially distributed signals in each region from global differences in signal. Using a multivariate analysis approach, we determined the extent to which global and local signals could be used to decode subjects' subsequent behavioral choice, based on their brain activity on the preceding trial. We found that subjects' decisions could be decoded to a high level of accuracy on the basis of both local and global signals even before they were required to make a choice, and even before they knew which physical action would be required. Furthermore, the combined signals from three specific brain areas (anterior cingulate cortex, medial prefrontal cortex, and ventral striatum) were found to provide all of the information sufficient to decode subjects' decisions out of all of the regions we studied. These findings implicate a specific network of regions in encoding information relevant to subsequent behavioral choice.
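The decoding logic is the same as training any linear classifier to predict the upcoming binary choice from multivariate signals measured on the preceding trial. The toy below trains a logistic decoder on simulated "region" signals; nothing in it comes from the fMRI data set itself, and the signal and noise levels are invented:

```python
import math, random

def train_logistic(X, y, lr=0.1, epochs=200):
    """Plain stochastic-gradient logistic regression (no regularization)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = 1.0 / (1.0 + math.exp(-(sum(wj * xj for wj, xj in zip(w, xi)) + b)))
            g = p - yi
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def accuracy(w, b, X, y):
    correct = 0
    for xi, yi in zip(X, y):
        p = 1.0 / (1.0 + math.exp(-(sum(wj * xj for wj, xj in zip(w, xi)) + b)))
        correct += (p > 0.5) == (yi == 1)
    return correct / len(y)

rng = random.Random(0)
# Simulate 3 "regions" whose pre-choice activity weakly predicts the choice.
X, y = [], []
for _ in range(400):
    choice = rng.randrange(2)
    X.append([(1.0 if choice else -1.0) + rng.gauss(0, 1.5) for _ in range(3)])
    y.append(choice)
w, b = train_logistic(X[:300], y[:300])
acc = accuracy(w, b, X[300:], y[300:])   # well above the 0.5 chance level
```

Held-out accuracy above chance is the operational sense in which a region "contains information that directly reflects a decision."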
Affiliation(s)
- John P. O'Doherty
- Computation and Neural Systems Program
- Division of Humanities and Social Sciences, California Institute of Technology, 1200 East California Boulevard, M/C 228-77, Pasadena, CA 91125
- To whom correspondence should be addressed. E-mail:

607
Abstract
The lateral intraparietal area (LIP) is a subdivision of the inferior parietal lobe that has been implicated in the guidance of spatial attention. In a variety of tasks, LIP provides a "salience representation" of the external world: a topographic visual representation that encodes the locations of salient or behaviorally relevant objects. Recent neurophysiological experiments show that this salience representation incorporates information about multiple behavioral variables, such as a specific motor response, reward, or category membership, associated with the task-relevant object. This integration occurs in a wide variety of tasks, including those requiring eye or limb movements and those requiring goal-directed or nontargeting operant responses. Thus, LIP acts as a multifaceted behavioral integrator that binds visuospatial, motor, and cognitive information into a topographically organized signal of behavioral salience. By specifying attentional priority as a synthesis of multiple task demands, LIP operates at the interface of perception, action, and cognition.
Affiliation(s)
- Jacqueline Gottlieb
- Center for Neurobiology and Behavior and Department of Psychiatry, Columbia University, New York, NY 10032, USA.

608
Abstract
Using GABAergic outputs from the SNr or GP(i), the basal ganglia exert inhibitory control over several motor areas in the brainstem which in turn control the central pattern generators for the basic motor repertoire including eye-head orientation, locomotion, mouth movements, and vocalization. These movements are by default kept suppressed by tonic rapid firing of SNr/GP(i) neurons, but can be released by a selective removal of the tonic inhibition. Derangement of the SNr/GP(i) outputs leads to either an inability to initiate movements (akinesia) or an inability to suppress movements (involuntary movements). Although the spatio-temporal patterns of individual movements are largely innate and fixed, it is essential for survival to select appropriate movements and arrange them in an appropriate order depending on the context, and this is what the basal ganglia presumably do. To achieve such a goal, however, the basal ganglia need to be trained to optimize their outputs with the aid of cortical inputs carrying sensorimotor and cognitive information and dopaminergic inputs carrying reward-related information. The basal ganglia output to the thalamus, which is particularly developed in primates, provides the basal ganglia with an advanced ability to organize behavior by including the motor skill mechanisms in which new movement patterns can be created by practice. To summarize, an essential function of the basal ganglia is to select, sort, and integrate innate movements and learned movements, together with cognitive and emotional mental operations, to achieve purposeful behaviors. Intricate hand-finger movements do not occur in isolation; they are always associated with appropriate motor sets, such as eye-head orientation and posture.
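The disinhibition scheme described above reduces to a simple gate: tonically active SNr/GPi output holds a brainstem pattern generator below its activation threshold, and a selective pause in that output releases the movement. A toy illustration, with all numbers invented:

```python
# Toy gate (invented numbers): a pattern generator fires only when its
# excitatory drive minus GABAergic inhibition from SNr/GPi exceeds threshold.

def generator_active(snr_rate, drive=1.0, inhibition_gain=0.05, threshold=0.5):
    """True when the central pattern generator is released to produce
    its movement."""
    return drive - inhibition_gain * snr_rate > threshold

suppressed = generator_active(snr_rate=80)   # tonic rapid firing: blocked
released = generator_active(snr_rate=5)      # selective pause: movement runs
```

On this scheme, akinesia corresponds to SNr/GPi output that never pauses, and involuntary movements to output that fails to stay high.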
Affiliation(s)
- O Hikosaka
- Laboratory of Sensorimotor Research, National Eye Institute, National Institutes of Health, 49 Convent Drive, Bldg. 49, Rm. 2A50, Bethesda, MD 20892-4435, USA.

609
Watanabe M, Hikosaka K, Sakagami M, Shirakawa SI. Reward Expectancy-Related Prefrontal Neuronal Activities: Are They Neural Substrates of “Affective” Working Memory? Cortex 2007; 43:53-64. [PMID: 17334207] [DOI: 10.1016/s0010-9452(08)70445-3]
Abstract
Primate prefrontal delay neurons are involved in retaining task-relevant cognitive information in working memory (WM). Recent studies have also revealed primate prefrontal delay neurons that are related to reward/omission-of-reward expectancy. Such reward-related delay activities might constitute "affective WM" (Davidson, 2002). "Affective" and "cognitive" WM are both concerned with representing not what is currently being presented, but rather what was presented previously or might be presented in the future. However, according to the original and widely accepted definition, WM is the "temporary storage and manipulation of information for complex cognitive tasks". Reward/omission-of-reward expectancy-related neuronal activity is neither prerequisite nor essential for accurate task performance; thus, such activity is not considered to comprise the neural substrates of WM. Also, "affective WM" might not be an appropriate usage of the term "WM". We propose that WM- and reward/omission-of-reward expectancy-related neuronal activity are concerned with representing which response should be performed in order to attain a goal (reward) and the goal of the response, respectively. We further suggest that the prefrontal cortex (PFC) plays a crucial role in the integration of cognitive (for example, WM-related) and motivational (for example, reward expectancy-related) operations for goal-directed behaviour. The PFC could then send this integrated information to other brain areas to control the behaviour.
Affiliation(s)
- Masataka Watanabe
- Department of Psychology, Tokyo Metropolitan Institute for Neuroscience, Tokyo, Japan.

610
Emeric EE, Brown JW, Boucher L, Carpenter RHS, Hanes DP, Harris R, Logan GD, Mashru RN, Paré M, Pouget P, Stuphorn V, Taylor TL, Schall JD. Influence of history on saccade countermanding performance in humans and macaque monkeys. Vision Res 2007; 47:35-49. [PMID: 17081584] [PMCID: PMC1815391] [DOI: 10.1016/j.visres.2006.08.032]
Abstract
The stop-signal or countermanding task probes the ability to control action by requiring subjects to withhold a planned movement in response to an infrequent stop signal, which they do with variable success depending on the delay of the stop signal. We investigated whether the performance of humans and macaque monkeys in a saccade countermanding task was influenced by stimulus and performance history. In spite of idiosyncrasies across subjects, several trends were evident in both humans and monkeys. Response time decreased after successive trials with no stop signal. Response time increased after successive trials with a stop signal. However, post-error slowing was not observed. Increased response time was observed mainly or only after cancelled (signal inhibit) trials and not after noncancelled (signal respond) trials. These global trends were based on rapid adjustments of response time in response to momentary fluctuations in the fraction of stop-signal trials. The effects of trial sequence on the probability of responding were weaker and more idiosyncratic across subjects when the stop-signal fraction was fixed. However, both response time and the probability of responding were influenced strongly by variations in the fraction of stop-signal trials. These results indicate that the race model of countermanding performance requires extension to account for these sequential dependencies, and they provide a basis for physiological studies of the executive control of saccade countermanding performance.
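The race model that the authors say needs extending can be sketched as an independent race between a GO process and a STOP process started after the stop-signal delay (SSD). Distributional assumptions and parameter values below are illustrative only:

```python
import random

def trial(ssd, go_mean=250.0, go_sd=50.0, ssrt=100.0, rng=random):
    """Independent race: the saccade is produced (noncancelled) when the GO
    finish time beats the stop process, which finishes at SSD + SSRT.
    GO finish times are drawn from a Gaussian; SSRT is a fixed constant."""
    go_finish = rng.gauss(go_mean, go_sd)
    stop_finish = ssd + ssrt
    return go_finish < stop_finish

rng = random.Random(0)

def p_respond(ssd, n=5000):
    """Estimated probability of responding at a given stop-signal delay."""
    return sum(trial(ssd, rng=rng) for _ in range(n)) / n

# The inhibition function: probability of responding grows with SSD.
inhibition_fn = [p_respond(ssd) for ssd in (50, 150, 250)]
```

The sequential dependencies reported in the paper are exactly what this stateless race lacks: each trial here is independent of the trial history.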
Affiliation(s)
- Erik E. Emeric
- Center for Integrative & Cognitive Neuroscience, Vanderbilt Vision Research Center, Department of Psychology, Vanderbilt University, Nashville, Tennessee
- Joshua W. Brown
- Department of Psychology, Washington University, St. Louis, Missouri, USA
- Leanne Boucher
- Center for Integrative & Cognitive Neuroscience, Vanderbilt Vision Research Center, Department of Psychology, Vanderbilt University, Nashville, Tennessee
- Doug P. Hanes
- Center for Integrative & Cognitive Neuroscience, Vanderbilt Vision Research Center, Department of Psychology, Vanderbilt University, Nashville, Tennessee
- Robin Harris
- Physiological Laboratory, University of Cambridge, Cambridge, United Kingdom
- Gordon D. Logan
- Center for Integrative & Cognitive Neuroscience, Vanderbilt Vision Research Center, Department of Psychology, Vanderbilt University, Nashville, Tennessee
- Reena N. Mashru
- Physiological Laboratory, University of Cambridge, Cambridge, United Kingdom
- Martin Paré
- Department of Physiology, Queen’s University, Kingston, Ontario, Canada
- Pierre Pouget
- Center for Integrative & Cognitive Neuroscience, Vanderbilt Vision Research Center, Department of Psychology, Vanderbilt University, Nashville, Tennessee
- Veit Stuphorn
- Department of Psychology, Johns Hopkins University, Baltimore, Maryland
- Tracy L. Taylor
- Department of Psychology, Dalhousie University, Halifax, Nova Scotia, Canada
- Jeffrey D. Schall
- Center for Integrative & Cognitive Neuroscience, Vanderbilt Vision Research Center, Department of Psychology, Vanderbilt University, Nashville, Tennessee

611
Soltani A, Lee D, Wang XJ. Neural mechanism for stochastic behaviour during a competitive game. Neural Netw 2006; 19:1075-90. [PMID: 17015181] [PMCID: PMC1752206] [DOI: 10.1016/j.neunet.2006.05.044]
Abstract
Previous studies have shown that non-human primates can generate highly stochastic choice behaviour, especially when this is required during a competitive interaction with another agent. To understand the neural mechanism of such dynamic choice behaviour, we propose a biologically plausible model of decision making endowed with synaptic plasticity that follows a reward-dependent stochastic Hebbian learning rule. This model constitutes a biophysical implementation of reinforcement learning, and it reproduces salient features of behavioural data from an experiment with monkeys playing a matching pennies game. Due to interaction with an opponent and learning dynamics, the model generates quasi-random behaviour robustly in spite of intrinsic biases. Furthermore, non-random choice behaviour can also emerge when the model plays against a non-interactive opponent, as observed in the monkey experiment. Finally, when combined with a meta-learning algorithm, our model accounts for the slow drift in the animal's strategy based on a process of reward maximization.
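A much-reduced sketch of the reward-dependent stochastic Hebbian idea: synaptic strengths set choice probabilities, and the synapse for the chosen action is potentiated after reward and depressed after no reward. The learning rate, initial bias, and opponent below are invented; the authors' model is a detailed spiking network, not this caricature:

```python
import random

def choose(c, rng):
    """Choice probability set by relative synaptic strengths."""
    return 0 if rng.random() < c[0] / (c[0] + c[1]) else 1

def learn(c, choice, rewarded, rate=0.1):
    """Reward-dependent stochastic Hebbian rule: the chosen action's
    synapse moves toward 1 after reward and toward 0 after no reward."""
    c[choice] += rate * ((1.0 if rewarded else 0.0) - c[choice])

rng = random.Random(2)
c = [0.9, 0.1]                      # strong intrinsic bias toward action 0
left_choices = 0
for t in range(3000):
    a = choose(c, rng)
    opponent = 0 if rng.random() < 0.5 else 1   # unexploitable opponent
    learn(c, a, rewarded=(a != opponent))       # matching-pennies payoff
    if t >= 1000:
        left_choices += (a == 0)
freq_left = left_choices / 2000     # near 0.5 despite the initial bias
```

Against an unexploitable opponent, the learning dynamics wash out the intrinsic bias, which is the "quasi-random behaviour robustly in spite of intrinsic biases" point in the abstract.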
Affiliation(s)
- Alireza Soltani
- Department of Physics and Volen Center for Complex Systems, Brandeis University, Waltham, MA 02454, USA.

612
Abstract
We studied human movement planning in tasks in which subjects selected one of two goals that differed in expected gain. Each goal configuration consisted of a target circle and a partially overlapping penalty circle. Rapid hits into the target region led to a monetary bonus; accidental hits into the penalty region incurred a penalty. The outcomes assigned to target and penalty regions and the spatial arrangement of those regions were varied. Subjects preferred configurations with higher expected gain whether selection involved a rapid pointing movement or a choice by key press. Movements executed to select one of two goal configurations exhibited the same movement dynamics as pointing movements directed at a single configuration, and were executed with the same high efficiency. Our results suggest that humans choose near-optimal strategies when planning their movement, and can base their selection of strategy on a rapid judgment about the expected gain associated with possible movement goals.
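The expected-gain computation implied by this design can be made explicit: with motor noise modelled as an isotropic Gaussian around the aimpoint, the value of a configuration combines the bonus for target hits with the penalty for accidental penalty-region hits. All geometry and payoff numbers below are invented:

```python
import random

def p_hit_circle(aim, center, radius, sigma, n=20000, seed=0):
    """Monte Carlo probability of landing inside a circle when movement
    endpoints scatter as an isotropic Gaussian around the aimpoint."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        x = aim[0] + rng.gauss(0.0, sigma)
        y = aim[1] + rng.gauss(0.0, sigma)
        if (x - center[0]) ** 2 + (y - center[1]) ** 2 <= radius ** 2:
            hits += 1
    return hits / n

def expected_gain(aim, target, penalty, radius=9.0, sigma=4.0,
                  gain=100.0, loss=-500.0):
    return (gain * p_hit_circle(aim, target, radius, sigma)
            + loss * p_hit_circle(aim, penalty, radius, sigma, seed=1))

# An overlapping penalty region is worth much less than a separated one.
overlapping = expected_gain((0.0, 0.0), target=(0.0, 0.0), penalty=(6.0, 0.0))
separated = expected_gain((0.0, 0.0), target=(0.0, 0.0), penalty=(18.0, 0.0))
```

Preferring the configuration with the higher expected gain, whether by pointing or by key press, is the behavior the study reports.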

613
Cisek P. Integrated neural processes for defining potential actions and deciding between them: a computational model. J Neurosci 2006; 26:9761-70. [PMID: 16988047] [PMCID: PMC6674435] [DOI: 10.1523/jneurosci.5605-05.2006]
Abstract
To successfully accomplish a behavioral goal such as reaching for an object, an animal must solve two related problems: to decide which object to reach and to plan the specific parameters of the movement. Traditionally, these two problems have been viewed as separate, and theories of decision making and motor planning have been developed primarily independently. However, neural data suggests that these processes involve the same brain regions and are performed in an integrated manner. Here, a computational model is described that addresses both the question of how different potential actions are specified and how the brain decides between them. In the model, multiple potential actions are simultaneously represented as continuous regions of activity within populations of cells in frontoparietal cortex. These representations engage in a competition for overt execution that is biased by modulatory influences from prefrontal cortex. The model neural populations exhibit activity patterns that correlate with both the spatial metrics of potential actions and their associated decision variables, in a manner similar to activities in parietal, prefrontal, and premotor cortex. The model therefore suggests an explanation for neural data that have been hard to account for in terms of serial theories that propose that decision making occurs before action planning. In addition to simulating the activity of individual neurons during decision tasks, the model also reproduces key aspects of the spatial and temporal statistics of human choices and makes a number of testable predictions.
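The core dynamic, potential actions competing through mutual inhibition with a top-down bias deciding the winner, can be caricatured with two rate units. This is a two-unit toy with invented parameters, not the paper's population model:

```python
def compete(inputs=(1.0, 1.0), bias=(0.0, 0.2), steps=400, dt=0.1,
            self_exc=0.6, inhibition=0.8):
    """Two units representing potential actions: self-excitation plus
    mutual inhibition yields winner-take-all; the prefrontal-like bias
    determines which potential action wins."""
    a = [0.5, 0.5]
    for _ in range(steps):
        net0 = inputs[0] + bias[0] + self_exc * a[0] - inhibition * a[1]
        net1 = inputs[1] + bias[1] + self_exc * a[1] - inhibition * a[0]
        a[0] += dt * (-a[0] + max(0.0, net0))
        a[1] += dt * (-a[1] + max(0.0, net1))
    return a

a = compete()   # the biased unit wins; the other is suppressed toward zero
```

Both potential actions are represented simultaneously early in the run, which is the feature serial decide-then-plan theories have trouble with.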
Affiliation(s)
- Paul Cisek
- Department of Physiology, University of Montréal, Montréal, Québec, Canada H3C 3J7.

614
Sakai Y, Okamoto H, Fukai T. Computational algorithms and neuronal network models underlying decision processes. Neural Netw 2006; 19:1091-105. [PMID: 16942856] [DOI: 10.1016/j.neunet.2006.05.034]
Abstract
Animals and humans often encounter situations in which they must choose the behavioral responses to be made in the near or distant future. Such decisions are made through continuous and bidirectional interactions between the environment surrounding the brain and the brain's internal state or dynamical processes. Decision making therefore provides a unique field of research for studying information processing by the brain, a biological system open to information exchange with the external world. To make a decision, the brain must analyze externally given information, past experiences in similar situations, possible behavioral responses, and the predicted outcomes of those responses. In this article, we review recent experimental and theoretical studies of the neuronal substrates and computational algorithms underlying decision processes.
Affiliation(s)
- Yutaka Sakai
- Department of Intelligent Information Systems, Tamagawa University, Tamagawa Gakuen 6-1-1, Machida, Tokyo, Japan.

615
Loewenstein Y, Seung HS. Operant matching is a generic outcome of synaptic plasticity based on the covariance between reward and neural activity. Proc Natl Acad Sci U S A 2006; 103:15224-9. [PMID: 17008410] [PMCID: PMC1622804] [DOI: 10.1073/pnas.0505220103]
Abstract
The probability of choosing an alternative in a long sequence of repeated choices is proportional to the total reward derived from that alternative, a phenomenon known as Herrnstein's matching law. This behavior is remarkably conserved across species and experimental conditions, but its underlying neural mechanisms still are unknown. Here, we propose a neural explanation of this empirical law of behavior. We hypothesize that there are forms of synaptic plasticity driven by the covariance between reward and neural activity and prove mathematically that matching is a generic outcome of such plasticity. Two hypothetical types of synaptic plasticity, embedded in decision-making neural network models, are shown to yield matching behavior in numerical simulations, in accord with our general theorem. We show how this class of models can be tested experimentally by making reward not only contingent on the choices of the subject but also directly contingent on fluctuations in neural activity. Maximization is shown to be a generic outcome of synaptic plasticity driven by the sum of the covariances between reward and all past neural activities.
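The covariance principle can be probed numerically: drive a single decision weight by the product of reward fluctuations and action fluctuations on a concurrent variable-interval schedule, and behavior should settle where income per choice is equalized, which is matching. The schedule, policy parameterization, and learning rates below are illustrative assumptions, far simpler than the paper's network models:

```python
import math, random

rng = random.Random(4)
bait_p = [0.25, 0.05]          # per-trial baiting probabilities (VI schedule:
baited = [False, False]        # a baited reward is held until collected)
w = 0.0                        # log-odds of choosing option 0
r_bar, a_bar = 0.0, 0.5        # running means of reward and action
eta, tau = 0.1, 0.005

counts = [0, 0]
incomes = [0.0, 0.0]
for t in range(30000):
    for i in (0, 1):
        baited[i] = baited[i] or rng.random() < bait_p[i]
    p0 = 1.0 / (1.0 + math.exp(-w))
    a = 0 if rng.random() < p0 else 1
    r = 1.0 if baited[a] else 0.0
    baited[a] = False
    act = 1.0 if a == 0 else 0.0
    w += eta * (r - r_bar) * (act - a_bar)   # covariance-driven plasticity
    r_bar += tau * (r - r_bar)
    a_bar += tau * (act - a_bar)
    if t >= 10000:                           # measure after a burn-in period
        counts[a] += 1
        incomes[a] += r

choice_frac = counts[0] / sum(counts)        # fraction of choices to option 0
income_frac = incomes[0] / sum(incomes)      # fraction of income from option 0
```

At the rule's fixed point the covariance between reward and action vanishes, which happens exactly when return per choice is equal across options, i.e., when choice fractions match income fractions.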
Affiliation(s)
- Yonatan Loewenstein
- Howard Hughes Medical Institute and the Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.

616
Simen P, Cohen JD, Holmes P. Rapid decision threshold modulation by reward rate in a neural network. Neural Netw 2006; 19:1013-26. [PMID: 16987636] [PMCID: PMC1808344] [DOI: 10.1016/j.neunet.2006.05.038]
Abstract
Optimal performance in two-alternative, free-response decision-making tasks can be achieved by the drift-diffusion model of decision making (which can be implemented in a neural network), as long as the threshold parameter of that model can be adapted to different task conditions. Evidence exists that people seek to maximize reward in such tasks by modulating response thresholds. However, few models have been proposed for threshold adaptation, and none have been implemented using neurally plausible mechanisms. Here we propose a neural network that adapts thresholds in order to maximize reward rate. The model makes predictions regarding optimal performance and provides a benchmark against which actual performance can be compared, as well as testable predictions about the way in which reward rate may be encoded by neural mechanisms.
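The quantity being maximized can be written down directly from the standard closed-form drift-diffusion expressions for error rate and mean decision time (unbiased starting point): reward rate is correct responses per unit of total time. Task constants below are invented; the point is that reward rate peaks at an intermediate threshold:

```python
import math

def error_rate(a, v, s=1.0):
    """Standard DDM error rate for threshold a, drift v, noise s."""
    return 1.0 / (1.0 + math.exp(2.0 * v * a / s ** 2))

def mean_dt(a, v, s=1.0):
    """Standard DDM mean decision time."""
    return (a / v) * math.tanh(v * a / s ** 2)

def reward_rate(a, v=1.0, s=1.0, t0=0.3, iti=1.0):
    """Correct responses per unit time, counting non-decision time (t0)
    and the inter-trial interval."""
    return (1.0 - error_rate(a, v, s)) / (mean_dt(a, v, s) + t0 + iti)

# Scan thresholds: too low sacrifices accuracy, too high wastes time.
grid = [0.05 * k for k in range(1, 80)]
best = max(grid, key=reward_rate)
```

Adapting the threshold toward this interior optimum is the job the proposed network performs.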
Affiliation(s)
- Patrick Simen
- Center for the Study of Brain, Mind and Behavior, Princeton University, Princeton, NJ 08544, USA.

617
Tanaka SC, Samejima K, Okada G, Ueda K, Okamoto Y, Yamawaki S, Doya K. Brain mechanism of reward prediction under predictable and unpredictable environmental dynamics. Neural Netw 2006; 19:1233-41. [PMID: 16979871] [DOI: 10.1016/j.neunet.2006.05.039]
Abstract
In learning goal-directed behaviors, an agent has to consider not only the reward given at each state but also the consequences of dynamic state transitions associated with action selection. To understand brain mechanisms for action learning under predictable and unpredictable environmental dynamics, we measured brain activities by functional magnetic resonance imaging (fMRI) during a Markov decision task with predictable and unpredictable state transitions. Whereas the striatum and orbitofrontal cortex (OFC) were significantly activated both under predictable and unpredictable state transition rules, the dorsolateral prefrontal cortex (DLPFC) was more strongly activated under predictable than under unpredictable state transition rules. We then modelled subjects' choice behaviours using a reinforcement learning model and a Bayesian estimation framework and found that the subjects took larger temporal discount factors under predictable state transition rules. Model-based analysis of fMRI data revealed different engagement of striatum in reward prediction under different state transition dynamics. The ventral striatum was involved in reward prediction under both unpredictable and predictable state transition rules, although the dorsal striatum was dominantly involved in reward prediction under predictable rules. These results suggest different learning systems in the cortico-striatum loops depending on the dynamics of the environment: the OFC-ventral striatum loop is involved in action learning based on the present state, while the DLPFC-dorsal striatum loop is involved in action learning based on predictable future states.
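The role of the temporal discount factor in this kind of Markov decision task is easy to demonstrate with tabular Q-learning on a toy two-state problem (states, rewards, and gamma values below are invented, not the experimental task):

```python
import random

def q_learn(gamma, episodes=3000, alpha=0.2, eps=0.2, seed=0):
    """Tabular Q-learning on a toy task: at state 0, action 1 takes a small
    immediate reward (1) and ends the trial; action 0 moves to state 1,
    where a large reward (10) follows one step later."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in (0, 1) for a in (0, 1)}
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            if rng.random() < eps:
                a = rng.randrange(2)
            else:
                a = 0 if q[(s, 0)] >= q[(s, 1)] else 1
            if s == 0 and a == 1:
                r, s2, done = 1.0, None, True     # small immediate reward
            elif s == 0 and a == 0:
                r, s2 = 0.0, 1                    # wait one step
            else:
                r, s2, done = 10.0, None, True    # large delayed reward
            target = r if done else r + gamma * max(q[(s2, 0)], q[(s2, 1)])
            q[(s, a)] += alpha * (target - q[(s, a)])
            s = s2
    return q

q_patient = q_learn(gamma=0.95)   # values the predictable future state
q_myopic = q_learn(gamma=0.05)    # steep discounting: grabs the immediate 1
```

A larger discount factor lets action values reflect reward reachable through predictable state transitions, which parallels the finding that subjects used larger discount factors under predictable dynamics.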
Affiliation(s)
- Saori C Tanaka
- Department of Bioinformatics and Genomics, Nara Institute of Science and Technology, Japan.

618
Abstract
To make a decision, a system must assign value to each of its available choices. In the human brain, one approach to studying valuation has used rewarding stimuli to map out brain responses by varying the dimension or importance of the rewards. However, theoretical models have taught us that value computations are complex, and so reward probes alone can give only partial information about neural responses related to valuation. In recent years, computationally principled models of value learning have been used in conjunction with noninvasive neuroimaging to tease out neural valuation responses related to reward-learning and decision-making. We restrict our review to the role of these models in a new generation of experiments that seeks to build on a now-large body of diverse reward-related brain responses. We show that the models and the measurements based on them point the way forward in two important directions: the valuation of time and the valuation of fictive experience.
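Two of the signals this line of work manipulates can be written down for a single trial: an experiential temporal-difference prediction error and a fictive (regret-like) error based on the best outcome that could have been obtained. The numbers below are toy values for illustration:

```python
# Single-trial value-learning signals of the kind discussed above.

def td_error(reward, v_next, v_current, gamma=0.95):
    """Experiential temporal-difference prediction error."""
    return reward + gamma * v_next - v_current

def fictive_error(best_possible, obtained):
    """Fictive (regret-like) signal: what the best unchosen alternative
    would have paid minus what the chosen action actually paid."""
    return best_possible - obtained

experiential = td_error(reward=1.0, v_next=0.0, v_current=0.5)
fictive = fictive_error(best_possible=2.0, obtained=0.5)
```

Model-based fMRI regresses brain activity against trial-by-trial traces of such quantities, which is how the "partial information" problem of plain reward probes is overcome.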
Affiliation(s)
- P Read Montague
- Department of Neuroscience, Baylor College of Medicine, Houston, Texas 77030, USA

619
Abstract
We studied the choice behavior of 2 monkeys in a discrete-trial task with reinforcement contingencies similar to those Herrnstein (1961) used when he described the matching law. In each session, the monkeys experienced blocks of discrete trials at different relative-reinforcer frequencies or magnitudes with unsignalled transitions between the blocks. Steady-state data following adjustment to each transition were well characterized by the generalized matching law; response ratios undermatched reinforcer frequency ratios but matched reinforcer magnitude ratios. We modelled response-by-response behavior with linear models that used past reinforcers as well as past choices to predict the monkeys' choices on each trial. We found that more recently obtained reinforcers more strongly influenced choice behavior. Perhaps surprisingly, we also found that the monkeys' actions were influenced by the pattern of their own past choices. It was necessary to incorporate both past reinforcers and past choices in order to accurately capture steady-state behavior as well as the fluctuations during block transitions and the response-by-response patterns of behavior. Our results suggest that simple reinforcement learning models must account for the effects of past choices to accurately characterize behavior in this task, and that models with these properties provide a conceptual tool for studying how both past reinforcers and past choices are integrated by the neural systems that generate behavior.
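The linear choice model described here has a simple shape: the log odds of choosing one side is a weighted sum of recent reinforcers and recent choices, with recent events weighted more heavily. The weights below are invented for illustration, not the fitted values:

```python
import math

def choice_logodds(past_rewards, past_choices,
                   reward_w=(1.0, 0.6, 0.3), choice_w=(-0.4, -0.2, -0.1)):
    """past_rewards[i] / past_choices[i]: +1 = left, -1 = right, 0 = none,
    for i trials back (most recent first). Negative choice weights capture
    the observed influence of the animal's own recent choices."""
    z = sum(w * r for w, r in zip(reward_w, past_rewards))
    z += sum(w * c for w, c in zip(choice_w, past_choices))
    return z

def p_left(z):
    return 1.0 / (1.0 + math.exp(-z))

# A recent left reward raises p(left), even after three recent left choices.
p = p_left(choice_logodds([1, 0, 0], [1, 1, 1]))
```

Omitting the choice terms is exactly the simplification the authors argue fails to capture the block-transition and trial-by-trial patterns.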
Affiliation(s)
- Brian Lau
- Center for Neural Science, New York University, New York, New York 10003, USA.
|
620
|
Corrado GS, Sugrue LP, Seung HS, Newsome WT. Linear-Nonlinear-Poisson models of primate choice dynamics. J Exp Anal Behav 2006; 84:581-617. [PMID: 16596981 PMCID: PMC1389782 DOI: 10.1901/jeab.2005.23-05] [Citation(s) in RCA: 126] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]
Abstract
The equilibrium phenomenon of matching behavior traditionally has been studied in stationary environments. Here we attempt to uncover the local mechanism of choice that gives rise to matching by studying behavior in a highly dynamic foraging environment. In our experiments, 2 rhesus monkeys (Macaca mulatta) foraged for juice rewards by making eye movements to one of two colored icons presented on a computer monitor, each rewarded on dynamic variable-interval schedules. Using a generalization of Wiener kernel analysis, we recover a compact mechanistic description of the impact of past reward on future choice in the form of a Linear-Nonlinear-Poisson model. We validate this model through rigorous predictive and generative testing. Compared to our earlier work with this same data set, this model proves to be a better description of choice behavior and is more tightly correlated with putative neural value signals. Refinements over previous models include hyperbolic (as opposed to exponential) temporal discounting of past rewards, and differential (as opposed to fractional) comparisons of option value. Through numerical simulation we find that within this class of strategies, the model parameters employed by animals are very close to those that maximize reward harvesting efficiency.
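The linear and nonlinear stages of such a model can be sketched as follows: past rewards on each option are filtered through a hyperbolic kernel, and the *difference* of the filtered incomes is squashed through a sigmoid to give an instantaneous choice probability. This is a hedged sketch, not the authors' fitted model; the kernel length, time constant, and slope are illustrative, and the Poisson emission stage is omitted.

```python
import numpy as np

def hyperbolic_kernel(length, tau):
    """Linear stage: hyperbolic weighting of past rewards (recent rewards
    count more, with a long tail); 'tau' is an illustrative time constant."""
    t = np.arange(1, length + 1)
    k = 1.0 / (1.0 + t / tau)
    return k / k.sum()

def lnp_choice_prob(rewards_a, rewards_b, kernel, slope=5.0):
    """Nonlinear stage: squash the *difference* in filtered reward income
    through a sigmoid to get the probability of choosing option A."""
    n = len(kernel)
    va = kernel @ rewards_a[-n:][::-1]   # filtered local income, option A
    vb = kernel @ rewards_b[-n:][::-1]   # filtered local income, option B
    dv = va - vb                         # differential (not fractional) value
    return 1.0 / (1.0 + np.exp(-slope * dv))

kernel = hyperbolic_kernel(30, tau=5.0)
# Toy history: A paid off on 6 of the last 10 trials, B on 2.
ra = np.zeros(30); ra[-10:] = [1, 0, 1, 1, 0, 1, 0, 1, 0, 1]
rb = np.zeros(30); rb[-10:] = [0, 1, 0, 0, 0, 0, 1, 0, 0, 0]
p_a = lnp_choice_prob(ra, rb, kernel)
print(f"P(choose A) = {p_a:.2f}")
```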
Affiliation(s)
- Greg S Corrado
- Howard Hughes Medical Institute, Stanford University School of Medicine, California 94309, USA.
|
621
|
Winger G, Woods JH, Galuska CM, Wade-Galuska T. Behavioral perspectives on the neuroscience of drug addiction. J Exp Anal Behav 2006; 84:667-81. [PMID: 16596985 PMCID: PMC1389786 DOI: 10.1901/jeab.2005.101-04] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]
Abstract
Neuroscientific approaches to drug addiction traditionally have been based on the premise that addiction is a process that results from brain changes that in turn result from chronic administration of drugs of abuse. An alternative approach views drug addiction as a behavioral disorder in which drugs function as preeminent reinforcers. Although there is a fundamental discrepancy between these two approaches, the emerging neuroscience of reinforcement and choice behavior eventually may shed light on the brain mechanisms involved in excessive drug use. Behavioral scientists could assist in this understanding by devoting more attention to the assessment of differences in the reinforcing strength of drugs and by attempting to develop and validate behavioral models of addiction.
Affiliation(s)
- Gail Winger
- Department of Pharmacology, University of Michigan, Ann Arbor 48109-0632, USA.
|
622
|
Hampton AN, Bossaerts P, O’Doherty JP. The role of the ventromedial prefrontal cortex in abstract state-based inference during decision making in humans. J Neurosci 2006; 26:8360-7. [PMID: 16899731 PMCID: PMC6673813 DOI: 10.1523/jneurosci.1010-06.2006] [Citation(s) in RCA: 350] [Impact Index Per Article: 19.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022] Open
Abstract
Many real-life decision-making problems incorporate higher-order structure, involving interdependencies between different stimuli, actions, and subsequent rewards. It is not known whether brain regions implicated in decision making, such as the ventromedial prefrontal cortex (vmPFC), use a stored model of the task structure to guide choice (model-based decision making) or merely learn action or state values without assuming higher-order structure as in standard reinforcement learning. To discriminate between these possibilities, we scanned human subjects with functional magnetic resonance imaging while they performed a simple decision-making task with higher-order structure, probabilistic reversal learning. We found that neural activity in a key decision-making region, the vmPFC, was more consistent with a computational model that exploits higher-order structure than with simple reinforcement learning. These results suggest that brain regions, such as the vmPFC, use an abstract model of task structure to guide behavioral choice, computations that may underlie the human capacity for complex social interactions and abstract strategizing.
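The model-based alternative described above can be sketched as hidden-state inference for probabilistic reversal learning: the agent tracks the posterior probability that option A is currently the correct one, given noisy outcomes and a fixed hazard of covert reversal. This is a generic Bayesian sketch under assumed parameters, not the computational model fitted in the paper.

```python
import numpy as np

def reversal_posterior(outcomes, p_correct=0.7, p_reverse=0.1):
    """Track P(option A is currently correct) given binary outcomes of
    choosing A, allowing for a covert reversal before each next trial.
    p_correct and p_reverse are illustrative task parameters."""
    p_a = 0.5
    trace = []
    for rewarded in outcomes:
        # Bayes update from the outcome of choosing A
        like_a = p_correct if rewarded else 1 - p_correct
        like_b = (1 - p_correct) if rewarded else p_correct
        p_a = like_a * p_a / (like_a * p_a + like_b * (1 - p_a))
        # propagate belief through the reversal hazard
        p_a = p_a * (1 - p_reverse) + (1 - p_a) * p_reverse
        trace.append(p_a)
    return np.array(trace)

# A run of rewards followed by repeated omissions (suggesting a reversal).
trace = reversal_posterior([1, 1, 1, 1, 0, 0, 0])
print(np.round(trace, 2))
```

A model-free learner would need many omissions to unlearn the old value, whereas this inference flips its belief quickly once the omissions begin, which is the behavioral signature the study exploited.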
|
623
|
Pesaran B, Nelson MJ, Andersen RA. Dorsal premotor neurons encode the relative position of the hand, eye, and goal during reach planning. Neuron 2006; 51:125-34. [PMID: 16815337 PMCID: PMC3066049 DOI: 10.1016/j.neuron.2006.05.025] [Citation(s) in RCA: 261] [Impact Index Per Article: 14.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/25/2005] [Revised: 03/31/2006] [Accepted: 05/26/2006] [Indexed: 10/24/2022]
Abstract
When reaching to grasp an object, we often move our arm and orient our gaze together. How are these movements coordinated? To investigate this question, we studied neuronal activity in the dorsal premotor area (PMd) and the medial intraparietal area (area MIP) of two monkeys while systematically varying the starting position of the hand and eye during reaching. PMd neurons encoded the relative position of the target, hand, and eye. MIP neurons encoded target location with respect to the eye only. These results indicate that whereas MIP encodes target locations in an eye-centered reference frame, PMd uses a relative position code that specifies the differences in locations between all three variables. Such a relative position code may play an important role in coordinating hand and eye movements by computing their relative position.
Affiliation(s)
- Bijan Pesaran
- Division of Biology, California Institute of Technology, Pasadena, California 91125, USA.
|
624
|
Sohn JW, Lee D. Effects of reward expectancy on sequential eye movements in monkeys. Neural Netw 2006; 19:1181-91. [PMID: 16935467 DOI: 10.1016/j.neunet.2006.04.005] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/01/2005] [Accepted: 04/07/2006] [Indexed: 11/19/2022]
Abstract
Desirability of an action, often referred to as utility or value, is determined by various factors, such as the probability and timing of expected reward. We investigated how the performance of monkeys in an oculomotor serial reaction time task is influenced by multiple motivational factors. The animals produced a series of visually guided eye movements while the sequence of target locations and the location of the rewarded target were systematically manipulated. The results show that error rates as well as saccade latencies were consistently influenced by the number of remaining movements necessary to obtain a reward. In addition, when the animal produced multiple saccades before fixating a given target, the first saccade tended to be directed towards the rewarded location, suggesting that saccades to the rewarded location and to the visual target might be programmed concurrently. These results show that monkeys can utilize information about the required sequence of movements to update their subjective values.
Affiliation(s)
- Jeong-woo Sohn
- Department of Brain and Cognitive Sciences, Center for Visual Science, University of Rochester, Rochester, NY 14627, USA.
|
625
|
Morris G, Nevet A, Arkadir D, Vaadia E, Bergman H. Midbrain dopamine neurons encode decisions for future action. Nat Neurosci 2006; 9:1057-63. [PMID: 16862149 DOI: 10.1038/nn1743] [Citation(s) in RCA: 278] [Impact Index Per Article: 15.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/27/2006] [Accepted: 06/28/2006] [Indexed: 11/09/2022]
Abstract
Current models of the basal ganglia and dopamine neurons emphasize their role in reinforcement learning. However, the role of dopamine neurons in decision making is still unclear. We recorded from dopamine neurons in monkeys engaged in two types of trial: reference trials in an instructed-choice task and decision trials in a two-armed bandit decision task. We show that the activity of dopamine neurons in the decision setting is modulated according to the value of the upcoming action. Moreover, analysis of the probability matching strategy in the decision trials revealed that the dopamine population activity and not the reward during reference trials determines choice behavior. Because dopamine neurons do not have spatial or motor properties, we conclude that immediate decisions are likely to be generated elsewhere and conveyed to the dopamine neurons, which play a role in shaping long-term decision policy through dynamic modulation of the efficacy of basal ganglia synapses.
Affiliation(s)
- Genela Morris
- Interdisciplinary Center for Neural Computation (ICNC), Hebrew University, Jerusalem, Israel.
|
626
|
Daw ND, O'Doherty JP, Dayan P, Seymour B, Dolan RJ. Cortical substrates for exploratory decisions in humans. Nature 2006; 441:876-9. [PMID: 16778890 PMCID: PMC2635947 DOI: 10.1038/nature04766] [Citation(s) in RCA: 1241] [Impact Index Per Article: 68.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/07/2006] [Accepted: 03/30/2006] [Indexed: 01/17/2023]
Abstract
Decision making in an uncertain environment poses a conflict between the opposing demands of gathering and exploiting information. In a classic illustration of this 'exploration-exploitation' dilemma, a gambler choosing between multiple slot machines balances the desire to select what seems, on the basis of accumulated experience, the richest option, against the desire to choose a less familiar option that might turn out more advantageous (and thereby provide information for improving future decisions). Far from representing idle curiosity, such exploration is often critical for organisms to discover how best to harvest resources such as food and water. In appetitive choice, substantial experimental evidence, underpinned by computational reinforcement learning (RL) theory, indicates that a dopaminergic, striatal and medial prefrontal network mediates learning to exploit. In contrast, although exploration has been well studied from both theoretical and ethological perspectives, its neural substrates are much less clear. Here we show, in a gambling task, that human subjects' choices can be characterized by a computationally well-regarded strategy for addressing the explore/exploit dilemma. Furthermore, using this characterization to classify decisions as exploratory or exploitative, we employ functional magnetic resonance imaging to show that the frontopolar cortex and intraparietal sulcus are preferentially active during exploratory decisions. In contrast, regions of striatum and ventromedial prefrontal cortex exhibit activity characteristic of an involvement in value-based exploitative decision making. The results suggest a model of action selection under uncertainty that involves switching between exploratory and exploitative behavioural modes, and provide a computationally precise characterization of the contribution of key decision-related brain systems to each of these functions.
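The class of strategies at issue can be illustrated with a delta-rule learner whose choices are drawn from a softmax over current value estimates, so that it mostly exploits the apparently richest arm while still sampling the others. This is a generic sketch of that strategy class under illustrative parameters, not the model fitted to the subjects' data.

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(q, beta):
    """Higher beta -> more exploitation; lower beta -> more exploration."""
    z = np.exp(beta * (q - q.max()))   # subtract max for numerical stability
    return z / z.sum()

def run_bandit(payoffs, n_trials=1000, alpha=0.1, beta=3.0):
    """Delta-rule value learning with softmax choice on a multi-armed
    bandit; alpha and beta are illustrative."""
    q = np.zeros(len(payoffs))
    picks = np.zeros(len(payoffs), dtype=int)
    for _ in range(n_trials):
        a = rng.choice(len(q), p=softmax(q, beta))
        r = float(rng.random() < payoffs[a])   # Bernoulli reward
        q[a] += alpha * (r - q[a])             # prediction-error update
        picks[a] += 1
    return q, picks

q, picks = run_bandit(payoffs=[0.8, 0.4, 0.3, 0.2])
print("choice counts per arm:", picks)
```

Classifying each simulated trial as exploitative (chose the current argmax of q) or exploratory (chose any other arm) mirrors the trial labeling used to contrast the two brain networks.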
Affiliation(s)
- Nathaniel D Daw
- Gatsby Computational Neuroscience Unit, University College London (UCL), Alexandra House, 17 Queen Square, London WC1N 3AR, UK.
|
628
|
Kennerley SW, Walton ME, Behrens TEJ, Buckley MJ, Rushworth MFS. Optimal decision making and the anterior cingulate cortex. Nat Neurosci 2006; 9:940-7. [PMID: 16783368 DOI: 10.1038/nn1724] [Citation(s) in RCA: 647] [Impact Index Per Article: 35.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/07/2006] [Accepted: 05/23/2006] [Indexed: 11/08/2022]
Abstract
Learning the value of options in an uncertain environment is central to optimal decision making. The anterior cingulate cortex (ACC) has been implicated in using reinforcement information to control behavior. Here we demonstrate that the ACC's critical role in reinforcement-guided behavior is neither in detecting nor in correcting errors, but in guiding voluntary choices based on the history of actions and outcomes. ACC lesions did not impair the performance of monkeys (Macaca mulatta) immediately after errors, but made them unable to sustain rewarded responses in a reinforcement-guided choice task and to integrate risk and payoff in a dynamic foraging task. These data suggest that the ACC is essential for learning the value of actions.
Affiliation(s)
- Steven W Kennerley
- Department of Experimental Psychology, South Parks Road, Oxford OX1 3UD, UK.
|
629
|
Abstract
Expected reward impacts behavior and neuronal activity in brain areas involved in sensorimotor processes. However, where and how reward signals affect sensorimotor signals is unclear. Here, we show evidence that reward-dependent modulation of behavior depends on normal dopamine transmission in the striatum. Monkeys performed a visually guided saccade task in which expected reward gain was different depending on the position of the target. Saccadic reaction times were reliably shorter on large-reward trials than on small-reward trials. When position-reward contingency was switched, the reaction time difference changed rapidly. Injecting dopamine D1 antagonist into the caudate significantly attenuated the reward-dependent saccadic reaction time changes. Conversely, injecting D2 antagonist into the same region enhanced the reward-dependent changes. These results suggest that reward-dependent changes in saccadic eye movements depend partly on dopaminergic modulation of neuronal activity in the caudate nucleus.
Affiliation(s)
- Kae Nakamura
- Laboratory of Sensorimotor Research, National Eye Institute, National Institutes of Health, Bethesda, Maryland 20892-4435, USA.
|
630
|
Bendiksby MS, Platt ML. Neural correlates of reward and attention in macaque area LIP. Neuropsychologia 2006; 44:2411-20. [PMID: 16757005 DOI: 10.1016/j.neuropsychologia.2006.04.011] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/22/2005] [Revised: 04/19/2006] [Accepted: 04/22/2006] [Indexed: 10/24/2022]
Abstract
Saccade reaction times decrease and the frequency of target choices increases with the size of rewards delivered for orienting to a particular visual target. Similarly, increasing rewards for orienting to a visual target enhances neuronal responses in the macaque lateral intraparietal area (LIP), as well as other brain areas. These observations raise several questions. First, are reward-related modulations in neuronal activity in LIP, as well as other areas, spatially specific or more global in nature? Second, to what extent does reward modulation of neuronal activity in area LIP reflect changes in visual rather than motor processing? And third, to what degree are reward-related modulations in LIP activity independent of performance-related modulations thought to reflect changes in attention? Here we show that increasing the size of fluid rewards in blocks reduced saccade reaction times and improved performance in monkeys performing a peripherally-cued saccade task. LIP neurons responded to visual cues spatially segregated from the saccade target, and for many neurons visual responses were systematically modulated by expected reward size. Neuronal responses also were positively correlated with reaction times independent of reward size, consistent with re-orienting of attention to the saccade target. These observations suggest that motivation and attention independently contribute to the strength of sustained visual responses in LIP. Our data thus implicate LIP in the integration of the sensory, motor, and motivational variables that guide orienting.
Affiliation(s)
- Michael S Bendiksby
- Department of Neurobiology, Duke University Medical Center, Durham, NC 27710, USA
|
631
|
Abstract
Mangabey monkeys have been shown to rely on memory of recent trends in temperature and solar radiation to decide whether to feed on a particular patch of fruit. These observations reveal a rich mental representation of the physical environment in monkeys and suggest foraging may have been an important selective pressure in primate cognitive evolution.
Affiliation(s)
- Michael Platt
- Department of Neurobiology, Center for Neuroeconomic Studies, Bryan Research Building, Research Drive, Box 3209, Durham, North Carolina 27710, USA.
|
632
|
Salecker I, Häusser M, de Bono M. On the axonal road to circuit function and behaviour: Workshop on The Assembly and Function of Neuronal Circuits. EMBO Rep 2006; 7:585-9. [PMID: 16729018 PMCID: PMC1479602 DOI: 10.1038/sj.embor.7400713] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/22/2006] [Accepted: 04/21/2006] [Indexed: 11/08/2022] Open
Affiliation(s)
- Iris Salecker
- Division of Molecular Neurobiology, National Institute for Medical Research, The Ridgeway, London NW7 1AA, UK.
|
633
|
Hanks TD, Ditterich J, Shadlen MN. Microstimulation of macaque area LIP affects decision-making in a motion discrimination task. Nat Neurosci 2006; 9:682-9. [PMID: 16604069 PMCID: PMC2770004 DOI: 10.1038/nn1683] [Citation(s) in RCA: 244] [Impact Index Per Article: 13.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/13/2005] [Accepted: 03/14/2006] [Indexed: 11/08/2022]
Abstract
A central goal of cognitive neuroscience is to elucidate the neural mechanisms underlying decision-making. Recent physiological studies suggest that neurons in association areas may be involved in this process. To test this, we measured the effects of electrical microstimulation in the lateral intraparietal area (LIP) while monkeys performed a reaction-time motion discrimination task with a saccadic response. In each experiment, we identified a cluster of LIP cells with overlapping response fields (RFs) and sustained activity during memory-guided saccades. Microstimulation of this cluster caused an increase in the proportion of choices toward the RF of the stimulated neurons. Choices toward the stimulated RF were faster with microstimulation, while choices in the opposite direction were slower. Microstimulation never directly evoked saccades, nor did it change reaction times in a simple saccade task. These results demonstrate that the discharge of LIP neurons is causally related to decision formation in the discrimination task.
Affiliation(s)
- Timothy D. Hanks
- Howard Hughes Medical Institute, National Primate Research Center, and Department of Physiology & Biophysics, University of Washington, Seattle, Washington 98195, USA
- Jochen Ditterich
- Howard Hughes Medical Institute, National Primate Research Center, and Department of Physiology & Biophysics, University of Washington, Seattle, Washington 98195, USA
- Center for Neuroscience, University of California, Davis, California 95616, USA
- Michael N. Shadlen
- Howard Hughes Medical Institute, National Primate Research Center, and Department of Physiology & Biophysics, University of Washington, Seattle, Washington 98195, USA
|
634
|
Soltani A, Wang XJ. A biophysically based neural model of matching law behavior: melioration by stochastic synapses. J Neurosci 2006; 26:3731-44. [PMID: 16597727 PMCID: PMC6674121 DOI: 10.1523/jneurosci.5159-05.2006] [Citation(s) in RCA: 88] [Impact Index Per Article: 4.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022] Open
Abstract
In experiments designed to uncover the neural basis of adaptive decision making in a foraging environment, neuroscientists have reported single-cell activities in the lateral intraparietal cortex (LIP) that are correlated with choice options and their subjective values. To investigate the underlying synaptic mechanism, we considered a spiking neuron model of decision making endowed with synaptic plasticity that follows a reward-dependent stochastic Hebbian learning rule. This general model is tested in a matching task in which rewards on two targets are scheduled randomly with different rates. Our main results are threefold. First, we show that plastic synapses provide a natural way to integrate past rewards and estimate the local (in time) "return" of a choice. Second, our model reproduces the matching behavior (i.e., the proportional allocation of choices matches the relative reinforcement obtained on those choices, which is achieved through melioration in individual trials). Our model also explains the observed "undermatching" phenomenon and points to biophysical constraints (such as finite learning rate and stochastic neuronal firing) that set the limits to matching behavior. Third, although our decision model is an attractor network exhibiting winner-take-all competition, it captures graded neural spiking activities observed in LIP, when the latter were sorted according to the choices and the difference in the returns for the two targets. These results suggest that neurons in LIP are involved in selecting the oculomotor responses, whereas rewards are integrated and stored elsewhere, possibly by plastic synapses and in the form of the return rather than income of choice options.
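A reduced (mean-field) sketch of reward-gated stochastic plasticity: the fractions of potentiated synapses onto two action pools serve as local estimates of return, choices are read out through a soft competition between them, and synapses are potentiated on reward and depressed on omission. This is a toy reduction under assumed learning rates and constant reward probabilities, not the authors' spiking-network model.

```python
import numpy as np

rng = np.random.default_rng(2)

def run_matching(rate_a, rate_b, n_trials=20000, q_pot=0.05, q_dep=0.05, beta=10.0):
    """c_a, c_b: fractions of potentiated synapses onto two action pools.
    Reward potentiates the chosen pool's synapses; omission depresses them
    (the stochastic rule is applied here in expectation)."""
    c_a = c_b = 0.5
    n_a = rew_a = rew_b = 0
    for _ in range(n_trials):
        p_a = 1.0 / (1.0 + np.exp(-beta * (c_a - c_b)))  # soft competition
        if rng.random() < p_a:
            n_a += 1
            rewarded = rng.random() < rate_a
            c_a += q_pot * (1 - c_a) if rewarded else -q_dep * c_a
            rew_a += rewarded
        else:
            rewarded = rng.random() < rate_b
            c_b += q_pot * (1 - c_b) if rewarded else -q_dep * c_b
            rew_b += rewarded
    return n_a / n_trials, rew_a / max(rew_a + rew_b, 1)

choice_frac, reward_frac = run_matching(rate_a=0.6, rate_b=0.2)
print(f"choice fraction A: {choice_frac:.2f}, reward fraction A: {reward_frac:.2f}")
```

With the constant reward probabilities used here, trial-by-trial melioration pushes choice strongly toward the richer option; the graded matching (and undermatching) discussed in the abstract emerges in the baited variable-interval environment of the original task.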
|
635
|
Daw ND, Doya K. The computational neurobiology of learning and reward. Curr Opin Neurobiol 2006; 16:199-204. [PMID: 16563737 DOI: 10.1016/j.conb.2006.03.006] [Citation(s) in RCA: 299] [Impact Index Per Article: 16.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/03/2006] [Accepted: 03/10/2006] [Indexed: 11/22/2022]
Abstract
Following the suggestion that midbrain dopaminergic neurons encode a signal, known as a 'reward prediction error', used by artificial intelligence algorithms for learning to choose advantageous actions, the study of the neural substrates for reward-based learning has been strongly influenced by computational theories. In recent work, such theories have been increasingly integrated into experimental design and analysis. Such hybrid approaches have offered detailed new insights into the function of a number of brain areas, especially the cortex and basal ganglia. In part this is because these approaches enable the study of neural correlates of subjective factors (such as a participant's beliefs about the reward to be received for performing some action) that the computational theories purport to quantify.
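The reward-prediction-error signal mentioned above is the delta term of temporal-difference learning, which can be shown in a few lines of tabular TD(0). This is a textbook sketch on a hypothetical 3-state chain, included only to make the quantity concrete.

```python
import numpy as np

def td_learn(transitions, rewards, n_episodes=2000, alpha=0.1, gamma=0.9):
    """Tabular TD(0): V(s) is nudged by the reward-prediction error
    delta = r + gamma*V(s') - V(s), the signal the review links to
    midbrain dopamine activity."""
    n = len(transitions)
    V = np.zeros(n + 1)                  # last index = terminal state
    for _ in range(n_episodes):
        for s in range(n):
            s_next = transitions[s]
            delta = rewards[s] + gamma * V[s_next] - V[s]  # prediction error
            V[s] += alpha * delta
    return V

# A deterministic 3-state chain ending in a reward of 1 (illustrative).
V = td_learn(transitions=[1, 2, 3], rewards=[0.0, 0.0, 1.0])
print(np.round(V[:3], 2))   # values propagate back toward ~[0.81, 0.9, 1.0]
```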
Affiliation(s)
- Nathaniel D Daw
- Gatsby Computational Neuroscience Unit, UCL, Alexandra House, 17 Queen Square, London, WC1N 3AR, UK.
|
636
|
Kincade JM, Abrams RA, Astafiev SV, Shulman GL, Corbetta M. An event-related functional magnetic resonance imaging study of voluntary and stimulus-driven orienting of attention. J Neurosci 2005; 25:4593-604. [PMID: 15872107 PMCID: PMC6725019 DOI: 10.1523/jneurosci.0236-05.2005] [Citation(s) in RCA: 443] [Impact Index Per Article: 24.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022] Open
Abstract
Attention can be voluntarily directed to a location or automatically summoned to a location by a salient stimulus. We compared the effects of voluntary and stimulus-driven shifts of spatial attention on the blood oxygenation level-dependent signal in humans, using a method that separated preparatory activity related to the initial shift of attention from the subsequent activity caused by target presentation. Voluntary shifts produced greater preparatory activity than stimulus-driven shifts in the frontal eye field (FEF) and intraparietal sulcus, core regions of the dorsal frontoparietal attention network, demonstrating their special role in the voluntary control of attention. Stimulus-driven attentional shifts to salient color singletons recruited occipitotemporal regions, sensitive to color information and part of the dorsal network, including the FEF, suggesting a partly overlapping circuit for endogenous and exogenous orienting. The right temporoparietal junction (TPJ), a core region of the ventral frontoparietal attention network, was strongly modulated by stimulus-driven attentional shifts to behaviorally relevant stimuli, such as targets at unattended locations. However, the TPJ did not respond to salient, task-irrelevant color singletons, indicating that behavioral relevance is critical for TPJ modulation during stimulus-driven orienting. Finally, both ventral and dorsal regions were modulated during reorienting but significantly only by reorienting after voluntary shifts, suggesting the importance of a mismatch between expectation and sensory input.
Affiliation(s)
- J Michelle Kincade
- Department of Psychology, Washington University in St. Louis, St. Louis, Missouri 63130-4899, USA
|
637
|
Abstract
The functions of rewards are based primarily on their effects on behavior and are less directly governed by the physics and chemistry of input events as in sensory systems. Therefore, the investigation of neural mechanisms underlying reward functions requires behavioral theories that can conceptualize the different effects of rewards on behavior. The scientific investigation of behavioral processes by animal learning theory and economic utility theory has produced a theoretical framework that can help to elucidate the neural correlates for reward functions in learning, goal-directed approach behavior, and decision making under uncertainty. Individual neurons can be studied in the reward systems of the brain, including dopamine neurons, orbitofrontal cortex, and striatum. The neural activity can be related to basic theoretical terms of reward and uncertainty, such as contiguity, contingency, prediction error, magnitude, probability, expected value, and variance.
Affiliation(s)
- Wolfram Schultz
- Department of Anatomy, University of Cambridge, CB2 3DY, United Kingdom.
|
638
|
Lee D. Neural basis of quasi-rational decision making. Curr Opin Neurobiol 2006; 16:191-8. [PMID: 16531040 DOI: 10.1016/j.conb.2006.02.001] [Citation(s) in RCA: 69] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/11/2005] [Accepted: 02/27/2006] [Indexed: 01/22/2023]
Abstract
Standard economic theories conceive homo economicus as a rational decision maker capable of maximizing utility. In reality, however, people tend to approximate optimal decision-making strategies through a collection of heuristic routines. Some of these routines are driven by emotional processes, and others are adjusted iteratively through experience. In addition, routines specialized for social decision making, such as inference about the mental states of other decision makers, might share their origins and neural mechanisms with the ability to simulate or imagine outcomes expected from alternative actions that an individual can take. A recent surge of collaborations across economics, psychology and neuroscience has provided new insights into how such multiple elements of decision making interact in the brain.
Affiliation(s)
- Daeyeol Lee
- Department of Brain and Cognitive Sciences, Center for Visual Science, University of Rochester, Rochester, NY 14627, USA.
|
639
|
Aston-Jones G, Cohen JD. Adaptive gain and the role of the locus coeruleus-norepinephrine system in optimal performance. J Comp Neurol 2006; 493:99-110. [PMID: 16254995 DOI: 10.1002/cne.20723] [Citation(s) in RCA: 404] [Impact Index Per Article: 22.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
Historically, the locus coeruleus-norepinephrine (LC-NE) system has been implicated in arousal, but recent findings suggest that this system plays a more complex and specific role in the control of behavior than investigators previously thought. We review neurophysiological, anatomical, and modeling studies in monkey that support a new theory of LC-NE function. LC neurons exhibit two modes of activity, phasic and tonic. Phasic LC activation is driven by the outcome of task-related decision processes and is proposed to facilitate ensuing behaviors and to help optimize task performance. When utility in the task wanes, LC neurons exhibit a tonic activity mode, associated with disengagement from the current task and a search for alternative behaviors. Monkey LC receives prominent, direct inputs from the anterior cingulate (ACC) and orbitofrontal cortices (OFC), both of which are thought to monitor task-related utility. We propose that these prefrontal areas produce the above patterns of LC activity to optimize the utility of performance on both short and long time scales.
Affiliation(s)
- Gary Aston-Jones
- Laboratory of Neuromodulation and Behavior, Department of Psychiatry, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA.
|
640
|
MacDonall JS, Goodell J, Juliano A. Momentary maximizing and optimal foraging theories of performance on concurrent VR schedules. Behav Processes 2006; 72:283-99. [PMID: 16631321 DOI: 10.1016/j.beproc.2006.03.005] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]
Abstract
Optimal foraging theory proposes that animals obtain the highest rate of reinforcers for the least effort, whereas momentary maximizing theory proposes that animals make the response that is most likely to be reinforced at that instant. Although each theory may account for matching on concurrent schedules, the data supporting each theory are weak. Two experiments assessed these theories by treating concurrent choice as two pairs of stay and switch schedules. Symmetrical arrangements, which are equivalent to standard concurrent schedules, maintained behavior described by the generalized matching law. Weighted arrangements, in which the programmed rate of earning reinforcers was always greater at one alternative, maintained behavior that was biased towards the weighted alternative, yet the bias was less than that predicted by optimal foraging theory. Asymmetrical arrangements, in which the stay and switch schedules operating at an alternative are the same, maintained behavior that favored one alternative, even though momentary maximizing predicted indifference. The generalized matching law poorly described each rat's pooled data from all conditions, but these data were described by an equation that is based on the stay and switch reinforcers earned per visit and that includes elements of both the optimal foraging and momentary maximizing theories of choice.
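The generalized matching law invoked above is log(B1/B2) = a·log(R1/R2) + log(c), where a is sensitivity (a < 1 is undermatching) and c is bias. A minimal sketch of fitting it, on synthetic data with an assumed sensitivity and bias (not the paper's rat data):

```python
import numpy as np

def fit_generalized_matching(b1, b2, r1, r2):
    """Fit log(B1/B2) = a*log(R1/R2) + log(c) by least squares:
    slope a is sensitivity, exp(intercept) is bias c."""
    x = np.log(np.asarray(r1) / np.asarray(r2))
    y = np.log(np.asarray(b1) / np.asarray(b2))
    a, log_c = np.polyfit(x, y, 1)
    return a, np.exp(log_c)

# Illustrative data generated with sensitivity 0.8 (undermatching)
# and a slight bias (c = 1.1) toward alternative 1.
r_ratio = np.array([0.25, 0.5, 1.0, 2.0, 4.0])
b_ratio = 1.1 * r_ratio ** 0.8
a, c = fit_generalized_matching(b_ratio, 1.0, r_ratio, 1.0)
print(f"sensitivity a = {a:.2f}, bias c = {c:.2f}")
```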
Affiliation(s)
- James S MacDonall
- Department of Psychology, Fordham University, 441 E. Fordham Road, Bronx, NY 10458, USA.
|
641
|
Huettel SA, Stowe CJ, Gordon EM, Warner BT, Platt ML. Neural signatures of economic preferences for risk and ambiguity. Neuron 2006; 49:765-75. [PMID: 16504951] [DOI: 10.1016/j.neuron.2006.01.024] [Received: 10/07/2005] [Revised: 11/08/2005] [Accepted: 01/20/2006]
Abstract
People often prefer the known over the unknown, sometimes sacrificing potential rewards for the sake of surety. Overcoming impulsive preferences for certainty in order to exploit uncertain but potentially lucrative options may require specialized neural mechanisms. Here, we demonstrate by functional magnetic resonance imaging (fMRI) that individuals' preferences for risk (uncertainty with known probabilities) and ambiguity (uncertainty with unknown probabilities) predict brain activation associated with decision making. Activation within the lateral prefrontal cortex was predicted by ambiguity preference and was also negatively correlated with an independent clinical measure of behavioral impulsiveness, suggesting that this region implements contextual analysis and inhibits impulsive responses. In contrast, activation of the posterior parietal cortex was predicted by risk preference. Together, this novel double dissociation indicates that decision making under ambiguity does not represent a special, more complex case of risky decision making; instead, these two forms of uncertainty are supported by distinct mechanisms.
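The risk/ambiguity distinction can be made concrete in a few lines. Evaluating an ambiguous gamble by its worst case over a probability range (maxmin evaluation) is one standard model of ambiguity aversion; it is my illustration here, not the authors' analysis.

```python
def expected_value(p, win=100.0):
    """Expected payoff of a gamble paying `win` with probability p."""
    return p * win

# Risk: the winning probability is known exactly.
risky = expected_value(0.5)                                  # 50.0

# Ambiguity: only a probability range is known. A maxmin
# (ambiguity-averse) evaluator scores the gamble by its worst case.
p_low, p_high = 0.3, 0.7
ambiguous = min(expected_value(p) for p in (p_low, p_high))  # 30.0
```

An ambiguity-averse chooser thus prices the ambiguous gamble below the matched risky one even though the midpoint probability is identical, which is the behavioral preference the imaging contrasts exploit.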
Affiliation(s)
- Scott A Huettel
- Brain Imaging and Analysis Center, Duke University Medical Center, Durham, North Carolina 27710, USA.
|
642
|
Abstract
Expectation of reward motivates our behaviors and influences our decisions. Indeed, neuronal activity in many brain areas is modulated by expected reward. However, it is still unclear where and how the reward-dependent modulation of neuronal activity occurs and how the reward-modulated signal is transformed into motor outputs. Recent studies suggest an important role of the basal ganglia. Sensorimotor/cognitive activities of neurons in the basal ganglia are strongly modulated by expected reward. Through their abundant outputs to the brain stem motor areas and the thalamocortical circuits, the basal ganglia appear capable of producing body movements based on expected reward. A good behavioral measure to test this hypothesis is saccadic eye movement because its brain stem mechanism has been extensively studied. Studies from our laboratory suggest that the basal ganglia play a key role in guiding the gaze to the location where reward is available. Neurons in the caudate nucleus and the substantia nigra pars reticulata are extremely sensitive to the positional difference in expected reward, which leads to a bias in excitability between the superior colliculi such that the saccade to the to-be-rewarded position occurs more quickly. It is suggested that the reward modulation occurs in the caudate, where cortical inputs carrying spatial signals and dopaminergic inputs carrying reward-related signals are integrated. These data support a specific form of reinforcement learning theory, but also suggest that the theory needs further refinement.
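A toy sketch of the reinforcement-learning account suggested here: each target position carries a caudate-like value, updated trial by trial by the reward-prediction error. The learning rate and position names are assumptions for illustration, not parameters from the studies reviewed.

```python
ALPHA = 0.2  # learning rate (assumed value)

def update_value(v, reward, alpha=ALPHA):
    """TD-style update: move the value estimate toward the obtained
    reward by a fraction of the prediction error, delta = reward - v."""
    return v + alpha * (reward - v)

# Position-value table plays the role of the caudate excitability bias.
values = {"left": 0.0, "right": 0.0}
for _ in range(50):  # only the left target is rewarded in this toy run
    values["left"] = update_value(values["left"], 1.0)
    values["right"] = update_value(values["right"], 0.0)
```

The resulting value asymmetry is the quantity that, on this account, biases collicular excitability so that saccades to the rewarded position are initiated more quickly.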
Affiliation(s)
- Okihide Hikosaka
- Laboratory of Sensorimotor Research, National Eye Institute, National Institutes of Health, Bethesda, MD 20892, USA.
|
643
|
Sanfey AG, Loewenstein G, McClure SM, Cohen JD. Neuroeconomics: cross-currents in research on decision-making. Trends Cogn Sci 2006; 10:108-16. [PMID: 16469524] [DOI: 10.1016/j.tics.2006.01.009] [Received: 09/14/2005] [Revised: 12/08/2005] [Accepted: 01/20/2006]
Abstract
Despite substantial advances, the question of how we make decisions and judgments continues to pose important challenges for scientific research. Historically, different disciplines have approached this problem using different techniques and assumptions, with few unifying efforts made. However, the field of neuroeconomics has recently emerged as an inter-disciplinary effort to bridge this gap. Research in neuroscience and psychology has begun to investigate neural bases of decision predictability and value, central parameters in the economic theory of expected utility. Economics, in turn, is being increasingly influenced by a multiple-systems approach to decision-making, a perspective strongly rooted in psychology and neuroscience. The integration of these disparate theoretical approaches and methodologies offers exciting potential for the construction of more accurate models of decision-making.
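The expected-utility baseline that neuroeconomics connects to neural data can be stated in a few lines. The concave (square-root) utility below is an illustrative choice of risk-averse utility function, not one drawn from this review.

```python
def expected_utility(outcomes, utility=lambda x: x):
    """Expected utility of a gamble given (probability, payoff) pairs."""
    return sum(p * utility(x) for p, x in outcomes)

# A 50/50 gamble over $0 / $100:
gamble = [(0.5, 0.0), (0.5, 100.0)]

# A risk-neutral agent (linear utility) values it at its mean payoff.
ev = expected_utility(gamble)                                     # 50.0
# A concave utility (risk aversion) values the same gamble below its
# mean: certainty equivalent of sqrt-utility 5.0 is only $25.
eu_averse = expected_utility(gamble, utility=lambda x: x ** 0.5)  # 5.0
```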
Affiliation(s)
- Alan G Sanfey
- Department of Psychology, University of Arizona, Tucson, AZ 85721, USA.
|
644
|
Designing project management: a scientific notation and an improved formalism for earned value calculations. International Journal of Project Management 2006. [DOI: 10.1016/j.ijproman.2005.07.003]
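For orientation, the standard earned-value quantities that any such formalism must reproduce are shown below. These are the conventional textbook definitions, not the paper's improved notation.

```python
def earned_value_metrics(pv, ev, ac):
    """Standard earned-value indices from planned value (PV),
    earned value (EV), and actual cost (AC)."""
    return {
        "CV": ev - ac,    # cost variance: negative means over budget
        "SV": ev - pv,    # schedule variance: negative means behind plan
        "CPI": ev / ac,   # cost performance index: < 1 means over budget
        "SPI": ev / pv,   # schedule performance index: < 1 means late
    }

# Example: planned $1000 of work, earned $800, spent $900.
m = earned_value_metrics(pv=1000.0, ev=800.0, ac=900.0)
```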
|
645
|
Affiliation(s)
- B Pesaran
- Center for Neural Science, 4 Washington Pl. Rm 809, New York University, New York, New York 10003, USA.
|
646
|
Haruno M, Kawato M. Different neural correlates of reward expectation and reward expectation error in the putamen and caudate nucleus during stimulus-action-reward association learning. J Neurophysiol 2006; 95:948-59. [PMID: 16192338] [DOI: 10.1152/jn.00382.2005]
Abstract
To select appropriate behaviors leading to rewards, the brain needs to learn associations among sensory stimuli, selected behaviors, and rewards. Recent imaging and neural-recording studies have revealed that the dorsal striatum plays an important role in learning such stimulus-action-reward associations. However, the putamen and caudate nucleus are embedded in distinct cortico-striatal loop circuits, predominantly connected to motor-related cerebral cortical areas and frontal association areas, respectively. This difference in their cortical connections suggests that the putamen and caudate nucleus are engaged in different functional aspects of stimulus-action-reward association learning. To determine whether this is the case, we conducted an event-related and computational model–based functional MRI (fMRI) study with a stochastic decision-making task in which a stimulus-action-reward association must be learned. A simple reinforcement learning model not only reproduced the subjects' action selections reasonably well but also allowed us to quantitatively estimate each subject's temporal profiles of stimulus-action-reward association and reward-prediction error during learning trials. These two internal representations were used in the fMRI correlation analysis. The results revealed that neural correlates of the stimulus-action-reward association reside in the putamen, whereas a correlation with reward-prediction error was found largely in the caudate nucleus and ventral striatum. These nonuniform spatiotemporal distributions of neural correlates within the dorsal striatum were maintained consistently at various levels of task difficulty, suggesting a functional difference in the dorsal striatum between the putamen and caudate nucleus during stimulus-action-reward association learning.
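A minimal sketch of how such model-based regressors are typically constructed (the action names, rewarded contingency, and learning rate here are illustrative, not the authors' implementation): the Q values play the role of the stimulus-action-reward association strength, and `delta` is the trial-by-trial reward-prediction error that would be correlated with striatal activity.

```python
ALPHA = 0.3  # learning rate (an assumed value)

# Q[stimulus][action]: learned stimulus-action-reward associations.
Q = {"s1": {"a1": 0.0, "a2": 0.0}}
rpe_trace = []  # trial-by-trial reward-prediction errors

for _ in range(40):
    s = "s1"
    a = max(Q[s], key=Q[s].get)        # greedy choice for the sketch
    reward = 1.0 if a == "a1" else 0.0 # "a1" is the rewarded action here
    delta = reward - Q[s][a]           # reward-prediction error
    rpe_trace.append(delta)
    Q[s][a] += ALPHA * delta           # association update
```

Across trials the association (Q) grows while the prediction error shrinks, which is why the two regressors can dissociate anatomically in a correlation analysis.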
Affiliation(s)
- Masahiko Haruno
- Department of Cognitive Neuroscience Computational Neuroscience Labs, Advanced Telecommunication Research Institute, Sorakugun, Kyoto 619-0288, Japan.
|
647
|
Goldberg ME, Bisley JW, Powell KD, Gottlieb J. Saccades, salience and attention: the role of the lateral intraparietal area in visual behavior. Prog Brain Res 2006; 155:157-75. [PMID: 17027387] [PMCID: PMC3615538] [DOI: 10.1016/s0079-6123(06)55010-1]
Abstract
Neural activity in the lateral intraparietal area (LIP) has been associated with attention to a location in visual space, and with the intention to make saccadic eye movements. In this study we show that neurons in LIP respond to recently flashed task-irrelevant stimuli and to saccade targets brought into the receptive field by a saccade, although they respond much less to the same stimuli when they are stable in the environment. LIP neurons respond to the appearance of a flashed distractor even when a monkey is planning a memory-guided delayed saccade elsewhere. We then show that a monkey's attention, as defined by an increase in contrast sensitivity, is pinned to the goal of a memory-guided saccade throughout the delay period, unless a distractor appears, in which case attention transiently moves to the site of the distractor and then returns to the goal of the saccade. LIP neurons respond to both the saccade goal and the distractor, and this activity correlates with the monkey's locus of attention. In particular, the activity of LIP neurons predicts when attention migrates from the distractor back to the saccade goal. We suggest that the activity in LIP provides a salience map that is interpreted by the oculomotor system as a saccade goal when a saccade is appropriate, and simultaneously is used by the visual system to determine the locus of attention.
Affiliation(s)
- Michael E Goldberg
- Mahoney Center for Brain and Behavior, Center for Neurobiology and Behavior, Columbia University College of Physicians and Surgeons, and the New York State Psychiatric Institute, New York, NY 10032, USA.
|
648
|
Coulthard E, Parton A, Husain M. Action control in visual neglect. Neuropsychologia 2005; 44:2717-33. [PMID: 16368117] [DOI: 10.1016/j.neuropsychologia.2005.11.004] [Received: 07/29/2005] [Revised: 11/02/2005] [Accepted: 11/03/2005]
Abstract
Patients with unilateral neglect show a variety of impairments when reaching towards objects in contralesional space. The basis of these deficits could be perceptual, motor or at one of the intermediate stages linking these processes. Here, we review studies of visually guided reaching in neglect and integrate these results with findings from normal human and monkey action control. We consider evidence which shows that neglect patients can be slow to initiate or execute reaches particularly to a contralesional target. We discuss the directional and spatial deficits that may interact to contribute to such reaching abnormalities and highlight the importance of effective target selection and on-line guidance, exploring the idea that deficits in these mechanisms underlie increased susceptibility to ipsilesional visual distraction in neglect. We also examine the relationship between optic ataxia and neglect by considering two illustrative cases, one with pure optic ataxia and the other with optic ataxia plus neglect, which reveal differences in the anatomical substrates of the two syndromes. We conclude that many patients with neglect make abnormal visually guided reaches, but the pattern of reaching deficits is highly variable, most likely reflecting heterogeneity of lesion location across subjects. Rather than being specific to the neglect syndrome, abnormalities of reaching in these patients may correspond to the extent of damage to the visuomotor control system which involves critical regions in both the parietal and frontal cortex, the white matter tracts connecting them and subcortical regions. Thus, the action control deficits in neglect may be conceptualised as a range of impairments affecting multiple stages in the visuomotor control process.
Affiliation(s)
- Elizabeth Coulthard
- Division of Neuroscience and Mental Health, Imperial College London and the Institute of Cognitive Neuroscience, University College London, London, United Kingdom.
|
649
|
Orban GA, Claeys K, Nelissen K, Smans R, Sunaert S, Todd JT, Wardak C, Durand JB, Vanduffel W. Mapping the parietal cortex of human and non-human primates. Neuropsychologia 2005; 44:2647-67. [PMID: 16343560] [DOI: 10.1016/j.neuropsychologia.2005.11.001] [Received: 06/02/2005] [Revised: 10/13/2005] [Accepted: 11/01/2005]
Abstract
The present essay reviews a series of functional magnetic resonance imaging (fMRI) studies conducted in parallel in humans and awake monkeys, concentrating on the intraparietal sulcus (IPS). MR responses to a range of visual stimuli indicate that the human IPS contains more functional regions along its anterior-posterior extent than are known in the monkey. Human IPS includes four motion sensitive regions, ventral IPS (VIPS), parieto-occipital IPS (POIPS), dorsal IPS medial (DIPSM) and dorsal IPS anterior (DIPSA), which are also sensitive to three-dimensional structure from motion (3D SFM). On the other hand, the monkey IPS contains only one motion sensitive area (VIP), which is not particularly sensitive to 3D SFM. The human IPS includes four regions sensitive to two-dimensional shape and three representations of central vision, while monkey IPS appears to contain only two shape sensitive regions and one central representation. These data support the hypothesis that monkey LIP corresponds to the region of human IPS between DIPSM and POIPS and that a portion of the anterior part of human IPS is evolutionarily new. This additional cortical tissue may provide the capacity for an enhanced visual analysis of moving images necessary for sophisticated control of manipulation and tool handling.
Affiliation(s)
- Guy A Orban
- Laboratorium voor Neuro- en Psychofysiologie, K.U.Leuven, Medical School, Leuven, Belgium.
|
650
|
Bisley JW, Goldberg ME. Neural correlates of attention and distractibility in the lateral intraparietal area. J Neurophysiol 2005; 95:1696-717. [PMID: 16339000] [PMCID: PMC2365900] [DOI: 10.1152/jn.00848.2005]
Abstract
We examined the activity of neurons in the lateral intraparietal area (LIP) during a task in which we measured attention in the monkey, using an advantage in contrast sensitivity as our definition of attention. The animals planned a memory-guided saccade but made or canceled it depending on the orientation of a briefly flashed probe stimulus. We measured the monkeys' contrast sensitivity by varying the contrast of the probe. Both subjects had better thresholds at the goal of the saccade than elsewhere. If a task-irrelevant distractor flashed elsewhere in the visual field, the attentional advantage transiently shifted to that site. The population response in LIP correlated with the allocation of attention; the attentional advantage lay at the location in the visual field whose representation in LIP had the greatest activity when the probe appeared. During a brief period in which there were two equally active regions in LIP, there was no attentional advantage at either location. This time, the crossing point, differed between the two animals, demonstrating a strong correlation between activity and behavior. The crossing point of each neuron depended on the relationship of three parameters: the visual response to the distractor, the saccade-related delay activity, and the rate of decay of the transient response to the distractor. Thus the time at which attention lingers on a distractor is set by the mechanism underlying these three biophysical properties. Finally, we showed that for a brief time LIP neurons showed a stronger response to the signal canceling the planned saccade than to the confirmation signal.
Affiliation(s)
- James W Bisley
- The Laboratory of Sensorimotor Research, National Eye Institute, Bethesda, Maryland, USA
|