1
Killeen PR. Theory of reinforcement schedules. J Exp Anal Behav 2023; 120:289-319. [PMID: 37706228 DOI: 10.1002/jeab.880] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/13/2023] [Accepted: 08/04/2023] [Indexed: 09/15/2023]
Abstract
The three principles of reinforcement are (1) events such as incentives and reinforcers increase the activity of an organism; (2) that activity is bounded by competition from other responses; and (3) animals approach incentives and their signs, guided by their temporal and physical conditions, together called the "contingencies of reinforcement." Mathematical models of each of these principles comprised mathematical principles of reinforcement (MPR; Killeen, 1994). Over the ensuing decades, MPR was extended to new experimental contexts. This article reviews the basic theory and its extensions to satiation, warm-up, extinction, sign tracking, pausing, and sequential control in progressive-ratio and multiple schedules. In the latter cases, a single equation balancing target and competing responses governs behavioral contrast and behavioral momentum. Momentum is intrinsic in the fundamental equations, as behavior unspools more slowly from highly aroused responses conditioned by higher rates of incitement than it does from responses from leaner contexts. Habits are responses that have accrued substantial behavioral momentum. Operant responses, being predictors of reinforcement, are approached by making them: The sight and feel of a paw on a lever is approached by placing paw on lever, as attempted for any sign of reinforcement. Behavior in concurrent schedules is governed by approach to momentarily richer patches (melioration). Applications of MPR in behavioral pharmacology and delay discounting are noted.
2
Paglieri F. Social choice for one: On the rationality of intertemporal decisions. Behav Processes 2016; 127:97-108. [PMID: 27118422 DOI: 10.1016/j.beproc.2016.04.011] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 12/02/2015] [Revised: 04/14/2016] [Accepted: 04/18/2016] [Indexed: 10/21/2022]
Abstract
When faced with an intertemporal choice between a smaller short-term reward and a larger long-term prize, is opting for the latter always indicative of delay tolerance? And is delay tolerance always to be regarded as a manifestation of self-control, and thus as a rational solution to intertemporal dilemmas? I argue in favor of a negative answer to both questions, based on evidence collected in the delay discounting literature. This highlights the need for a nuanced understanding of rationality in intertemporal choice, to capture also situations in which waiting is not the optimal strategy. This paper suggests that such an understanding is fostered by adopting social choice theory as a promising framework to model intertemporal decision making. Some preliminary results of this approach are discussed, and its potential is compared with a much more studied formal model for intertemporal choice, i.e. game theory.
Affiliation(s)
- Fabio Paglieri
- Goal-Oriented Agents Lab (GOAL), ISTC-CNR, Via San Martino della Battaglia 44, 00185 Roma, Italy.
3
Osogami T, Otsuka M. Seven neurons memorizing sequences of alphabetical images via spike-timing dependent plasticity. Sci Rep 2015; 5:14149. [PMID: 26374672 PMCID: PMC4570975 DOI: 10.1038/srep14149] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 05/20/2015] [Accepted: 08/19/2015] [Indexed: 12/03/2022] [Open Access]
Abstract
An artificial neural network, such as a Boltzmann machine, can be trained with the Hebb rule so that it stores static patterns and retrieves a particular pattern when an associated cue is presented to it. Such a network, however, cannot effectively deal with dynamic patterns in the manner of living creatures. Here, we design a dynamic Boltzmann machine (DyBM) and a learning rule that has some of the properties of spike-timing dependent plasticity (STDP), which has been postulated for biological neural networks. We train a DyBM consisting of only seven neurons in a way that it memorizes the sequence of the bitmap patterns in an alphabetical image “SCIENCE” and its reverse sequence and retrieves either sequence when a partial sequence is presented as a cue. The DyBM is to STDP as the Boltzmann machine is to the Hebb rule.
4
Burridge J, Gao Y, Mao Y. Forgetfulness can help you win games. Phys Rev E Stat Nonlin Soft Matter Phys 2015; 92:032119. [PMID: 26465438 DOI: 10.1103/physreve.92.032119] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/11/2015] [Indexed: 06/05/2023]
Abstract
We present a simple game model where agents with different memory lengths compete for finite resources. We show by simulation and analytically that an instability exists at a critical memory length, and as a result, different memory lengths can compete and coexist in a dynamical equilibrium. Our analytical formulation makes a connection to statistical urn models, and we show that temperature is mirrored by the agent's memory. Our simple model of memory may be incorporated into other game models with implications that we briefly discuss.
Affiliation(s)
- James Burridge
- Department of Mathematics, University of Portsmouth, Portsmouth, PO1 2UP, United Kingdom
- Yu Gao
- School of Physics and Astronomy, University of Nottingham, Nottingham, NG7 2RD, United Kingdom
- Yong Mao
- School of Physics and Astronomy, University of Nottingham, Nottingham, NG7 2RD, United Kingdom
5
Catania AC, Reilly MP, Hand D, Kehle LK, Valentine L, Shimoff E. A quantitative analysis of the behavior maintained by delayed reinforcers. J Exp Anal Behav 2015; 103:288-31. [DOI: 10.1002/jeab.138] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Indexed: 11/11/2022]
6
Killeen PR. Finding time. Behav Processes 2014; 101:154-62. [DOI: 10.1016/j.beproc.2013.08.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 03/28/2013] [Revised: 06/14/2013] [Accepted: 08/06/2013] [Indexed: 11/16/2022]
7
Killeen PR, Russell VA, Sergeant JA. A behavioral neuroenergetics theory of ADHD. Neurosci Biobehav Rev 2013; 37:625-57. [PMID: 23454637 DOI: 10.1016/j.neubiorev.2013.02.011] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.6] [Received: 02/02/2012] [Revised: 02/02/2013] [Accepted: 02/18/2013] [Indexed: 02/02/2023]
Abstract
Energetic insufficiency in neurons due to inadequate lactate supply is implicated in several neuropathologies, including attention-deficit/hyperactivity disorder (ADHD). By formalizing the mechanism and implications of such constraints on function, the behavioral Neuroenergetics Theory (NeT) predicts the results of many neuropsychological tasks involving individuals with ADHD and kindred dysfunctions, and entails many novel predictions. The associated diffusion model predicts that response times will follow a mixture of Wald distributions from the attentive state, and ex-Wald distributions after attentional lapses. It is inferred from the model that ADHD participants can bring only 75-85% of the neurocognitive energy to bear on tasks, and allocate only about 85% of the cognitive resources of comparison groups. Parameters derived from the model in specific tasks predict performance in other tasks, and in clinical conditions often associated with ADHD. The primary action of therapeutic stimulants is to increase norepinephrine in active regions of the brain. This activates glial adrenoceptors, increasing the release of lactate from astrocytes to fuel depleted neurons. The theory is aligned with other approaches and integrated with more general theories of ADHD. Therapeutic implications are explored.
Affiliation(s)
- Peter R Killeen
- Department of Psychology, Arizona State University, Tempe, AZ 85287-1104, USA.
8
Hutsell BA, Newland MC. A quantitative analysis of the effects of qualitatively different reinforcers on fixed ratio responding in inbred strains of mice. Neurobiol Learn Mem 2013; 101:85-93. [PMID: 23357283 DOI: 10.1016/j.nlm.2013.01.005] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 11/28/2012] [Revised: 12/21/2012] [Accepted: 01/14/2013] [Indexed: 11/30/2022]
Abstract
Previous studies of inbred mouse strains have shown reinforcer-strain interactions that may potentially mask differences among strains in memory performance. The present research examined the effects of two qualitatively different reinforcers (heterogeneous mix of flavored pellets and sweetened-condensed milk) on responding maintained by fixed-ratio schedules of reinforcement in three inbred strains of mice (BALB/c, C57BL/6, and DBA/2). Response rates for all strains were a bitonic (inverted U) function of the size of the fixed-ratio schedule and were generally higher when responding was maintained by milk. For the DBA/2 and C57BL/6 and to a lesser extent the BALB/c, milk primarily increased response rates at moderate fixed ratios, but not at the largest fixed ratios tested. A formal model of ratio-schedule performance, Mathematical Principles of Reinforcement (MPR), was applied to the response rate functions of individual mice. According to MPR, the differences in response rates maintained by pellets and milk were mostly due to changes in motoric processes as indicated by changes in the minimum response time (δ) produced by each reinforcer type and not specific activation (a), a model term that represents value and is correlated with reinforcer magnitude and the break point obtained under progressive ratio schedules. MPR also revealed that, although affected by reinforcer type, a parameter interpreted as the rate of saturation of working memory (λ) differed among the strains.
Affiliation(s)
- Blake A Hutsell
- Department of Psychology, Auburn University, Auburn, AL 36849-5212, USA.
9
Abstract
It has frequently been claimed that learning performance improves with practice according to the so-called “Power Law of Learning.” Similarly, forgetting may follow a power law. It has been shown on the basis of extensive simulations that such power laws may emerge through averaging functions with other, nonpower function shapes. In the present article, we supplement these simulations with a mathematical proof that power functions will indeed emerge as a result of averaging over exponential functions, if the distribution of learning rates follows a gamma distribution, a uniform distribution, or a half-normal function. Through a number of simulations, we further investigate to what extent these findings may affect empirical results in practice.
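The gamma case mentioned in this abstract can be checked numerically. As a sketch under stated assumptions: if each individual curve is exp(-rate * t) and the rates follow a gamma distribution with shape k and scale s, the population average is the gamma moment-generating function evaluated at -t, that is (1 + s*t)^(-k), which is a power function of (1 + s*t). The function names and parameter values below are illustrative and are not taken from the article.

```python
import math
import random


def mean_exponential(t, shape, scale, n=20000, seed=0):
    """Monte Carlo average of exp(-rate * t) over gamma-distributed rates."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        # random.gammavariate(alpha, beta): alpha is the shape, beta the scale
        rate = rng.gammavariate(shape, scale)
        total += math.exp(-rate * t)
    return total / n


def power_form(t, shape, scale):
    """Closed form of the same average: (1 + scale * t) ** -shape."""
    return (1.0 + scale * t) ** (-shape)


if __name__ == "__main__":
    for t in (0.5, 2.0, 8.0):
        print(t, mean_exponential(t, 2.0, 0.5), power_form(t, 2.0, 0.5))
```

The simulated average and the closed form should agree up to sampling error, even though no individual curve is a power function.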
10
Interactions of numerical and temporal stimulus characteristics on the control of response location by brief flashes of light. Learn Behav 2011; 39:191-201. [PMID: 21267693 DOI: 10.3758/s13420-011-0016-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/08/2022]
Abstract
Pigeons pecked on three keys, responses to one of which could be reinforced after 3 flashes of the houselight, to a second key after 6, and to a third key after 12. The flashes were arranged according to variable-interval schedules. Response allocation among the keys was a function of the number of flashes. When flashes were omitted, transitions occurred very late. Increasing flash duration produced a leftward shift in the transitions along a number axis. Increasing reinforcement probability produced a leftward shift, and decreasing reinforcement probability produced a rightward shift. Intermixing different flash rates within sessions separated allocations: Faster flash rates shifted the functions sooner in real time, but later in terms of flash count, and conversely for slower flash rates. A model of control by fading memories of number and time was proposed.
11
Models of trace decay, eligibility for reinforcement, and delay of reinforcement gradients, from exponential to hyperboloid. Behav Processes 2011; 87:57-63. [PMID: 21215304 DOI: 10.1016/j.beproc.2010.12.016] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.5] [Received: 08/15/2010] [Revised: 12/24/2010] [Accepted: 12/27/2010] [Indexed: 11/24/2022]
Abstract
Behavior such as depression of a lever or perception of a stimulus may be strengthened by consequent behaviorally significant events (BSEs), such as reinforcers. This is the Law of Effect. As time passes since its emission, the ability for the behavior to be reinforced decreases. This is trace decay. It is upon decayed traces that subsequent BSEs operate. If the trace comes from a response, it constitutes primary reinforcement; if from perception of an extended stimulus, it is classical conditioning. This paper develops simple models of these processes. It premises exponentially decaying traces related to the richness of the environment, and conditioned reinforcement as the average of such traces over the extended stimulus, yielding an almost-hyperbolic function of duration. The models account for some data, and reinforce the theories of other analysts by providing a sufficient account of the provenance of these effects. It leads to a linear relation between sooner and later isopreference delays whose slope depends on sensitivity to reinforcement, and intercept on that and the steepness of the delay gradient. Unlike human prospective judgments, all control is vested in either primary or secondary reinforcement processes; therefore the use of the term discounting, appropriate for humans, may be less descriptive of the behavior of nonverbal organisms.
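The averaging step described in this abstract can be worked through directly. As a sketch, assume a trace that decays as exp(-k*t); its average over a stimulus of duration d is (1 - exp(-k*d)) / (k*d). The decay rate k and the comparison hyperbola 1 / (1 + k*d/2) are illustrative assumptions chosen here, not the article's fitted forms. The two agree closely for short durations and separate as k*d grows.

```python
import math


def mean_trace(duration, k):
    """Average of exp(-k * t) for t in [0, duration]."""
    x = k * duration
    if x == 0:
        return 1.0
    # -expm1(-x) equals 1 - exp(-x), computed accurately for small x
    return -math.expm1(-x) / x


def hyperbola(duration, k):
    """Illustrative comparison curve that matches mean_trace to first order in k * duration."""
    return 1.0 / (1.0 + k * duration / 2.0)


if __name__ == "__main__":
    for d in (0.5, 1.0, 5.0, 20.0):
        print(d, mean_trace(d, 0.1), hyperbola(d, 0.1))
```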
12
Fetterman JG, Killeen PR. Categorical counting. Behav Processes 2010; 85:28-35. [PMID: 20540994 DOI: 10.1016/j.beproc.2010.06.001] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 03/24/2010] [Revised: 05/20/2010] [Accepted: 06/01/2010] [Indexed: 11/17/2022]
Abstract
Pigeons pecked on three keys, responses to one of which could be reinforced after a few pecks, to a second key after a somewhat larger number of pecks, and to a third key after the maximum pecking requirement. The values of the pecking requirements and the proportion of trials ending with reinforcement were varied. Transits among the keys were an orderly function of peck number, and showed approximately proportional changes with changes in the pecking requirements, consistent with Weber's law. Standard deviations of the switch points between successive keys increased more slowly within a condition than across conditions. Changes in reinforcement probability produced changes in the location of the psychometric functions that were consistent with models of timing. Analyses of the number of pecks emitted and the duration of the pecking sequences demonstrated that peck number was the primary determinant of choice, but that passage of time also played some role. We capture the basic results with a standard model of counting, which we qualify to account for the secondary experiments.
13
Killeen PR, Sanabria F, Dolgov I. The dynamics of conditioning and extinction. J Exp Psychol Anim Behav Process 2010; 35:447-72. [PMID: 19839699 DOI: 10.1037/a0015626] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Indexed: 11/08/2022]
Abstract
Pigeons responded to intermittently reinforced classical conditioning trials with erratic bouts of responding to the conditioned stimulus. Responding depended on whether the prior trial contained a peck, food, or both. A linear persistence-learning model moved pigeons into and out of a response state, and a Weibull distribution for number of within-trial responses governed in-state pecking. Variations of trial and intertrial durations caused correlated changes in rate and probability of responding and in model parameters. A novel prediction--in the protracted absence of food, response rates can plateau above zero--was validated. The model predicted smooth acquisition functions when instantiated with the probability of food but a more accurate jagged learning curve when instantiated with trial-to-trial records of reinforcement. The Skinnerian parameter was dominant only when food could be accelerated or delayed by pecking. These experiments provide a framework for trial-by-trial accounts of conditioning and extinction that increases the information available from the data, permitting such accounts to comment more definitively on complex contemporary models of momentum and conditioning.
Affiliation(s)
- Peter R Killeen
- Department of Psychology, Arizona State University, Tempe, AZ 85287-1104, USA.
14
Killeen PR, Posadas-Sanchez D, Johansen EB, Thrailkill EA. Progressive ratio schedules of reinforcement. J Exp Psychol Anim Behav Process 2009; 35:35-50. [PMID: 19159161 DOI: 10.1037/a0012497] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.3] [Indexed: 11/08/2022]
Abstract
Pigeons' pecks produced grain under progressive ratio (PR) schedules, whose response requirements increased systematically within sessions. Experiment 1 compared arithmetic (AP) and geometric (GP) progressions. Response rates increased as a function of the component ratio requirement, then decreased linearly (AP) or asymptotically (GP). Experiment 2 found the linear decrease in AP rates to be relatively independent of step size. Experiment 3 showed pausing to be controlled by the prior component length, which predicted the differences between PR and regressive ratio schedules found in Experiment 4. When the longest component ratios were signaled by different key colors, rates at moderate ratios increased, demonstrating control by forthcoming context. Models for response rate and pause duration based on Bizo and Killeen (1997) described performance on AP schedules; GP schedules required an additional parameter representing the contextual reinforcement.
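The two progression types compared in Experiment 1 can be illustrated with a short sketch. The starting ratio, step size, and growth factor below are hypothetical values chosen for illustration; the abstract does not report the ones actually used.

```python
def arithmetic_progression(start, step, n):
    """Ratio requirements that grow by a constant step (AP)."""
    return [start + i * step for i in range(n)]


def geometric_progression(start, factor, n):
    """Ratio requirements that grow by a constant factor (GP), rounded to whole responses."""
    return [round(start * factor ** i) for i in range(n)]


if __name__ == "__main__":
    print(arithmetic_progression(1, 2, 6))
    print(geometric_progression(1, 2, 6))
```

Under a geometric progression the requirement soon outpaces the arithmetic one, which is one reason the two schedules might be expected to produce differently shaped rate functions.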
Affiliation(s)
- Peter R Killeen
- Department of Psychology, Arizona State University, Tempe, AZ 85287-1104, USA.
15
Johansen EB, Killeen PR, Russell VA, Tripp G, Wickens JR, Tannock R, Williams J, Sagvolden T. Origins of altered reinforcement effects in ADHD. Behav Brain Funct 2009; 5:7. [PMID: 19226460 PMCID: PMC2649942 DOI: 10.1186/1744-9081-5-7] [Citation(s) in RCA: 78] [Impact Index Per Article: 5.2] [Received: 08/04/2008] [Accepted: 02/18/2009] [Indexed: 11/23/2022] [Open Access]
Abstract
Attention-deficit/hyperactivity disorder (ADHD), characterized by hyperactivity, impulsiveness and deficient sustained attention, is one of the most common and persistent behavioral disorders of childhood. ADHD is associated with catecholamine dysfunction. The catecholamines are important for response selection and memory formation, and dopamine in particular is important for reinforcement of successful behavior. The convergence of dopaminergic mesolimbic and glutamatergic corticostriatal synapses upon individual neostriatal neurons provides a favorable substrate for a three-factor synaptic modification rule underlying acquisition of associations between stimuli in a particular context, responses, and reinforcers. The change in associative strength as a function of delay between key stimuli or responses, and reinforcement, is known as the delay of reinforcement gradient. The gradient is altered by vicissitudes of attention, intrusions of irrelevant events, lapses of memory, and fluctuations in dopamine function. Theoretical and experimental analyses of these moderating factors will help to determine just how reinforcement processes are altered in ADHD. Such analyses can only help to improve treatment strategies for ADHD.
Affiliation(s)
- Espen Borgå Johansen
- Centre for Advanced Study (CAS) at the Norwegian Academy for Science and Letters, Oslo, Norway.
16
17
Abstract
Metacognition is thinking about thinking. There is considerable interest in developing animal models of metacognition to provide insight about the evolution of mind and a basis for investigating neurobiological mechanisms of cognitive impairments in people. Formal modeling of low-level (i.e., alternative) mechanisms has recently demonstrated that prevailing standards for documenting metacognition are inadequate. Indeed, low-level mechanisms are sufficient to explain data from existing methods. Consequently, an assessment of what is 'lost' (in terms of existing methods and data) necessitates the development of new, innovative methods for metacognition. Development of new methods may prompt the establishment of new standards for documenting metacognition.
Affiliation(s)
- Allison L Foote
- Department of Psychology, University of Georgia, Athens GA 30602-3013
18
Johansen EB, Killeen PR, Sagvolden T. Behavioral variability, elimination of responses, and delay-of-reinforcement gradients in SHR and WKY rats. Behav Brain Funct 2007; 3:60. [PMID: 18028539 PMCID: PMC2219961 DOI: 10.1186/1744-9081-3-60] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.4] [Received: 07/25/2006] [Accepted: 11/20/2007] [Indexed: 12/03/2022] [Open Access]
Abstract
Background: Attention-deficit/hyperactivity disorder (ADHD) is characterized by a pattern of inattention, hyperactivity, and impulsivity that is cross-situational, persistent, and produces social and academic impairment. Research has shown that reinforcement processes are altered in ADHD. The dynamic developmental theory has suggested that a steepened delay-of-reinforcement gradient and deficient extinction of behavior produce behavioral symptoms of ADHD and increased behavioral variability.
Method: The present study investigated behavioral variability and elimination of non-target responses during acquisition in an animal model of ADHD, the spontaneously hypertensive rat (SHR), using Wistar Kyoto (WKY) rats as controls. The study also aimed at providing a novel approach to measuring delay-of-reinforcement gradients in the SHR and the WKY strains. The animals were tested in a modified operant chamber presenting 20 response alternatives. Nose pokes in a target hole produced water according to fixed interval (FI) schedules of reinforcement, while nose pokes in the remaining 19 holes either had no consequences or produced a sound or a short flickering of the houselight. The stimulus-producing holes were included to test whether light and sound act as sensory reinforcers in SHR. Data from the first six sessions testing FI 1 s were used for calculation of the initial distribution of responses. Additionally, Euclidean distance (measured from the center of each hole to the center of the target hole) and entropy (a measure of variability) were also calculated. Delay-of-reinforcement gradients were calculated across sessions by dividing the fixed interval into epochs and determining how much reinforcement of responses in one epoch contributed to responding in the next interval.
Results: Over the initial six sessions, behavior became clustered around the target hole. There was greater initial variability in SHR behavior, and slower elimination of inefficient responses compared to the WKY. There was little or no differential use of the stimulus-producing holes by either strain. For SHR, the reach of reinforcement (the delay-of-reinforcement gradient) was restricted to the preceding one second, whereas for WKY it extended about four times as far.
Conclusion: The present findings support previous studies showing increased behavioral variability in SHR relative to WKY controls. A possibly related phenomenon may be the slowed elimination of non-operant nose pokes in SHR observed in the present study. The findings provide support for a steepened delay-of-reinforcement gradient in SHR as suggested in the dynamic developmental theory of ADHD. Altered reinforcement processes characterized by a steeper and shorter delay-of-reinforcement gradient may define an ADHD endophenotype.
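The abstract uses entropy as its measure of variability across the 20 response alternatives but does not give the formula. A minimal sketch, assuming the standard Shannon entropy of the response distribution (in bits), might look like this; the function name and the example counts are hypothetical.

```python
import math


def response_entropy(counts):
    """Shannon entropy (bits) of the distribution of responses over alternatives."""
    total = sum(counts)
    if total == 0:
        raise ValueError("no responses recorded")
    h = 0.0
    for c in counts:
        if c > 0:  # alternatives with no responses contribute nothing
            p = c / total
            h -= p * math.log2(p)
    return h


if __name__ == "__main__":
    spread = [5] * 20                 # responses spread evenly over 20 holes
    clustered = [81] + [1] * 19       # responses concentrated on one hole
    print(response_entropy(spread), response_entropy(clustered))
```

On this measure, evenly spread responding gives the maximum of log2(20) bits, and the value falls toward zero as responding clusters on the target hole.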
Affiliation(s)
- Espen B Johansen
- Department of Physiology, Institute of Basic Medical Sciences, University of Oslo, Oslo, Norway.
19
Abstract
Pigeons were tested in a successive same-different (S/D) discrimination procedure to examine the short-term memory for individual items in sequences of different or identical pictures. Item-by-item analyses of pecking behavior within single trials revealed this S/D discrimination emerged at the earliest possible point in the sequence--the presentation of the second item. Further, by comparing peck rates at points where different types of sequences diverged (e.g. ABA versus ABC), we determined that the pigeons remembered the first item for at least 4-8s and across one to two intervening items. These results indicate that this S/D discrimination was controlled by relational comparisons of pictorial content across memories of specific items, rather than the detection of low-level perceptual "transients" between items. A second experiment supported this conclusion by showing increased discrimination with longer first item viewing times, consistent with encoding of details about individual pictures. These findings further support a qualitative similarity among birds and primates in possessing a general capacity to judge certain types of stimulus relations, such as stimulus identity and difference. Implications for the temporal continuity of experience in animals are also considered.
Affiliation(s)
- Robert G Cook
- Department of Psychology, Tufts University, MA 02155, USA.
20
Cheng RK, Meck WH, Williams CL. alpha7 Nicotinic acetylcholine receptors and temporal memory: synergistic effects of combining prenatal choline and nicotine on reinforcement-induced resetting of an interval clock. Learn Mem 2006; 13:127-34. [PMID: 16547161 PMCID: PMC1409834 DOI: 10.1101/lm.31506] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.8] [Received: 07/13/2005] [Accepted: 12/06/2005] [Indexed: 11/25/2022]
Abstract
We previously showed that prenatal choline supplementation could increase the precision of timing and temporal memory and facilitate simultaneous temporal processing in mature and aged rats. In the present study, we investigated the ability of adult rats to selectively control the reinforcement-induced resetting of an internal clock as a function of prenatal drug treatments designed to affect the alpha7 nicotinic acetylcholine receptor (alpha7 nAChR). Male Sprague-Dawley rats were exposed to prenatal choline (CHO), nicotine (NIC), methyllycaconitine (MLA), choline + nicotine (CHO + NIC), choline + nicotine + methyllycaconitine (CHO + NIC + MLA), or a control treatment (CON). Beginning at 4-mo-of-age, rats were trained on a peak-interval timing procedure in which food was available at 10-, 30-, and 90-sec criterion durations. At steady-state performance there were no differences in timing accuracy, precision, or resetting among the CON, MLA, and CHO + NIC + MLA treatments. It was observed that the CHO and NIC treatments produced a small, but significant increase in timing precision, but no change in accuracy or resetting. In contrast, the CHO + NIC prenatal treatment produced a dramatic increase in timing precision and selective control of the resetting mechanism with no change in overall timing accuracy. The synergistic effect of combining prenatal CHO and NIC treatments suggests an organizational change in alpha7 nAChR function that is dependent upon a combination of selective and nonselective nAChR stimulation during early development.
Affiliation(s)
- Ruey-Kuang Cheng
- Department of Psychological and Brain Sciences, Duke University, Durham, North Carolina 27708, USA
21
Brown S, Heathcote A. Practice increases the efficiency of evidence accumulation in perceptual choice. J Exp Psychol Hum Percept Perform 2005; 31:289-98. [PMID: 15826231 DOI: 10.1037/0096-1523.31.2.289] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Indexed: 11/08/2022]
Abstract
Most models of choice response time base decisions on evidence accumulated over time. A fundamental distinction among these models concerns whether each piece of evidence is equally weighted (lossless accumulation) or unequally weighted (leaky accumulation). The authors tested a hypothesis derived from A. Heathcote and S. Brown's (2002) self-exciting expert competitor (SEEXC) model of skill acquisition: that evidence accumulation becomes less leaky with practice. The hypothesis was supported by observation that the effects of prime stimuli increased with practice. The authors used metacontrast masked primes, which could not be consciously discriminated by most participants, to avoid methodological problems associated with conscious strategy changes. The form of the law of practice in the data is also shown to be consistent with the SEEXC model.
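The distinction between lossless and leaky accumulation can be made concrete with a toy accumulator. This is a generic sketch, not the SEEXC model: the `leak` parameter and the update rule are assumptions introduced here for illustration. With `leak = 0` every sample counts equally; with `leak > 0` earlier samples, such as a prime shown at the start of a trial, are progressively discounted.

```python
def accumulate(evidence, leak):
    """Sum evidence over time, keeping a fraction (1 - leak) of the running total at each step."""
    total = 0.0
    for sample in evidence:
        total = (1.0 - leak) * total + sample
    return total


if __name__ == "__main__":
    prime_then_nothing = [1.0, 0.0, 0.0, 0.0]
    print(accumulate(prime_then_nothing, 0.0))   # lossless: the prime is fully retained
    print(accumulate(prime_then_nothing, 0.5))   # leaky: the prime's influence has decayed
```

In this toy setting, lowering the leak increases the contribution of an early prime to the final total, which is the direction of the practice effect the abstract reports.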
Affiliation(s)
- Scott Brown
- Department of Cognitive Sciences, University of California, Irvine, CA 92697-5100, USA.
22
Abstract
In Skinner's Reflex Reserve theory, reinforced responses added to a reserve depleted by responding. It could not handle the finding that partial reinforcement generated more responding than continuous reinforcement, but it would have worked if its growth had depended not just on the last response but also on earlier responses preceding a reinforcer, each weighted by delay. In that case, partial reinforcement generates steady states in which reserve decrements produced by responding balance increments produced when reinforcers follow responding. A computer simulation arranged schedules for responses produced with probabilities proportional to reserve size. Each response subtracted a fixed amount from the reserve and added an amount weighted by the reciprocal of the time to the next reinforcer. Simulated cumulative records and quantitative data for extinction, random-ratio, random-interval, and other schedules were consistent with those of real performances, including some effects of history. The model also simulated rapid performance transitions with changed contingencies that did not depend on molar variables or on differential reinforcement of inter-response times. The simulation can be extended to inhomogeneous contingencies by way of continua of reserves arrayed along response and time dimensions, and to concurrent performances and stimulus control by way of different reserves created for different response classes.
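The mechanism described in this abstract can be sketched as a small discrete-time simulation. This is a loose illustration under stated assumptions, not the article's simulation: the time step, the random-ratio schedule, the parameter values, the `+ 1` added to each delay to avoid division by zero, and the clamping of the reserve to the range 0 to 1 are all choices made here.

```python
import random


def simulate_reserve(steps=5000, p_reinforce=0.1, decrement=0.01,
                     increment=0.1, seed=1):
    """Run a toy reserve model and return (final reserve, number of responses)."""
    rng = random.Random(seed)
    reserve = 0.5
    pending = []        # time steps of responses since the last reinforcer
    responses = 0
    for t in range(steps):
        if rng.random() < reserve:            # response probability tracks reserve size
            responses += 1
            reserve -= decrement              # every response depletes the reserve
            pending.append(t)
            if rng.random() < p_reinforce:    # random-ratio schedule (assumed)
                for t_resp in pending:
                    delay = t - t_resp + 1    # +1 avoids division by zero
                    reserve += increment / delay   # weight by reciprocal of delay
                pending.clear()
        reserve = min(1.0, max(0.0, reserve))  # keep the reserve a valid probability
    return reserve, responses


if __name__ == "__main__":
    print(simulate_reserve())                   # intermittent reinforcement
    print(simulate_reserve(p_reinforce=0.0))    # extinction: the reserve only drains
```

Setting `p_reinforce` to zero models extinction in this sketch: with no increments, the reserve can only be depleted, so responding stops after a bounded number of responses.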
Affiliation(s)
- A Charles Catania
- Department of Psychology, University of Maryland, Baltimore County (UMBC), Baltimore, MD 21250, USA.
|
23
|
Sargisson RJ, White KG. On the form of the forgetting function: the effects of arithmetic and logarithmic distributions of delays. J Exp Anal Behav 2004; 80:295-309. [PMID: 14964709 PMCID: PMC1284961 DOI: 10.1901/jeab.2003.80-295] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 10/25/2022]
Abstract
Forgetting functions with 18 delay intervals were generated for delayed matching-to-sample performance in pigeons. Delay interval variation was achieved by arranging five different sets of five delays across daily sessions. In different conditions, the delays were distributed in arithmetic or logarithmic series. There was no convincing evidence for different effects on discriminability of the distributions of different delays. The mean data were better fitted by some mathematical functions than by others, but the best-fitting functions depended on the distribution of delays. In further conditions with a fixed set of five delays, discriminability was higher with a logarithmic distribution of delays than with an arithmetic distribution. This result is consistent with the treatment of the forgetting function in terms of generalization decrement.
|
24
|
Abstract
Mathematical Principles of Reinforcement (MPR) is a theory of reinforcement schedules. This paper reviews the origin of the principles constituting MPR: arousal, association and constraint. Incentives invigorate responses, in particular those preceding and predicting the incentive. The process that generates an associative bond between stimuli, responses and incentives is called coupling. The combination of arousal and coupling constitutes reinforcement. Models of coupling play a central role in the evolution of the theory. The time required to respond constrains the maximum response rates, and generates a hyperbolic relation between rate of responding and rate of reinforcement. Models of control by ratio schedules are developed to illustrate the interaction of the principles. Correlations among parameters are incorporated into the structure of the models, and assumptions that were made in the original theory are refined in light of current data.
Affiliation(s)
- Peter R. Killeen
- Corresponding author. Tel.: +1-602-9652555; fax: +1-602-9658544. E-mail address: (P.R. Killeen)
|
25
|
Cook RG, Kelly DM, Katz JS. Successive two-item same-different discrimination and concept learning by pigeons. Behav Processes 2003; 62:125-144. [PMID: 12729974 DOI: 10.1016/s0376-6357(03)00022-6] [Citation(s) in RCA: 40] [Impact Index Per Article: 1.9] [Indexed: 10/27/2022]
Abstract
Four pigeons were trained in a successive same/different procedure involving the alternation of two stimuli per trial. Using a go/no-go procedure, two different or two identical color photographs were alternated, with a brief, dark, inter-stimulus interval, on a computer screen for 20 s. Pigeons learned to discriminate between same (S+) and different (D-) sequences with moderate to large contrasts between successive pictures. Analyses of pecking behavior within single trials revealed this discrimination emerged at the earliest possible point in the sequence (i.e., by the presentation of the second item). Pigeons transferred to novel color and gray-scale pictures, and showed savings in tests with novel video stimuli. These results suggest that same/different discrimination and concept formation can be acquired with successively presented pairs of stimuli by pigeons. When combined with results using simultaneous same/different presentations, these findings further support a qualitative similarity among birds and primates in their capacity to judge certain types of stimulus relations.
Affiliation(s)
- Robert G. Cook
- Department of Psychology, 530 Bacon Hall, Tufts University, 02155, Medford, MA, USA
|
26
|
Killeen PR. Complex dynamic processes in sign tracking with an omission contingency (negative automaintenance). J Exp Psychol Anim Behav Process 2003; 29:49-61. [PMID: 12561133 PMCID: PMC2643130 DOI: 10.1037/0097-7403.29.1.49] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Indexed: 11/08/2022]
Abstract
Hungry pigeons received food periodically, signaled by the onset of a keylight. Key pecks aborted the feeding. Subjects responded for thousands of trials, despite the contingent nonreinforcement, with varying probability as the intertrial interval was varied. Hazard functions showed the dominant tendency to be perseveration in responding and not responding. Once perseveration was accounted for, a linear operator model of associative conditioning further improved predictions. Response rates during trials were correlated with the prior probabilities of a response. Rescaled range analyses showed that the behavioral trajectories were a kind of fractional Brownian motion.
Affiliation(s)
- Peter R Killeen
- Department of Psychology, Arizona State University, Box 1104, Tempe, Arizona 85287-1104, USA.
|
27
|
Killeen PR, Hall SS, Reilly MP, Kettle LC. Molecular analyses of the principal components of response strength. J Exp Anal Behav 2002; 78:127-160. [PMID: 12216975 PMCID: PMC1284892 DOI: 10.1901/jeab.2002.78-127] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.0] [Indexed: 11/22/2022]
Abstract
Killeen and Hall (2001) showed that a common factor called strength underlies the key dependent variables of response probability, latency, and rate, and that overall response rate is a good predictor of strength. In a search for the mechanisms that underlie those correlations, this article shows that (a) the probability of responding on a trial is a two-state Markov process; (b) latency and rate of responding can be described in terms of the probability and period of stochastic machines called clocked Bernoulli modules, and (c) one such machine, the refractory Poisson process, provides a functional relation between the probability of observing a response during any epoch and the rate of responding. This relation is one of proportionality at low rates and curvilinearity at higher rates.
Affiliation(s)
- Peter R Killeen
- Department of Psychology, Arizona State University, Tempe 85287-1104, USA.
|
28
|
Machado A, Keen R. Relative numerosity discrimination in the pigeon: further tests of the linear-exponential-ratio model. Behav Processes 2002; 57:131-148. [PMID: 11947994 DOI: 10.1016/s0376-6357(02)00010-4] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.6] [Indexed: 11/28/2022]
Abstract
This study tested a model of how animals discriminate the relative numerosity of stimuli in successive or sequential presentation tasks. In a discrete-trials procedure, pigeons were shown one light n_f times and then another n_l times. Next they received food for choosing the light that had occurred the least number of times during the sample. At issue were (a) how performance varies with the interval between the two stimulus sets (the interblock interval) and the interval between the end of the sample and the beginning of the choice period (the retention interval); and (b) whether a simple mathematical model of the discrimination process could account for the data. The model assumed that the influence of a stimulus on choice increases linearly when the stimulus is presented, but decays exponentially when the stimulus is absent; choice probability is given by the ratio of the influence values of the two stimuli. The model also assumed that as the retention interval elapses there is an increasing probability that the ongoing discriminative process is disrupted, after which the animal responds randomly. Results showed that increasing the interblock intervals reduced the probability of choosing the last stimulus of the sample as the least-frequent one. Increasing the retention interval reduced accuracy without inducing any stimulus bias. The model accounted well for the major trends in the data.
Affiliation(s)
- Armando Machado
- Instituto de Educação e Psicologia, Universidade do Minho, 4710, Braga, Portugal
|
29
|
Abstract
Two processes may contribute to the decrement in discriminability with increasing temporal distance between the occasioning event and later choice. One is the length of the interval and the other is generalization decrement. In the model described by White and Wixted [J. Exp. Anal. Behav. 71 (1999) 91-113], choice was predicted by the relative payoff for correct delayed matching responses, conditional on the current value of the stimulus sampled from Thurstone-like probability distributions of the effect of the sample stimuli. In the model, discriminability decreased with increasing temporal distance because the variance of the distributions increased with time. However, White and Wixted did not specify the function relating variance to temporal distance. If a diffusion process is assumed, and if diffusion increases exponentially with time, the resulting forgetting function is a negative exponential. An additional process involves exponential generalization of remembering from one time to other times. Alternative diffusion functions result in hyperbolic or power forgetting functions. The combination of two exponential processes yields forgetting functions that are double exponential in form and which appear consistent with a wide range of data.
Affiliation(s)
- K Geoffrey White
- Department of Psychology, University of Otago, Dunedin, New Zealand
|
30
|
|