1
Gallistel CR, Shahan TA. Time-scale invariant contingency yields one-shot reinforcement learning despite extremely long delays to reinforcement. Proc Natl Acad Sci U S A 2024; 121:e2405451121. [PMID: 39008663 PMCID: PMC11287270 DOI: 10.1073/pnas.2405451121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2024] [Accepted: 06/06/2024] [Indexed: 07/17/2024]
Abstract
Reinforcement learning inspires much theorizing in neuroscience, cognitive science, machine learning, and AI. A central question concerns the conditions that produce the perception of a contingency between an action and reinforcement: the assignment-of-credit problem. Contemporary models of associative and reinforcement learning do not leverage the temporal metrics (measured intervals). Our information-theoretic approach formalizes contingency by time-scale invariant temporal mutual information. It predicts that learning may proceed rapidly even with extremely long action-reinforcer delays. We show that rats can learn an action after a single reinforcement, even with a 16-min delay between the action and reinforcement (15-fold longer than any delay previously shown to support such learning). By leveraging metric temporal information, our solution obviates the need for windows of associability, exponentially decaying eligibility traces, microstimuli, or distributions over Bayesian belief states. Its three equations have no free parameters; they predict one-shot learning without iterative simulation.
Affiliation(s)
- Charles R. Gallistel
- Department of Psychology & Rutgers Center for Cognitive Sciences, Rutgers The State University of New Jersey, Piscataway, NJ 08854-8020
- Timothy A. Shahan
- Department of Psychology, Utah State University, Logan, UT 84322-2810
2
Ribes-Iñesta E, Hernández V, Serrano M. Temporal contingencies are dependent on space location: Distal and proximal concurrent water schedules. Behav Processes 2020; 181:104256. [PMID: 33161069 DOI: 10.1016/j.beproc.2020.104256] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/13/2020] [Revised: 09/09/2020] [Accepted: 09/23/2020] [Indexed: 10/23/2022]
Abstract
Two studies evaluated the effect of delivering water contingent on lever pressing at a location proximal to or distant from the water-producing response. Effects were measured on the spatial distribution of behavior and on the frequency and patterning of lever pressing. In both experiments water was available under two concurrent, complementary fixed-interval schedules in two dispensers located at opposite ends of the chamber. The proportion of water deliveries in one dispenser relative to the second dispenser varied between phases, while the overall frequency was kept constant. In one study rats received water from a dispenser proximal to the water-producing response location, whereas in the second study rats received water from the dispenser on the panel opposite to where the response was emitted. The number of obtained water deliveries varied according to the programmed proportion, but rats obtained fewer deliveries under the distal location contingency. No systematic variations in space allocation were observed in either experiment. The results are discussed in terms of the importance of considering the continuous interaction of time and space parameters in the analysis of behavior.
3
Reed P. Human free-operant performance varies with a concurrent task: Probability learning without a task, and schedule-consistent with a task. Learn Behav 2020; 48:254-273. [PMID: 31898165 PMCID: PMC7275008 DOI: 10.3758/s13420-019-00398-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/12/2022]
Abstract
Three experiments examined human rates and patterns of responding during exposure to various schedules of reinforcement with or without a concurrent task. In the presence of the concurrent task, performances were similar to those typically noted for nonhumans. Overall response rates were higher on medium-sized ratio schedules than on smaller or larger ratio schedules (Experiment 1), on interval schedules with shorter than longer values (Experiment 2), and on ratio compared with interval schedules with the same rate of reinforcement (Experiment 3). Moreover, bout-initiation responses were more susceptible to influence by rates of reinforcement than were within-bout responses across all experiments. In contrast, in the absence of a concurrent task, human schedule performance did not always display characteristics of nonhuman performance, but tended to be related to the relationship between rates of responding and reinforcement (feedback function), irrespective of the schedule of reinforcement employed. This was also true of within-bout responding, but not bout-initiations, which were not affected by the presence of a concurrent task. These data suggest the existence of two strategies for human responding on free-operant schedules, relatively mechanistic ones that apply to bout-initiation, and relatively explicit ones, that tend to apply to within-bout responding, and dominate human performance when other demands are not made on resources.
Affiliation(s)
- Phil Reed
- Department of Psychology, Swansea University, Singleton Park, Swansea, SA2 8PP, UK.
4
Continuous Measuring of Temporal and Spatial Changes in Rats’ Behavior under Water Temporal Schedules. PSYCHOLOGICAL RECORD 2020. [DOI: 10.1007/s40732-020-00389-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 10/24/2022]
5
Walters K, Thomson K. The history of behavior analysis in Manitoba: a sparsely populated Canadian province with an international influence on behavior analysis. THE BEHAVIOR ANALYST 2013; 36:57-72. [PMID: 25729132 PMCID: PMC3640888 DOI: 10.1007/bf03392292] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
Abstract
This article examines the convergence of factors that led to behavior analysis taking root, flourishing, and bearing fruit in a prairie province of Canada. In the latter half of the 1960s, Garry Martin and Joseph Pear began teaching behavior-analytic courses at the University of Manitoba. They and their students then initiated behavioral treatment and research programs at the Manitoba Developmental Center and St.Amant, the two main residential facilities for persons with intellectual disabilities and autism. Since that time, behavior analysis in Manitoba has flourished, and the knowledge and skills gained have been shared with other behavior analysts throughout the world through conferences, articles, and books. Behavior-analytic books by authors who live and work in Manitoba have been translated into eight languages. Moreover, University of Manitoba graduates in behavior analysis have helped to spread knowledge of behavior analysis throughout the world, and a number have achieved highly influential positions and widespread recognition within the discipline.
6
Mazur JE, Hyslop ME. Fixed-ratio performance with and without a postreinforcement timeout. J Exp Anal Behav 2010; 38:143-55. [PMID: 16812293 PMCID: PMC1347809 DOI: 10.1901/jeab.1982.38-143] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 10/25/2022]
Abstract
Pigeons pecked a key, producing food reinforcement on fixed-ratio (FR) schedules requiring 50, 100, or 150 responses. In each session, 30-second timeouts were inserted before a random half of the FR trials, whereas the other trials began immediately after reinforcement. In general, preratio pauses were shorter on trials preceded by timeouts. On these trials, the probability of a first response tended to be highest in the first 20 seconds of the trials, suggesting that the shorter pauses were the result of transient behavioral contrast. Direct observations and analyses of interresponse times (IRTs) after the preratio pause indicated that IRTs could be grouped into three categories: (1) IRTs of about .1 second, which were produced by small head movements in the vicinity of the key; (2) IRTs of about .3 second, which were produced by distinct pecking motions; and (3) IRTs greater than .5 second, which were accompanied by pausing or movements away from the key. At all ratio sizes, as a subject progressed through a trial, the probability of a long IRT decreased, whereas the probability of an intermediate IRT usually increased at first and then decreased. The probability of a short IRT increased monotonically across a trial. The results show that responding changes systematically as a subject progresses through a ratio on an FR schedule. Some characteristics of performance varied as functions of the absolute size of the response requirement, whereas others appeared to depend on the relative location within a ratio (i.e., the proportion of the ratio completed at a given moment).
7
Pear JJ. Spatiotemporal patterns of behavior produced by variable-interval schedules of reinforcement. J Exp Anal Behav 2010; 44:217-31. [PMID: 16812432 PMCID: PMC1348179 DOI: 10.1901/jeab.1985.44-217] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.1] [Indexed: 10/25/2022]
Abstract
The spatiotemporal patterns of behavior exhibited by two pigeons during a variable-interval 15-second schedule of food reinforcement, a variable-interval 5-minute schedule, and then extinction of key pecking were recorded using an apparatus that continuously tracked the position of the bird in the experimental chamber. The variable-interval 15-second schedule produced a close-to-key pattern between reinforcements with two types of regular excursions from the region of the key frequently occurring after reinforcement. Subsequent exposure to the variable-interval 5-minute schedule produced more extended and extremely regular patterns between responses. Reinstatement of the variable-interval 15-second schedule reestablished the close-to-key pattern with regular excursions frequently occurring after reinforcement. During extinction the spatiotemporal patterns that had developed during the variable-interval 5-minute schedule reappeared and gradually dissipated. These patterns may have been a form of superstitious behavior.
8
Abstract
Three groups of rats pressed a lever for milk reinforcers on various simple reinforcement schedules (one schedule per condition). In Group M, each pair of conditions included a mixed-ratio schedule and a fixed-ratio schedule with equal average response:reinforcer ratios. On mixed-ratio schedules, reinforcement occurred with equal probability after a small or a large response requirement was met. In Group R, fixed-ratio and random-ratio schedules were compared in each pair of conditions. For all subjects in these two groups, the frequency distributions of interresponse times of less than one second were very similar on all ratio schedules, exhibiting a peak at about .2 seconds. For comparison, subjects in Group V responded on variable-interval schedules, and few interresponse times as short as .2 seconds were recorded. The results suggest that the rate of continuous responding is the same on all ratio schedules, and what varies among ratio schedules is the frequency, location, and duration of pauses. Preratio pauses were longer on fixed-ratio schedules than on mixed-ratio or random-ratio schedules, but there was more within-ratio pausing on mixed-ratio and random-ratio schedules. Across a single trial, the probability of an interruption in responding decreased on fixed-ratio schedules, was roughly constant on random-ratio schedules, and often increased and then decreased on mixed-ratio schedules. These response patterns provided partial support for Mazur's (1982) theory that the probability of instrumental responding is directly related to the probability of reinforcement and the proximity of reinforcement.
9
Abstract
On a given variable-interval schedule, the average obtained rate of reinforcement depends on the average rate of responding. An expression for this feedback effect is derived from the assumptions that free-operant responding occurs in bursts with a constant tempo, alternating with periods of engagement in other activities; that the durations of bursts and other activities are exponentially distributed; and that the rates of initiating and terminating bursts are inversely related. The expression provides a satisfactory account of the data of three experiments.
10
Abstract
A model of performance under concurrent variable-interval reinforcement schedules that takes as its starting point the hypothetical "burst" structure of operant responding is presented. Undermatching and overmatching are derived from two separate, and opposing, tendencies. The first is a tendency to allocate a certain proportion of response bursts randomly to a response alternative without regard for the rate of reinforcement it provides, others being allocated according to the simple matching law. This produces undermatching. The second is a tendency to prolong response bursts that have a high probability of initiation relative to those for which initiation probability is lower. This process produces overmatching. A model embodying both tendencies predicts (1) that undermatching will be more common than overmatching, (2) that overmatching, when it occurs, will tend to be of limited extent. Both predictions are consistent with available data. The model thus accounts for undermatching and overmatching deviations from the matching law in terms of additional processes added on to behavior allocation obeying the simple matching relation. Such a model thus enables processes that have been hypothesized to underlie matching, such as some type of reinforcement rate or probability optimization, to remain as explanatory mechanisms even though the simple matching law may not generally be obeyed.
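The first tendency described in this abstract can be illustrated with a small numerical sketch. This is not the authors' model: the function name `burst_allocation`, the parameter `p_random`, and the assumption that randomly allocated bursts split equally (50/50) between two alternatives are all hypothetical choices made for illustration.

```python
def burst_allocation(r1, r2, p_random):
    """Proportion of response bursts going to alternative 1.

    Assumption (not from the abstract): a fraction p_random of bursts is
    split 50/50 between the two alternatives regardless of reinforcement,
    and the remainder follows the simple matching law r1 / (r1 + r2).
    """
    if r1 < 0 or r2 < 0 or r1 + r2 == 0:
        raise ValueError("reinforcement rates must be non-negative and not both zero")
    if not 0.0 <= p_random <= 1.0:
        raise ValueError("p_random must lie in [0, 1]")
    matching = r1 / (r1 + r2)  # simple matching prediction
    return p_random * 0.5 + (1.0 - p_random) * matching


# Reinforcement rates of 60 vs. 20 per hour: strict matching predicts 0.75.
for p in (0.0, 0.2, 0.5):
    print(p, round(burst_allocation(60, 20, p), 3))
```

As `p_random` grows, the allocated proportion moves from the matching value toward indifference (0.5), which is the direction of undermatching. The burst-prolonging tendency that produces overmatching is not sketched here.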
11
Abstract
Behavior dynamics is a field devoted to analytic descriptions of behavior change. A principal source of both models and methods for these descriptions is found in physics. This approach is an extension of a long conceptual association between behavior analysis and physics. A theme common to both is the role of molar versus molecular events in description and prediction. Similarities and differences in how these events are treated are discussed. Two examples are presented that illustrate possible correspondence between mechanical and behavioral systems. The first demonstrates the use of a mechanical model to describe the molar properties of behavior under changing reinforcement conditions. The second, dealing with some features of concurrent schedules, focuses on the possible utility of nonlinear dynamical systems to the description of both molar and molecular behavioral events as the outcome of a deterministic, but chaotic, process.
12
Davison M, Elliffe D. Variance matters: The shape of a datum. Behav Processes 2009; 81:216-22. [DOI: 10.1016/j.beproc.2009.01.004] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 08/16/2008] [Revised: 12/27/2008] [Accepted: 01/21/2009] [Indexed: 10/21/2022]
13
Podlesnik CA, Jimenez-Gomez C, Ward RD, Shahan TA. Resistance to change of responding maintained by unsignaled delays to reinforcement: a response-bout analysis. J Exp Anal Behav 2006; 85:329-47. [PMID: 16776055 PMCID: PMC1459851 DOI: 10.1901/jeab.2006.47-05] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Indexed: 11/22/2022]
Abstract
Previous experiments have shown that unsignaled delayed reinforcement decreases response rates and resistance to change. However, the effects of different delays to reinforcement on underlying response structure have not been investigated in conjunction with tests of resistance to change. In the present experiment, pigeons responded on a three-component multiple variable-interval schedule for food presented immediately, following brief (0.5 s), or following long (3 s) unsignaled delays of reinforcement. Baseline response rates were lowest in the component with the longest delay; they were about equal with immediate and briefly delayed reinforcers. Resistance to disruption by presession feeding, response-independent food during the intercomponent interval, and extinction was slightly but consistently lower as delays increased. Because log survivor functions of interresponse times (IRTs) deviated from simple modes of bout initiations and within-bout responding, an IRT-cutoff method was used to examine underlying response structure. These analyses suggested that baseline rates of initiating bouts of responding decreased as scheduled delays increased, and within-bout response rates tended to be lower in the component with immediate reinforcers. The number of responses per bout was not reliably affected by reinforcer delay, but tended to be highest with brief delays when total response rates were higher in that component. Consistent with previous findings, resistance to change of overall response rate was highly correlated with resistance to change of bout-initiation rates but not with within-bout responding. These results suggest that unsignaled delays to reinforcement affect resistance to change through changes in the probability of initiating a response bout rather than through changes in the underlying response structure.
14
Ribes-Iñesta E, Torres C, Correa L, Montes E. Effects of concurrent random-time schedules on the spatial distribution of behavior in rats. Behav Processes 2006; 73:41-8. [PMID: 16530984 DOI: 10.1016/j.beproc.2006.02.001] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 06/27/2005] [Revised: 02/07/2006] [Accepted: 02/07/2006] [Indexed: 11/21/2022]
Abstract
We evaluated the effects of concurrent random temporal contingencies of water delivery on the location of rats' behavior. In this experiment, two concurrent random-time schedules delivered water in two dispensers that were located at opposite ends of the chamber, and provided complementary frequencies of water deliveries while the overall number of deliveries stayed constant. Time allocation in the areas adjacent to each water dispenser covaried with water deliveries, although no proportional relation was found. Rats showed a preference for the area where water was initially presented. The results point to the importance of examining spatial properties and patterning of behavior.
15
Abstract
Rats obtained food pellets on a variable-interval schedule of reinforcement by nose poking a lighted key. After training to establish baseline performance (with the mean variable interval set at either 60, 120, or 240 s), the rats were given free access to food during the hour just before their daily session. This satiation operation reduced the rate of key poking. Analysis of the interresponse time distributions (log survivor plots) indicated that key poking occurred in bouts. Prefeeding lengthened the pauses between bouts, shortened the length of bouts (less reliably), and had a relatively small decremental effect on the response rate within bouts. That deprivation level affects mainly between-bout pauses has been reported previously with fixed-ratio schedules. Thus, when the focus is on bouts, the performances maintained by variable-interval schedules and fixed-ratio schedules are similarly affected by deprivation.
Affiliation(s)
- Richard L Shull
- Department of Psychology, The University of North Carolina at Greensboro, 27402-6170, USA.
16
Shull RL, Grimes JA, Bennett JA. Bouts of responding: the relation between bout rate and the rate of variable-interval reinforcement. J Exp Anal Behav 2004; 81:65-83. [PMID: 15113134 PMCID: PMC1284972 DOI: 10.1901/jeab.2004.81-65] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.0] [Indexed: 10/25/2022]
Abstract
By nose poking a lighted key, rats obtained food pellets on either a variable-interval schedule of reinforcement or a schedule that required an average of four additional responses after the end of the variable-interval component (a tandem variable-interval variable-ratio 4 schedule). With both schedule types, the mean variable interval was varied between blocks of sessions from 16 min to 0.25 min. Total rate of key poking increased similarly as a function of the reinforcer rate for the two schedule types, but response rate was higher with than without the four-response requirement. Analysis of log survivor plots of interresponse times showed that key poking occurred in bouts. The rate of initiating bouts increased as a function of reinforcer rate but was either unaffected or was decreased by adding the four-response requirement. Within-bout response rate was insensitive to reinforcer rate and only inconsistently affected by the four-response requirement. For both kinds of schedule, the ratio of bout time to between-bout pause time was approximately a power function of reinforcer rate, with exponents above and below 1.0.
Affiliation(s)
- Richard L Shull
- Department of Psychology, Box 26170, The University of North Carolina-Greensboro, Greensboro, North Carolina 27402-6170, USA.
17
Shull RL, Grimes JA. Bouts of responding from variable-interval reinforcement of lever pressing by rats. J Exp Anal Behav 2004; 80:159-71. [PMID: 14674726 PMCID: PMC1284951 DOI: 10.1901/jeab.2003.80-159] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.1] [Indexed: 11/22/2022]
Abstract
Four rats obtained food pellets by lever pressing. A variable-interval reinforcement schedule assigned reinforcers on average every 2 min during one block of 20 sessions and on average every 8 min during another block. Also, at each variable-interval duration, a block of sessions was conducted with a schedule that imposed a variable-ratio 4 response requirement after each variable interval (i.e., a tandem variable-time variable-ratio 4 schedule). The total rate of lever pressing increased as a function of the rate of reinforcement and as a result of imposing the variable-ratio requirement. Analysis of log survivor plots of interresponse times indicated that lever pressing occurred in bouts that were separated by pauses. Increasing the rate of reinforcement increased total response rate by increasing the rate of initiating bouts and, less reliably, by lengthening bouts. Imposing the variable-ratio component increased response rate mainly by lengthening bouts. This pattern of results is similar to that reported previously with key poking as the response. Also, response rates within bouts were relatively insensitive to either variable.
Affiliation(s)
- Richard L Shull
- Department of Psychology, The University of North Carolina at Greensboro, 27402-6170, USA.
18
Reed P, Hildebrandt T, DeJongh J, Soh M. Rats' performance on variable-interval schedules with a linear feedback loop between response rate and reinforcement rate. J Exp Anal Behav 2003; 79:157-73. [PMID: 12822684 PMCID: PMC1284927 DOI: 10.1901/jeab.2003.79-157] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.4] [Indexed: 10/25/2022]
Abstract
Three experiments investigated whether rats are sensitive to the molar properties of a variable-interval (VI) schedule with a positive relation between response rate and reinforcement rate (i.e., a VI+ schedule). In Experiment 1, rats responded faster on a variable ratio (VR) schedule than on a VI+ schedule with an equivalent feedback function. Reinforced interresponse times (IRTs) were shorter on the VR as compared to the VI+ schedule. In Experiments 2 and 3, there was no systematic difference in response rates maintained by a VI+ schedule and a VI schedule yoked in terms of reinforcement rate. This was found both when the yoking procedure was between-subject (Experiment 2) and within-subject (Experiment 3). Mean reinforced IRTs were similar on both the VI+ and yoked VI schedules, but these values were more variable on the VI+ schedule. These results provided no evidence that rats are sensitive to the feedback function relating response rate to reinforcement rate on a VI+ schedule.
Affiliation(s)
- Phil Reed
- Department of Psychology, University College London, United Kingdom.
19
Kirkpatrick K, Church RM. Tracking of the expected time to reinforcement in temporal conditioning procedures. Learn Behav 2003; 31:3-21. [PMID: 18450066 DOI: 10.3758/bf03195967] [Citation(s) in RCA: 55] [Impact Index Per Article: 2.6] [Received: 01/12/2001] [Accepted: 09/13/2002] [Indexed: 11/08/2022]
Abstract
In one experiment, the rate and pattern of responding (head entry into the food cup) under different distributions of intervals between food deliveries were examined. Separate groups of rats received fixed-time (45, 90, 180, or 360 sec), random-time (45, 90, 180, or 360 sec), or tandem fixed-time (45 or 90 sec) random-time (45 or 90 sec) schedules of reinforcement. Schedule type affected the pattern of responding as a function of time, whereas mean interval duration affected the mean rate of responding. Responses occurred in bouts with characteristics that were invariant across conditions. Packet theory, which assumes that the momentary probability of bout occurrence is negatively related to the conditional expected time remaining until the next reinforcer, accurately predicted global and local measures of responding. The success of the model advances the prediction of multiple measures of responding across different types of time-based schedules.
20
Shull RL, Gaynor ST, Grimes JA. Response rate viewed as engagement bouts: resistance to extinction. J Exp Anal Behav 2002; 77:211-31. [PMID: 12083677 PMCID: PMC1284858 DOI: 10.1901/jeab.2002.77-211] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.0] [Indexed: 11/22/2022]
Abstract
Rats obtained food pellets by nose poking a lighted key, the illumination of which alternated every 50 s during a session between blinking and steady, signaling either a relatively rich (60 per hour) or relatively lean (15 per hour) rate of reinforcement. During one training condition, all the reinforcers in the presence of the rich-reinforcement signal were response dependent (i.e., a variable-interval schedule); during another condition only 25% were response dependent (i.e., a variable-time schedule operated concurrently with a variable-interval schedule). An extinction session followed each training block. For both kinds of training schedule, and consistent with prior results, response rate was more resistant to extinction in the presence of the rich-reinforcement signal than in the presence of the lean-reinforcement signal. Analysis of interresponse-time distributions from baseline showed that differential resistance to extinction was not related to baseline differences in the rate of initiating response bouts or in the length of bouts. Also, bout-initiation rate (like response rate) was most resistant to extinction in the presence of the rich-reinforcement signal. These results support the proposal of behavioral momentum theory (e.g., Nevin & Grace, 2000) that resistance to extinction in the presence of a discriminative stimulus is determined more by the stimulus-reinforcer (Pavlovian) than by the stimulus-response-reinforcer (operant) contingency.
Affiliation(s)
- Richard L Shull
- Department of Psychology, University of North Carolina at Greensboro, 27402-6164, USA.
21
Abstract
Packet theory is based on the assumption that the momentary probability of producing a bout or packet of responding is controlled by the conditional expected time function. Bouts of head entry responses of rats into a food cup appear to have the same characteristics across a range of conditions. The conditional expected time function is the mean expected time remaining until the next food delivery as a function of time since an event such as food or stimulus onset. The conditional expected time function encodes mean interval duration as well as the distribution form so that both the mean response rate and form of responding in time can be predicted. Simulations of Packet theory produced accurate quantitative predictions of: (1) the effect of reinforcement density (mean food-food interval) and distribution form on responding; (2) scalar variance in fixed interval responding; (3) CS-US and intertrial interval effects on the strength of conditioning; and (4) the effect of the ratio of cycle:trial time on the strength of conditioning.
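The conditional expected time function defined in this abstract can be estimated from a sample of food-food intervals. The sketch below is an empirical estimator only, not a simulation of Packet theory; the function name `conditional_expected_time` and the example interval values are illustrative assumptions.

```python
def conditional_expected_time(intervals, t):
    """Mean time remaining until the next food delivery, given that t
    seconds have elapsed since the last one without food.

    Only intervals longer than t are still "running" at time t, so the
    estimate averages (interval - t) over those intervals. Returns None
    when no sampled interval exceeds t.
    """
    remaining = [x - t for x in intervals if x > t]
    if not remaining:
        return None
    return sum(remaining) / len(remaining)


fixed = [90.0] * 5                # fixed-time 90-s schedule (hypothetical sample)
variable = [30.0, 90.0, 150.0]    # hypothetical variable intervals, mean 90 s

for t in (0.0, 30.0, 60.0):
    print(t, conditional_expected_time(fixed, t), conditional_expected_time(variable, t))
```

With the fixed sample the expected time remaining falls linearly with elapsed time, whereas with the variable sample it does not, even though both samples share the same mean. This is the sense in which the function carries both mean interval duration and distribution form.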
22
Shull RL, Gaynor ST, Grimes JA. Response rate viewed as engagement bouts: effects of relative reinforcement and schedule type. J Exp Anal Behav 2001; 75:247-74. [PMID: 11453618 PMCID: PMC1284817 DOI: 10.1901/jeab.2001.75-247] [Citation(s) in RCA: 100] [Impact Index Per Article: 4.3] [Indexed: 10/25/2022]
Abstract
The rate of a reinforced response is conceptualized as a composite of engagement bouts (visits) and responding during visits. Part 1 of this paper describes a method for estimating the rate of visit initiations and the average number of responses per visit from log survivor plots: the proportion of interresponse times (IRTs) longer than some elapsed time (log scale) plotted as a function of elapsed time. In Part 2 the method is applied to IRT distributions from rats that obtained food pellets by nose poking a lighted key under various multiple schedules of reinforcement. As expected, total response rate increased as a function of (a) increasing the rate of reinforcement (i.e., variable-interval [VI] 4 min vs. VI 1 min), (b) increasing the amount of the reinforcer (one food pellet vs. four pellets), (c) increasing the percentage of reinforcers that were contingent on nose poking (25% vs. 100%), and (d) requiring additional responses after the end of the VI schedule (i.e., adding a tandem variable-ratio [VR] 9 requirement). The first three of these variables (relative reinforcement) increased the visit-initiation rate. The tandem VR, in contrast, increased the number of responses per visit. Thus, variables that have similar effects on total response rate can be differentiated based on their effects on the components of response rate.
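The log survivor plot defined in this abstract can be computed directly from a list of IRTs. The sketch below covers only that definition (the log of the proportion of IRTs longer than each elapsed time); it does not reproduce the paper's procedure for estimating visit-initiation rate or responses per visit. The function name `log_survivor` and the sample IRTs are hypothetical.

```python
import math


def log_survivor(irts, times):
    """Log10 of the proportion of IRTs longer than each elapsed time.

    Returns one value per entry in `times`; negative infinity marks
    elapsed times that no IRT exceeds (log of zero).
    """
    if not irts:
        raise ValueError("need at least one interresponse time")
    n = len(irts)
    values = []
    for t in times:
        longer = sum(1 for x in irts if x > t)
        values.append(math.log10(longer / n) if longer else float("-inf"))
    return values


# Hypothetical IRTs in seconds: three short within-bout IRTs and two
# long between-bout pauses.
irts = [0.2, 0.3, 0.4, 5.0, 10.0]
for t, y in zip([0.0, 1.0, 6.0], log_survivor(irts, [0.0, 1.0, 6.0])):
    print(t, round(y, 3))
```

When responding occurs in bouts, such a plot typically drops steeply over short elapsed times and then more gradually over long ones, because short within-bout IRTs are exhausted quickly while long between-bout pauses remain.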
Affiliation(s)
- R L Shull
- Department of Psychology, University of North Carolina at Greensboro, 27402-6164, USA.
|
23
|
Ribes-Iñesta E, Torres C. The spatial distribution of behavior under varying frequencies of temporally scheduled water delivery. J Exp Anal Behav 2000; 73:195-209. [PMID: 10784009 PMCID: PMC1284771 DOI: 10.1901/jeab.2000.73-195] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/22/2022]
Abstract
Two studies evaluated the effects of response-independent water deliveries on the location (on the floor of the experimental chamber) and position (height) of rats' behavior. In both experiments, fixed-time schedules delivered water in two dispensers that were located at opposite ends of the chamber. In Experiment 1, the two schedules provided complementary frequencies of water deliveries while the overall number of deliveries stayed constant. In Experiment 2, one of the schedules delivered water twice as frequently as the other; this proportion was kept constant while the overall density of water deliveries changed systematically. In both experiments, a single position (height) of behavior was dominant. Also, the percentage of time allocated to each dispenser was roughly proportional to the percentage of water deliveries associated with the dispensers. These data and additional considerations support the importance of examining the spatial properties and patterning of behavior.
|
24
|
Richards JB, Sabol KE, Seiden LS. DRL interresponse-time distributions: quantification by peak deviation analysis. J Exp Anal Behav 1993; 60:361-85. [PMID: 8409824 PMCID: PMC1322182 DOI: 10.1901/jeab.1993.60-361] [Citation(s) in RCA: 71] [Impact Index Per Article: 2.3] [Indexed: 01/30/2023]
Abstract
Peak deviation analysis is a quantitative technique for characterizing interresponse-time distributions that result from training on differential-reinforcement-of-low-rate schedules of reinforcement. It compares each rat's obtained interresponse-time distribution to the corresponding negative exponential distribution that would have occurred if the rat had emitted the same number of responses randomly in time, at the same rate. The comparison of the obtained distributions with corresponding negative exponential distributions provides the basis for computing three standardized metrics (burst ratio, peak location, and peak area) that quantitatively characterize the profile of the obtained interresponse-time distributions. In Experiment 1 peak deviation analysis quantitatively described the difference between the interresponse-time distributions of rats trained on variable-interval 300-s and differential-reinforcement-of-low-rate 72-s schedules of reinforcement. In Experiment 2 peak deviation analysis differentiated between the effects of the psychomotor stimulant d-amphetamine, the anxiolytic compound chlordiazepoxide, and the antidepressant desipramine. The results suggest that peak deviation analysis of interresponse-time distributions may provide a useful behavioral assay system for characterizing the effects of drugs.
Affiliation(s)
- J B Richards
- Department of Pharmacological and Physiological Sciences, University of Chicago, Illinois 60637
|
25
|
Shull RL. Mathematical Description of Operant Behavior: an Introduction. 1991. [DOI: 10.1016/b978-0-444-81251-3.50014-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/19/2023]
|
26
|
|
27
|
Nevin JA, Tota ME, Torquato RD, Shull RL. Alternative reinforcement increases resistance to change: Pavlovian or operant contingencies? J Exp Anal Behav 1990; 53:359-79. [PMID: 2341820 PMCID: PMC1322963 DOI: 10.1901/jeab.1990.53-359] [Citation(s) in RCA: 213] [Impact Index Per Article: 6.3] [Indexed: 12/31/2022]
Abstract
Two multiple-schedule experiments with pigeons examined the effect of adding food reinforcement from an alternative source on the resistance of the reinforced response (target response) to the decremental effects of satiation and extinction. In Experiment 1, key pecks were reinforced by food in two components according to variable-interval schedules and, in some conditions, food was delivered according to variable-time schedules in one of the components. The rate of key pecking in a component was negatively related to the proportion of reinforcers from the alternative (variable-time) source. Resistance to satiation and extinction, in contrast, was positively related to the overall rate of reinforcement in the component. Experiment 2 was conceptually similar except that the alternative reinforcers were contingent on a specific concurrent response. Again, the rate of the target response varied as a function of its relative reinforcement, but its resistance to satiation and extinction varied directly with the overall rate of reinforcement in the component stimulus regardless of its relative reinforcement. Together the results of the two experiments suggest that the relative reinforcement of a response (the operant contingency) determines its rate, whereas the stimulus-reinforcement contingency (a Pavlovian contingency) determines its resistance to change.
Affiliation(s)
- J A Nevin
- Department of Psychology, University of New Hampshire, Durham 03824
|
28
|
Aldiss M, Davison M. Sensitivity of time allocation to concurrent-schedule reinforcement. J Exp Anal Behav 1985; 44:79-88. [PMID: 16812427 PMCID: PMC1348162 DOI: 10.1901/jeab.1985.44-79] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 10/25/2022]
Abstract
Four pigeons were trained on concurrent variable-interval schedules programmed on a center response key, with access to those schedules controlled by responses on left or right side keys. Two procedures were used. In one, the pigeon was given limited access, in that each side-key response produced 3-s access to a center-key schedule, and in the other procedure, access was unlimited. Data were analyzed using the generalized matching law. Comparison of sensitivities to reinforcement of interchangeover time for both procedures showed them to be of similar magnitude. Response sensitivities were also similar in magnitude for both procedures. From the limited-access procedure a second time measure that was available, switched-in time, was relatively uncontaminated by time spent emitting behavior other than key pecking. Sensitivities to reinforcement for the switched-in time measure were always smaller than interchangeover-time sensitivities for either procedure, and were approximately equal to response sensitivities for the limited-access procedure. Two other access times (5 and 7.5 s) were studied to validate the choice of 3 s as the main access time. These results indicate that when time spent emitting other behavior is excluded from interchangeover time, time and response sensitivities will be approximately equal.
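For readers unfamiliar with the analysis named in this abstract, the generalized matching law is commonly written as log(B1/B2) = a log(R1/R2) + log c, where B is behavior (responses or time) allocated to each alternative, R is the reinforcers obtained, a is sensitivity to reinforcement, and c is bias. The sketch below fits a and log c by ordinary least squares on log ratios. It assumes this standard form and is not taken from the paper; the function name and the data values are hypothetical.

```python
import math


def fit_generalized_matching(behavior_ratios, reinforcer_ratios):
    """Least-squares fit of log10(B1/B2) = a * log10(R1/R2) + log10(c).

    Returns (a, log10_c): sensitivity to reinforcement and log bias.
    """
    xs = [math.log10(r) for r in reinforcer_ratios]
    ys = [math.log10(b) for b in behavior_ratios]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sensitivity = sxy / sxx
    log_bias = mean_y - sensitivity * mean_x
    return sensitivity, log_bias


# Hypothetical data: behavior ratios generated with sensitivity 0.8 and no bias.
reinforcer_ratios = [0.25, 0.5, 1.0, 2.0, 4.0]
behavior_ratios = [r ** 0.8 for r in reinforcer_ratios]
sensitivity, log_bias = fit_generalized_matching(behavior_ratios, reinforcer_ratios)
```

Fitting the same function separately to response ratios and to time-allocation ratios would give the two kinds of sensitivity that the abstract compares.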
|