51
Abstract
Procedures classified as positive reinforcement are generally regarded as more desirable than those classified as aversive, that is, those that involve negative reinforcement or punishment. But this is a crude test of the desirability of a procedure to change or maintain behavior. The problems can be identified on the basis of theory, experimental analysis, and consideration of practical cases. Theoretically, the distinction between positive and negative reinforcement has proven difficult (some would say the distinction is untenable). When the distinction is made purely in operational terms, experiments reveal that positive reinforcement has aversive functions. On a practical level, positive reinforcement can lead to deleterious effects, and it is implicated in a range of personal and societal problems. These issues challenge us to identify other criteria for judging behavioral procedures.
52

53
Baron A, Perone M, Galizio M. Analyzing the reinforcement process at the human level: can application and behavioristic interpretation replace laboratory research? The Behavior Analyst 1991; 14:95-105. PMID: 22478086. DOI: 10.1007/bf03392557.
Abstract
Critics have questioned the value of human operant conditioning experiments in the study of fundamental processes of reinforcement. Contradictory results from human and animal experiments have been attributed to the complex social and verbal history of the human subject. On these grounds, it has been contended that procedures that mimic those conventionally used with animal subjects represent a "poor analytic preparation" for the explication of reinforcement principles. In defending the use of conventional operant methods for human research, we make three points: (a) Historical variables play a critical role in research on processes of reinforcement, regardless of whether the subjects are humans or animals. (b) Techniques are available for detecting, analyzing, and counteracting such historical and extra-experimental influences; these include long-term observations, steady state designs, and, when variables are not amenable to direct control (e.g., age, gender, species), selection of subjects with common characteristics. (c) Other forms of evidence that might be used to validate conditioning principles (applied behavior analysis and behavioristic interpretation) have inherent limitations and cannot substitute for experimental analysis. We conclude that human operant conditioning experiments are essential for the analysis of the reinforcement process at the human level, but caution that their value depends on the extent to which the traditional methods of the experimental analysis of behavior are properly applied.
Affiliation(s)
- A Baron
- Department of Psychology, University of Wisconsin-Milwaukee, Milwaukee, WI, USA
54
Abstract
In his effort to distinguish operant from respondent conditioning, Skinner stressed the lack of an eliciting stimulus and rejected the prevailing stereotype of Pavlovian "stimulus-response" psychology. But control by antecedent stimuli, whether classified as conditional or discriminative, is ubiquitous in the natural setting. With both respondent and operant behavior, symmetrical gradients of generalization along unrelated dimensions may be obtained following differential reinforcement in the presence and the absence of the stimulus. The slopes of these gradients serve as measures of stimulus control, and they can be steepened without applying differential reinforcement to any two points along the test dimension. Increases and decreases in stimulus control occur under the same conditions as those leading to increases and decreases in observing responses, indicating that it is the increasing frequency and duration of observation (and perhaps also of attention) that produces the separation in performances during discrimination learning.
Affiliation(s)
- J A Dinsmoor
- Department of Psychology, Indiana University, Bloomington, Indiana, USA
55
Abstract
The second part of my tutorial stresses the systematic importance of two parameters of discrimination training: (a) the magnitude of the physical difference between the positive and the negative stimulus (disparity) and (b) the magnitude of the difference between the positive stimulus, in particular, and the background stimulation (salience). It then examines the role these variables play in such complex phenomena as blocking and overshadowing, progressive discrimination training, and the transfer of control by fading. It concludes by considering concept formation and imitation, which are important forms of application, and recent work on equivalence relations.
56
Abstract
Psychologists have long been intrigued with the rationales that underlie our decisions. Similarly, the concept of conditioned reinforcement has a venerable history, particularly in accounting for behavior not obviously maintained by primary reinforcers. The studies of choice and of conditioned reinforcement have often developed in lockstep. Many contemporary approaches to these fundamental topics share an emphasis on context and on relative value. We trace the evolution of thinking about the potency of conditioned reinforcers from stimuli that were thought to acquire their value from pairings with more fundamental reinforcers to stimuli that acquire their value by being differentially correlated with these more fundamental reinforcers. We discuss some seminal experiments (including several that have been underappreciated) and some ongoing data, all of which have propelled us to the conclusion that the strength of conditioned reinforcers is determined by their signaling a relative improvement in the organism's relation to reinforcement.
57
Stagner JP, Laude JR, Zentall TR. Sub-optimal choice in pigeons does not depend on avoidance of the stimulus associated with the absence of reinforcement. Learning and Motivation 2011. DOI: 10.1016/j.lmot.2011.09.001.
58
Bromberg-Martin ES, Hikosaka O. Lateral habenula neurons signal errors in the prediction of reward information. Nat Neurosci 2011; 14:1209-16. PMID: 21857659. PMCID: PMC3164948. DOI: 10.1038/nn.2902.
Abstract
Humans and animals have the ability to predict future events, which they cultivate by continuously searching their environment for sources of predictive information. However, little is known about the neural systems that motivate this behavior. We hypothesized that information-seeking is assigned value by the same circuits that support reward-seeking, such that neural signals encoding reward prediction errors (RPEs) include analogous information prediction errors (IPEs). To test this, we recorded from neurons in the lateral habenula, a nucleus that encodes RPEs, while monkeys chose between cues that provided different chances to view information about upcoming rewards. We found that a subpopulation of lateral habenula neurons transmitted signals resembling IPEs, responding when reward information was unexpectedly cued, delivered or denied. These signals evaluated information sources reliably, even when the monkey's decisions did not. These neurons could provide a common instructive signal for reward-seeking and information-seeking behavior.
Affiliation(s)
- Ethan S Bromberg-Martin
- Laboratory of Sensorimotor Research, National Eye Institute, US National Institutes of Health, Bethesda, Maryland, USA.
59
Zentall TR, Stagner JP. Sub-optimal choice by pigeons: failure to support the Allais paradox. Learning and Motivation 2011; 42:245-254. PMID: 21852887. DOI: 10.1016/j.lmot.2011.03.002.
Abstract
Pigeons show a preference for an alternative that provides them with discriminative stimuli (sometimes a stimulus that predicts reinforcement and at other times a stimulus that predicts the absence of reinforcement) over an alternative that provides them with nondiscriminative stimuli, even if the nondiscriminative stimulus alternative is associated with 2.5 times as much reinforcement (Stagner & Zentall, 2010). In Experiment 1 we found that the delay to reinforcement associated with the nondiscriminative stimuli could be reduced by almost one half before the pigeons were indifferent between the two alternatives. In Experiment 2 we tested the hypothesis that the preference for the discriminative stimulus alternative resulted from the fact that, like humans, the pigeons were attracted by the stimulus that consistently predicted reinforcement (the Allais paradox). When the probability of reinforcement associated with the discriminative stimulus that predicted reinforcement was reduced from 100% to 80%, the pigeons still showed a strong preference for the discriminative stimulus alternative. Thus, under these conditions, the Allais paradox cannot account for the sub-optimal choice behavior shown by pigeons. Instead, we propose that sub-optimal choice results from positive contrast between the low expectation of reinforcement associated with the discriminative stimulus alternative and the much higher obtained reinforcement when the stimulus associated with reinforcement appears. We propose that similar processes can account for sub-optimal gambling behavior by humans.
60
Abstract
Contrary to the law of effect and optimal foraging theory, pigeons show suboptimal choice behavior by choosing an alternative that provides 20% reinforcement over another that provides 50% reinforcement. They choose the 20% reinforcement alternative (in which, 20% of the time, that choice results in a stimulus that always predicts reinforcement and, 80% of the time, in another stimulus that predicts its absence) rather than the 50% reinforcement alternative, which results in one of two stimuli, each of which predicts reinforcement 50% of the time. This choice behavior may be related to suboptimal human monetary gambling behavior, because in both cases, the organism overemphasizes the infrequent occurrence of the winning event and underemphasizes the more frequent occurrence of the losing event.
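The expected values behind this preference are easy to verify; a minimal sketch in Python (the helper name is ours; the percentages are taken from the abstract):

```python
# Overall probability of reinforcement for each alternative in the
# suboptimal-choice procedure described above (illustrative values).

def expected_reinforcement(outcomes):
    """outcomes: list of (P(stimulus), P(reinforcement | stimulus)) pairs."""
    return sum(p_stim * p_reinf for p_stim, p_reinf in outcomes)

# Suboptimal alternative: 20% chance of a stimulus that always predicts food,
# 80% chance of a stimulus that never does.
suboptimal = expected_reinforcement([(0.2, 1.0), (0.8, 0.0)])

# Optimal alternative: one of two stimuli, each predicting food 50% of the time.
optimal = expected_reinforcement([(0.5, 0.5), (0.5, 0.5)])

# Pigeons nonetheless prefer the alternative with the lower overall probability.
```

The sketch makes the sense in which the choice is "suboptimal" explicit: the preferred alternative delivers reinforcement on only 20% of choices versus 50% for the alternative the pigeons reject.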
61
Fantino E, Silberberg A. Revisiting the role of bad news in maintaining human observing behavior. J Exp Anal Behav 2010; 93:157-70. PMID: 20885808. DOI: 10.1901/jeab.2010.93-157.
Abstract
Results from studies of observing responses have suggested that stimuli maintain observing owing to their special relationship to primary reinforcement (the conditioned-reinforcement hypothesis), and not because they predict the availability and nonavailability of reinforcement (the information hypothesis). The present article first reviews a study that challenges that conclusion and then reports a series of five brief experiments that provide further support for the conditioned-reinforcement view. In Experiments 1 through 3, participants preferred occasional good news (a stimulus correlated with reinforcement) or no news (a stimulus uncorrelated with reinforcement) to occasional bad news (a stimulus negatively correlated with reinforcement). In Experiment 4, bad news was preferred to no news when the absence of stimulus change following a response to the bad-news option was reliably associated with good news. When this association was weakened in Experiment 5, the results were intermediate. The results support the conclusion that information is reinforcing only when it is positive or useful. As required by the conditioned-reinforcement hypothesis, useless information does not maintain observing.
Affiliation(s)
- Edmund Fantino
- Department of Psychology-0109, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0109, USA.
62
Zentall TR. Maladaptive "gambling" by pigeons. Behav Processes 2011; 87:50-6. PMID: 21215301. DOI: 10.1016/j.beproc.2010.12.017.
Abstract
When humans buy a lottery ticket or gamble at a casino they are engaging in an activity that on average leads to a loss of money. Although animals are purported to engage in optimal foraging behavior, similar sub-optimal behavior can be found in pigeons. They show a preference for an alternative that is associated with a low probability of reinforcement (e.g., one that is followed by a red hue on 20% of the trials and then reinforcement, or by a green hue on 80% of the trials and then the absence of reinforcement) over an alternative that is associated with a higher probability of reinforcement (e.g., blue or yellow, each of which is followed by reinforcement 50% of the time). This effect appears to result from the strong conditioned reinforcement associated with the stimulus that is always followed by reinforcement. Surprisingly, although it is experienced four times as often, the stimulus that is never followed by reinforcement does not appear to result in significant conditioned inhibition (perhaps due to the absence of observing behavior). Similarly, human gamblers tend to overvalue wins and undervalue losses. Thus, this animal model may provide a useful analog to human gambling behavior, one that is free from the influence of human culture, language, social reinforcement, and other experiential biases that may influence human gambling behavior.
Affiliation(s)
- Thomas R Zentall
- Department of Psychology, University of Kentucky, Lexington, KY 40506-0044, United States.
63
Zentall TR, Stagner J. Maladaptive choice behaviour by pigeons: an animal analogue and possible mechanism for gambling (sub-optimal human decision-making behaviour). Proc Biol Sci 2010; 278:1203-8. PMID: 20943686. DOI: 10.1098/rspb.2010.1607.
Abstract
Consistent with human gambling behaviour but contrary to optimal foraging theory, pigeons showed maladaptive choice behaviour in experiment 1 by choosing an alternative that provided on average two food pellets over an alternative that provided a certain three food pellets. On 20 per cent of the trials, choice of the two-pellet alternative resulted in a stimulus that always predicted ten food pellets; on the remaining 80 per cent of the trials, the two-pellet alternative resulted in a different stimulus that always predicted zero food pellets. Choice of the three-pellet alternative always resulted in three food pellets. This choice behaviour mimics human monetary gambling in which the infrequent occurrence of a stimulus signalling the winning event (10 pellets) is overemphasized and the more frequent occurrence of a stimulus signalling the losing event (zero pellets) is underemphasized, compared with the certain outcome associated with not gambling (the signal for three pellets). In experiment 2, choice of the two-pellet alternative resulted in ten pellets with a probability of 20 per cent following presentation of either stimulus. Choice of the three-pellet alternative continued to result in three food pellets. In this case, the pigeons reliably chose the alternative that provided a certain three pellets over the alternative that provided an average of two pellets. Thus, in experiment 1, the pigeons were responding to obtain the discriminative stimuli signalling reinforcement and the absence of reinforcement, rather than to obtain the variability in reinforcement.
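The pellet arithmetic in the two experiments can be checked directly; a short sketch (variable names are ours, values from the abstract):

```python
# Expected pellets per choice in the procedure described above.

def expected_pellets(outcomes):
    """outcomes: list of (probability, number of pellets) pairs."""
    return sum(p * n for p, n in outcomes)

# In both experiments the "gamble" pays 10 pellets on 20% of trials, 0 on 80%.
gamble = expected_pellets([(0.2, 10), (0.8, 0)])
certain = expected_pellets([(1.0, 3)])

# The gamble's expected value (2 pellets) falls short of the certain 3 pellets
# in both experiments; only in experiment 1, where distinct stimuli reliably
# signal the winning and losing outcomes, do the pigeons choose the gamble.
```

Because the expected values are identical across experiments, the preference reversal isolates the discriminative stimuli, not reinforcement variability, as the controlling variable.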
Affiliation(s)
- Thomas R Zentall
- Department of Psychology, University of Kentucky, Lexington, KY 40506-0044, USA.
64
Beierholm UR, Dayan P. Pavlovian-instrumental interaction in 'observing behavior'. PLoS Comput Biol 2010; 6. PMID: 20838580. PMCID: PMC2936515. DOI: 10.1371/journal.pcbi.1000903.
Abstract
Subjects typically choose to be presented with stimuli that predict the existence of future reinforcements. This so-called ‘observing behavior’ is evident in many species under various experimental conditions, including if the choice is expensive, or if there is nothing that subjects can do to improve their lot with the information gained. A recent study showed that the activities of putative midbrain dopamine neurons reflect this preference for observation in a way that appears to challenge the common prediction-error interpretation of these neurons. In this paper, we provide an alternative account according to which observing behavior arises from a small, possibly Pavlovian, bias associated with the operation of working memory. The theory of Reinforcement Learning (RL) has been influential in explaining basic learning and behavior in humans and other animals, and in accounting for key features of the activity of dopamine neurons. However, perhaps due to this very success, paradigms that challenge RL are at a premium. One case concerns so-called ‘observing behavior’, in which, at least in some versions, animals elect to observe cues that are predictive of future rewarding outcomes, although the observations themselves have no direct behavioral relevance. In a recent experiment on observing, the activity of monkey dopaminergic neurons was also found to be incompatible with classic RL. However, as is often the case, this was a task that allowed for potential interactions from a secondary behavioral system in which responses are directly triggered by values. In this paper we show that a model incorporating a next order of refinement associated with such Pavlovian interactions can explain this type of observing behavior.
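The "prediction-error interpretation" of dopamine activity invoked here is commonly formalized as a temporal-difference error; a minimal illustrative sketch (this is the textbook formulation, not the paper's actual model):

```python
# Temporal-difference reward prediction error (RPE):
#   delta = r + gamma * V(s_next) - V(s_current)
# Positive when outcomes are better than predicted, negative when worse.

def td_error(reward, v_next, v_current, gamma=0.9):
    return reward + gamma * v_next - v_current

# An unexpected reward yields a positive error (classically, a phasic
# dopamine burst).
surprise = td_error(reward=1.0, v_next=0.0, v_current=0.0)

# A fully predicted reward that is omitted yields a negative error
# (classically, a pause in firing).
omission = td_error(reward=0.0, v_next=0.0, v_current=1.0)
```

Observing behavior challenges this account because cues with no direct behavioral relevance still attract responding, which is what the paper's Pavlovian bias term is introduced to capture.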
Affiliation(s)
- Ulrik R Beierholm
- Gatsby Computational Neuroscience Unit, University College London, London, United Kingdom.
65
Escobar R, Bruner CA. Observing responses and serial stimuli: searching for the reinforcing properties of the S-. J Exp Anal Behav 2009; 92:215-31. PMID: 20354600. DOI: 10.1901/jeab.2009.92-215.
Abstract
The control exerted by a stimulus associated with an extinction component (S-) on observing responses was determined as a function of its temporal relation with the onset of the reinforcement component. Lever pressing by rats was reinforced on a mixed random-interval extinction schedule. Each press on a second lever produced stimuli associated with the component of the schedule in effect. In Experiment 1 a response-dependent clock procedure that incorporated different stimuli associated with an extinction component of a variable duration was used. When a single S- was presented throughout the extinction component, the rate of observing remained relatively constant across this component. In the response-dependent clock procedure, observing responses increased from the beginning to the end of the extinction component. This result was replicated in Experiment 2, using a similar clock procedure but keeping the number of stimuli per extinction component constant. We conclude that the S- can function as a conditioned reinforcer, a neutral stimulus or as an aversive stimulus, depending on its temporal location within the extinction component.
66

67
Abstract
Seventeen pigeons were exposed to a three-key discrete-trial procedure in which a peck on the lit center key produced food if, and only if, the left keylight was lit. The center key was illuminated by a peck on the lit right key. Of interest was whether subjects pecked the right key before or after the response-independent onset of the left keylight. Pecks on the right key after left-keylight onset suggest control of behavior by the left keylight, an establishing stimulus. In three experiments, the strength of center-keylight onset as a conditioned reinforcer for a response on the right key was manipulated by altering the size of the reduction in time to food delivery correlated with its onset. Control of pigeons' key pecks by onset of the left keylight occurred on more trials per session when the center keylight was a relatively weak conditioned reinforcer and on fewer trials per session when the center keylight was a relatively strong conditioned reinforcer. Differences across conditions in the degree of control by onset of the establishing stimulus were greatest when changes in conditioned reinforcer strength occurred relatively frequently and were signaled. The results provide evidence of the function of an establishing stimulus.
68
Allen KD, Lattal KA. On conditioned reinforcing effects of negative discriminative stimuli. J Exp Anal Behav 1989; 52:335-9. PMID: 16812600. PMCID: PMC1339185. DOI: 10.1901/jeab.1989.52-335.
Abstract
Observing responses by pigeons were studied during sessions in which a food key and an observing key were available continuously. A variable-interval schedule and extinction alternated randomly on the food key. In one condition, food-key pecking during extinction decreased reinforcement frequency during the next variable-interval component, and in the other condition such pecking did not affect reinforcement frequency. Observing responses either changed both keylight colors from white to green (S+) or to red (S-) depending on the condition on the food key, or the observing responses never produced the S+ but produced the S- when extinction was in effect on the food key. Observing responses that produced only S- were maintained only when food-key pecking during extinction decreased reinforcement frequency in the subsequent variable-interval component. The red light conformed to conventional definitions of a negative discriminative stimulus, rendering results counter to previous findings that production of S- alone does not maintain observing. Rather than offering support for an informational account of conditioned reinforcement, the results are discussed in terms of a molar analysis to account for how stimuli acquire response-maintaining properties.
69
Shull RL, Mellon RC, Sharp JA. Delay and number of food reinforcers: effects on choice and latencies. J Exp Anal Behav 1990; 53:235-46. PMID: 16812609. PMCID: PMC1323009. DOI: 10.1901/jeab.1990.53-235.
Abstract
Pigeons were given a choice between two identical-duration situations (terminal links of chain schedules). One terminal link of the choice pair provided two food deliveries, and the other provided five. The exact times of these food deliveries differed between the terminal links and were varied over conditions. A single response during the initial link gave immediate access to the corresponding terminal link. Forced trials, during which only one of the initial-link keys was lighted, were interspersed with choice trials during which both initial-link keys were lighted. Choice tended to favor whichever terminal link was correlated with the higher sum of the immediacies (i.e., the sum of the reciprocals of the delays to each of the reinforcers following the choice, with all delays measured from the choice). Latencies on forced trials and on choice trials also were related (negatively) to the sum of the immediacies. This correlation among response measures (choice and latencies) suggests that both measures are manifestations of the effect of conditioned reinforcement on response tendencies.
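The choice rule described, preference tracking the sum of the reciprocal delays, can be sketched as follows (the delay values below are hypothetical; only the measure itself comes from the abstract):

```python
# Value of a terminal link as the "sum of the immediacies": the sum of the
# reciprocals of the delays, all timed from the choice, to each food delivery.

def sum_of_immediacies(delays_s):
    """delays_s: delays, in seconds, from the choice to each reinforcer."""
    return sum(1.0 / d for d in delays_s)

# Hypothetical terminal links of equal total duration:
two_early = sum_of_immediacies([5, 60])               # two reinforcers, first early
five_late = sum_of_immediacies([30, 37, 44, 51, 58])  # five reinforcers, all late

# A link with fewer but earlier reinforcers can carry the higher sum of
# immediacies, and by this rule would be both chosen more often and
# responded to with shorter latencies.
```

The example illustrates why choice in this procedure need not track reinforcer number: reciprocal weighting makes early deliveries dominate the sum.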
70
Perone M, Kaminski BJ. Conditioned reinforcement of human observing behavior by descriptive and arbitrary verbal stimuli. J Exp Anal Behav 1992; 58:557-75. PMID: 16812679. PMCID: PMC1322102. DOI: 10.1901/jeab.1992.58-557.
Abstract
College students earned monetary reinforcers by pressing a key according to a compound schedule with variable-interval and extinction components. Pressing additional keys occasionally produced displays of either of two verbal stimuli; one was uncorrelated with the schedule components, and the other was correlated with the extinction component. In Experiments 1 and 2, the display area of the apparatus was blank unless an observing key was pressed, whereupon a descriptive message appeared. Most students preferred an uncorrelated stimulus stating that "Some of this time scores are TWICE AS LIKELY as normal, and some of this time NO SCORES can be earned" over a stimulus stating that "At this time NO SCORES can be earned." In Experiment 3, the display area indicated that "The Current Status of the Program is: NOT SHOWN." Presses on the observing keys replaced this message with stimuli that provided arbitrary labels for the schedule conditions. All of the students preferred a stimulus stating that "The Current Status of the Program is: B" over an uncorrelated stimulus stating that "The Current Status of the Program is: either A or B." Thus, under some circumstances, observing was maintained by a stimulus correlated with extinction, a finding that poses a challenge for Pavlovian accounts of conditioned reinforcement. Differences in the maintenance of observing by the descriptive and arbitrary stimuli may be attributed to differences in either the strength or nature of the instructional control exerted by the verbal stimuli.
71
Dinsmoor JA, Bowe CA, Green L, Hanson J. Information on response requirements compared with information on food density as a reinforcer of observing in pigeons. J Exp Anal Behav 1988; 49:229-37. PMID: 16812538. PMCID: PMC1338809. DOI: 10.1901/jeab.1988.49-229.
Abstract
On a variable-interval schedule, pecking the key to the pigeon's right (observing response) produced red or green displays relating to the delivery of grain and its dependence on pecking the key to the left (food key). During various blocks of sessions, mixed (no stimulus change) schedules including the following pairs of components were temporarily converted by the observing response to their corresponding multiple (correlated stimuli) schedules: variable-interval 60-s, extinction; variable-interval 60-s, variable-time (response-independent) 60-s; extinction, variable-time 60-s. Differences in food delivery maintained substantial rates of responding on the observing key, without regard to pecking requirements on the food key. Although stimuli correlated with differences in the response requirement on the food key maintained higher observing rates than those maintained by uncorrelated stimuli, they were much lower than those based on food. The value of predictive stimuli as reinforcers is determined by the value of the events predicted. In particular, the cost of pecking appears to be low, and this may place limitations on the applicability of energy-based and economic models of behavior.
72
Critchfield TS, Perone M. Verbal self-reports of delayed matching to sample by humans. J Exp Anal Behav 1990; 53:321-44. PMID: 16812611. PMCID: PMC1322961. DOI: 10.1901/jeab.1990.53-321.
Abstract
Undergraduates participated in two experiments to develop methods for the experimental analysis of self-reports about behavior. The target behavior was the choice response in a delayed-matching-to-sample task in which monetary reinforcement was contingent upon both speed and accuracy of the choice. In Experiment 1, the temporal portion of the contingency was manipulated within each session, and the presence and absence of feedback about reinforcement was manipulated across sessions. As the time limits became stricter, target response speeds increased, but accuracy and reinforcement rates decreased. When feedback was withheld, further reductions in speed and reinforcement occurred, but only at the strictest time limit. Thus, the procedures were successful in producing systematic variation in the speed, accuracy, and reinforcement of the target behavior. Experiment 2 was designed to assess the influence of these characteristics on self-reports. In self-report conditions, each target response was followed by a computer-generated query: "Did you earn points?" The subject reported by pressing "Yes" or "No" buttons, with the sole consequence of advancing the session. In some cases, feedback about reinforcement of the target response followed the reports; in other cases it was withheld. Self-reports were less accurate when the target responses occurred under greater time pressure. When feedback was withheld, the speed of the target response influenced reports, in that the probability of a "Yes" report increased directly with the speed of accurate target responses. In addition, imposing the self-report procedure disrupted target performance by reducing response speeds at the strictest time limit. These results allow investigation of issues in both behavioral and cognitive psychology. More important, the overall order in the data suggests promise for the experimental analysis of self-reports by human subjects.
73
Mace FC. Basic research needed for stimulating the development of behavioral technologies. J Exp Anal Behav 1994; 61:529-50. PMID: 16812734. PMCID: PMC1334438. DOI: 10.1901/jeab.1994.61-529.
Abstract
The costs of disconnection between the basic and applied sectors of behavior analysis are reviewed, and some solutions to these problems are proposed. Central to these solutions are collaborations between basic and applied behavioral scientists in programmatic research that addresses the behavioral basis and solution of human behavior problems. This kind of collaboration parallels the deliberate interactions between basic and applied researchers that have proven to be so profitable in other scientific fields, such as medicine. Basic research questions of particular relevance to the development of behavioral technologies are posed in the following areas: response allocation, resistance to change, countercontrol, formation and differentiation/discrimination of stimulus and response classes, analysis of low-rate behavior, and rule-governed behavior. Three interrelated strategies to build connections between the basic and applied analysis of behavior are identified: (a) the development of nonhuman animal models of human behavior problems using operations that parallel plausible human circumstances, (b) replication of the modeled relations with human subjects in the operant laboratory, and (c) tests of the generality of the model with actual human problems in natural settings.
|
74
|
Abstract
If the functional relations governing the strength of a conditioned reinforcer correspond to those obtained with other Pavlovian procedures (e.g., Kaplan, 1984), the termination of stimuli appearing early in the interval between successive food deliveries should be reinforcing. During initial training we presented four key colors, followed by food, in a recurrent sequence to each of 6 pigeons. This established a baseline level of autoshaped pecking. In later sessions, we terminated each of these colors or only the first color for a brief period following each peck, replacing the original color with a standard substitute to avoid darkening the key. Pecking decreased in the presence of the last color in the sequence but increased in the presence of the first. In accord with contemporary models of Pavlovian conditioning, these and other data suggest that the behavioral effects of stimuli in a chain may be better understood in terms of what each stimulus predicts, as measured by relative time to the terminal reinforcer, than in the exclusively positive terms of the traditional formulation (Skinner, 1938). The same model may also account for the initial pause under fixed-interval and fixed-ratio schedules of reinforcement.
|
75
|
Mueller KL, Dinsmoor JA. The effect of negative stimulus presentations on observing-response rates. J Exp Anal Behav 2010; 46:281-91. [PMID: 16812463 PMCID: PMC1348267 DOI: 10.1901/jeab.1986.46-281] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/25/2022]
Abstract
Theories of observing differ in predicting whether or not a signal for absence of reinforcement (S-) is capable of reinforcing observing responses. Experiments in which S- was first removed from and then restored to the procedure have yielded mixed results. The present experiments suggest that failure to control for the direct effect of presenting S- may have been responsible. Pigeons and operant procedures were used. Experiment 1 showed that presentations of S-, even when not contingent on observing, can raise the rate of an observing response that was reinforced only by presentations of a signal (S+) that accompanied a schedule of food delivery. Experiment 2 showed that this effect resulted from bursts of responding that followed offsets of S-. Experiment 3 showed that, when the presence of S- was held constant, lower rates occurred when S- was dependent on, rather than independent of, observing. These results support theories that characterize S- as incapable of reinforcing observing responses.
|
76
|
Silberberg A, Fantino E. Observing responses: maintained by good news only? Behav Processes 2010; 85:80-2. [PMID: 20542098 DOI: 10.1016/j.beproc.2010.06.002] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/03/2010] [Revised: 05/24/2010] [Accepted: 06/02/2010] [Indexed: 11/25/2022]
Abstract
Observing responses are those that produce stimuli correlated with the availability (S+) or non-availability (S-) of reinforcement but that have no influence on the actual delivery or timing of reinforcement. Prior research has shown that observing is maintained by the occasional production of the S+ ("good news") and not by production of the equally informative S- ("bad news"). However, for both humans and rats the S- maintains observing when it is at least implicitly correlated with good news. In the present study, pigeons could obtain both good and bad news by responding during the appropriate key color. In one condition, the bad news was actually more informative about reinforcement than was the good news. Nevertheless, a preponderance of the birds' responses was made on the nominally good-news option. The present results offer further support for the central role of good news in maintaining observing responses and are entirely consistent with the traditional conditioned-reinforcement (or classical conditioning) interpretation of observing.
|
77
|
Shahan TA. Conditioned reinforcement and response strength. J Exp Anal Behav 2010; 93:269-89. [PMID: 20885815 PMCID: PMC2831656 DOI: 10.1901/jeab.2010.93-269] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/21/2009] [Accepted: 11/06/2009] [Indexed: 10/19/2022]
Abstract
Stimuli associated with primary reinforcers appear themselves to acquire the capacity to strengthen behavior. This paper reviews research on the strengthening effects of conditioned reinforcers within the context of contemporary quantitative choice theories and behavioral momentum theory. Based partially on the finding that variations in parameters of conditioned reinforcement appear not to affect response strength as measured by resistance to change, long-standing assertions that conditioned reinforcers do not strengthen behavior in a reinforcement-like fashion are considered. A signposts or means-to-an-end account is explored and appears to provide a plausible alternative interpretation of the effects of stimuli associated with primary reinforcers. Related suggestions that primary reinforcers also might not have their effects via a strengthening process are explored and found to be worthy of serious consideration.
Affiliation(s)
- Timothy A Shahan
- Department of Psychology, 2810 Old Main Hill, Utah State University, Logan, UT 84322, USA.
|
89
|
Rationalist versus empirical approaches to observing and conditioned reinforcement: The (so-called) preference-for-signaled-shock. Behav Brain Sci 2010. [DOI: 10.1017/s0140525x00023086] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/05/2022]
|
95
|
Abstract
Behaving organisms are continually choosing. Recently the theoretical and empirical studies of decision making by behavioral ecologists and experimental psychologists have converged in the area of foraging, particularly food acquisition. This convergence has raised the interdisciplinary question of whether principles that have emerged from the study of decision making in the operant conditioning laboratory are consistent with decision making in naturally occurring foraging. One such principle, the "parameter-free delay-reduction hypothesis," developed in studies of choice in the operant conditioning laboratory, states that the effectiveness of a stimulus as a reinforcer may be predicted most accurately by calculating the decrease in time to food presentation correlated with the onset of the stimulus, relative to the length of time to food presentation measured from the onset of the preceding stimulus. Since foraging involves choice, the delay-reduction hypothesis may be extended to predict aspects of foraging. We discuss the strategy of assessing parameters of foraging with operant laboratory analogues to foraging. We then compare the predictions of the delay-reduction hypothesis with those of optimal foraging theory, developed by behavioral ecologists, showing that, with two exceptions, the two positions make comparable predictions. The delay-reduction hypothesis is also compared to several contemporary psychological accounts of choice.
Results from several of our experiments with pigeons, designed as operant conditioning simulations of foraging, have shown the following: The more time subjects spend searching for or traveling between potential food sources, the less selective they become, that is, the more likely they are to accept the less preferred outcome; increasing time spent procuring ("handling") food increases selectivity; how often the preferred outcome is available has a greater effect on choice than how often the less preferred outcome is available; subjects maximize reinforcement whether it is the rate, amount, or probability of reinforcement that is varied; there are no significant differences between subjects performing under different types of deprivation (open vs. closed economies). These results are all consistent with the delay-reduction hypothesis. Moreover, they suggest that the technology of the operant conditioning laboratory may have fruitful application in the study of foraging, and, in doing so, they underscore the importance of an interdisciplinary approach to behavior.
|