Animal Models of Drug Addiction

Back to Psychopharmacology - The Fourth Generation of Progress

George F. Koob

DEFINITIONS AND VALIDATION OF ANIMAL MODELS
Definitions of Drug Addiction

Two characteristics are common to definitions of dependence and addiction: a compulsion to take the drug with a loss of control in limiting intake and a withdrawal syndrome that results in physical as well as motivational signs of discomfort when the drug is removed. The concept of reinforcement or motivation is a crucial part of both of these characteristics. A reinforcer can be defined operationally as "any event that increases the probability of a response." The same definition can also serve for reward, and the two words are often used interchangeably. However, reward often connotes some additional emotional value such as pleasure.

Most models and definitions of drug dependence also involve the development of tolerance and dependence, which appear to develop and decay with a similar time course. The concepts of tolerance and dependence are integral parts of the hypothesis that adaptive processes are initiated to counter the acute effects of a drug. These processes persist long after the drug has cleared from the brain, thus leaving opposing processes unopposed during abstinence. Such conceptualizations have been explored at all levels of drug-dependence research from the behavioral to the molecular (40). Motivational hypotheses, involving central nervous system "counter-adaptive changes" (70), have been generated that have particular relevance to dependence phenomena.

Multiple sources of reinforcement can be identified during the course of drug dependence. Based on Wikler's extensive work with opiate drugs and his innovative conceptualizations about dependence (70), the primary pharmacological effect of a drug was hypothesized to produce a direct motivational effect through positive or negative reinforcement (e.g., self-medication) and/or an indirect motivational effect through drug-engendered dependence (relief from aversive abstinence signs). The secondary pharmacological effects of the drug can also have motivating properties. Again, direct effects can be obtained through conditioned reinforcement (e.g., pairing of previously neutral stimuli with acute reinforcing effects of drugs), and indirect effects through removal of the conditioned negative reinforcing effects of conditioned abstinence. Recently, attempts have been made to explore the neurobiological bases for both the acute positive reinforcing effects of drugs and also the negative reinforcing effects imparted by the dependent state (42) (see also Behavioral Techniques in Preclinical Neuropsychopharmacology Research, Intracellular Messenger Pathways as Mediators of Neural Plasticity, Genetic Strategies in Preclinical Substance Abuse Research, and Autism and Pervasive Developmental Disorders, Tic Disorders, Childhood Anxiety Disorders, Cocaine, Caffeine-A Drug of Abuse?, Chronic Amphetamine Use and Abuse, Pathophysiology of Tobacco Dependence, and Opioids).

Validation of Animal Models of Addiction

An animal model can be viewed as an experimental preparation developed for the purpose of studying phenomena found in humans. Animal models are constructed to study selected parts of human syndromes (48). Two criteria appear to be necessary and sufficient for validating an animal model: reliability and predictive validity (see Animal Models of Psychiatric Disorders). Where possible, the animal models discussed below are evaluated in terms of these two criteria.

ANIMAL MODELS FOR THE POSITIVE REINFORCING PROPERTIES OF DRUGS

In recent conceptualizations of drug reinforcement, the positive reinforcing properties of drugs have been thought to play an important role in drug dependence (70, 74). It is amply clear that animals and humans will readily self-administer drugs in the nondependent state and that drugs have powerful reinforcing properties in that animals will perform many different tasks to obtain drugs. The drugs that have positive reinforcing effects correspond well with the drugs that have high abuse potential in humans (7, 10, 35, 43, 59, 60) (see Table 1). Much earlier work focused on operant paradigms in primates; however, studies in the last few years have illustrated that many of these same paradigms can be utilized in rodent models, and the new work in rodent models has provided a major benefit to studies focusing on the neurobiology of addiction (40, 41).

Operant Intravenous Drug Self-Administration

Drugs of abuse are readily self-administered intravenously by animals, and, in general, drugs that are self-administered correspond to those that have high abuse potential (10, 60). Indeed, this relationship is so strong that intravenous drug self-administration is considered an animal model that is predictive of abuse potential (10) and has been proposed as part of a battery for the preclinical assessment of the abuse liability of new agents (35).

Some typical patterns of cocaine self-administration in a rat maintained on a simple fixed-ratio schedule are shown in Fig. 1. Within the range of doses that maintain stable responding, animals increase their self-administration rate as the unit dose is decreased, apparently compensating for decreases in the unit dose. Conversely, animals reduce their self-administration rate as the unit dose is increased. Thus, pharmacological manipulations that increase the self-administration rate on this fixed-ratio schedule resemble decreases in the unit dose, causing a shift to the right of the dose–effect function (a decrease in the reinforcing potency of cocaine). As would be predicted by the unit dose–response model, low to moderate doses of dopamine receptor antagonists increase cocaine self-administration maintained on this schedule in a manner similar to decreasing the unit dose of cocaine (Fig. 1), suggesting that partial blockade of dopamine receptors by competitive antagonists reduces the reinforcing potency of cocaine. Conversely, dopamine agonists decrease cocaine self-administration in a manner similar to increasing the unit dose of cocaine, suggesting that the effects of dopamine agonists together with cocaine self-administration can be additive, perhaps due to their mutual activation of the same neural substrates (5).

Schedules of Reinforcement

The use of different schedules of reinforcement in intravenous self-administration can provide important control manipulations for nonspecific motor and motivational actions. For example, fixed-interval schedules of self-administration can be designed to measure response rate independently of frequency of reinforcement. An extended discussion of these schedules of reinforcement can be found elsewhere (25, 30, 36, 37, 74; see also Behavioral Techniques in Preclinical Neuropsychopharmacology Research).

Simple Schedules

In a simple fixed-ratio schedule, the number of responses required for an infusion of drug is set at a fixed number. In rats, these fixed-ratio schedules will generally not maintain stable responding below a certain unit dose, and, within the range of doses that do maintain stable responding, the self-administration rate is inversely related to dose.

Second-order Schedules

Second-order schedules are different from fixed-ratio schedules with regard to the unit dose–response function. In a second-order schedule, completion of an individual component (or part) of the schedule produces the terminal event (drug infusion) according to another overall schedule (25). In contrast to simple fixed-ratio schedules, response rates in second-order schedules have been shown to increase with increasing drug doses (36, 37). Further increases in dose lead to a decrease or leveling off of response rates and a sigmoidal or inverted-U-shaped dose–response function. Dose–response relationships have been observed in dose preference measures in both animals and humans where higher doses are preferred over lower doses (17, 34). However, it should be noted that, although there is generally a good correlation between cocaine self-administration and its subjective "positive" and stimulant effects, there are areas where these measures are different (17). For example, humans will self-administer doses that do not produce subjective effects (17).

Such inverted-U-shaped unit dose–response functions are sensitive to pharmacological manipulations. In a recent study of cocaine self-administration in squirrel monkeys, a complete inverted-U-shaped unit dose–response function was entirely shifted to the right by dopamine receptor antagonists in some monkeys (2). This study elegantly demonstrated that dopamine receptor antagonists increase or decrease cocaine-maintained behavior depending upon the unit dose, but both alterations represent an attenuation of the effects of self-administered cocaine by these agents.

Multiple Schedules

A procedure controlling for nonselective effects of treatments on drug reinforcement is to incorporate self-administration into a multiple component schedule with other reinforcers. Behavior maintained by food or cocaine alternately in the same test session and with identical reinforcement requirements has been reported for various species (6). These schedules may be used to evaluate the selectivity of manipulations that apparently reduce the reinforcing efficacy of cocaine.

Progressive Ratio Schedules

Progressive ratio schedules have been used to evaluate the reinforcing efficacy of the self-administered drug by increasing the response requirements for each successive reinforcement and determining the breaking point, the point at which the animal will no longer respond (30, 56). A variety of evidence supports the hypothesis that this schedule is effective in determining the rank-order reinforcing effectiveness for different reinforcers including drugs. Increasing the unit dose of self-administered drugs increases the breaking point on a progressive ratio schedule (30, 56), and dopamine receptor antagonists have been shown to decrease the breaking point for cocaine self-administration (56).
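As a minimal sketch of how a breaking point might be scored, the example below walks through an escalating list of response requirements and reports the last requirement the animal completed. The progression values, the function name `breaking_point`, and the response totals are hypothetical and chosen only for illustration; the sketch also assumes that the criterion for deciding the animal has stopped responding has already been applied when the total is counted.

```python
from typing import Sequence


def breaking_point(requirements: Sequence[int], total_responses: int) -> int:
    """Return the last response requirement completed in a session.

    requirements    -- escalating responses required for each successive
                       infusion (hypothetical progression, not from the text)
    total_responses -- responses emitted before the animal stopped responding
    """
    completed = 0
    remaining = total_responses
    for ratio in requirements:
        if remaining < ratio:
            break  # the animal quit before finishing this requirement
        remaining -= ratio
        completed = ratio
    return completed


# Hypothetical progression of response requirements.
requirements = [1, 2, 4, 6, 9, 12, 15, 20, 25, 32]

# Hypothetical sessions: more total responding yields a higher breaking point.
low_dose_bp = breaking_point(requirements, 40)
high_dose_bp = breaking_point(requirements, 100)
print(low_dose_bp, high_dose_bp)
```

In this illustration a higher breaking point is read as greater reinforcing efficacy, in keeping with the rank-order use of the schedule described above.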

Intravenous drug self-administration in animals has both reliability and predictive validity. The dependent variable is very reliable as a measure of the motivation to obtain drugs (the amount of work an animal will perform to obtain the drug) or, in an alternative framework, in demonstrating that drugs are powerful reinforcers. When assessed with the appropriate operant schedules, the motivation to obtain drugs, or their reinforcing efficacy, changes with the type of drug, the dose, and the induction of drug dependence. Performance maintained by drugs as reinforcers is stable from session to session and can be altered predictably by drug antagonists.

Intravenous drug self-administration also has predictive validity, because drugs and doses having high reinforcement potential in animals are reported to have reinforcing effects in humans as measured by both operant and subjective reports (17, 35, 44, 60). The correspondence between subjective reports of euphoria and operant responding for drugs in humans is not perfect, but then subjective reports are clearly under different contingencies than drug-seeking behaviors (17, 44). Intravenous self-administration of a given drug is also predictive of abuse potential because of the high correspondence between the ability of various drugs to support operant responding for intravenous injection and their abuse by humans (10).

Brain Stimulation Reward

Electrical self-stimulation of certain brain areas is rewarding for animals and humans as demonstrated by the fact that subjects will readily self-administer the stimulation (52). The powerful nature of the reward effect produced by intracranial self-stimulation (ICSS) is indicated by the behavioral characteristics of the ICSS response, which include rapid learning and vigorous execution of the stimulation-producing behavior (for a review, see ref. 19). The high reward value of ICSS has led to the hypothesis that ICSS directly activates neuronal circuits that are activated by conventional reinforcers (for example, food, water, sex). In bypassing much of the input side of these neuronal circuit(s), ICSS provides a unique tool in neuropharmacological research to investigate the influence of various substances on reward and reinforcement processes. Intracranial self-stimulation differs significantly from drug self-administration in that, in this procedure, the animal is working to directly stimulate presumed reinforcement circuits in the brain and the effects of the drugs are assessed on these reward thresholds. Drugs of abuse decrease thresholds for ICSS, and there is a good correspondence between the ability of drugs to decrease ICSS thresholds and their abuse potential (43).

Many ICSS procedures have been developed over the years (for a review, see ref. 63), but an important methodological advance has been the development of procedures to provide a valid measure of reward threshold unconfounded by influences on motor and performance capability. Two ICSS procedures that have been used extensively to measure the changes in reward threshold produced by drugs are the rate-frequency, curve-shift procedure and the discrete-trial, current-intensity procedure (43, 47). These two procedures are widely used in ICSS research because they have been validated experimentally, but other valid ICSS procedures are available. (For a review of the procedures and a detailed description of the methodology employed in the rate-frequency and discrete-trial procedures, see ref. 47.)

Rate–Frequency Procedure

The rate–frequency procedure involves the generation of a stimulation input–output function and provides a frequency threshold measure (19, 50). Rate–frequency curves are collected by allowing the rats to press a lever for an ascending series of pulse frequency stimuli, delivered through an electrode in the medial forebrain bundle or other rewarding brain site. A runway apparatus can also be used with running speed as the dependent measure (19). Frequencies can also be presented in a descending or random order or in alternating descending and ascending series and are changed in 0.05 or 0.1 log-unit steps. Two measures are obtained.

The locus of rise (LOR) refers to the "location" (that is, the frequency) at which the function rises from zero to an arbitrary criterion-of-performance level and is presumed to be a measure of ICSS reward threshold (19). The most frequently used criterion is 50% of maximal rate. The behavioral maximum (MAX) measure is the asymptotic maximal response rate, and changes in MAX are thought to reflect motor or performance effects.

Validation studies indicate that changes in the reward efficacy of the stimulation (i.e., intensity manipulations) shift the rate–frequency functions laterally, which translates into large changes in the LOR value but produces no alterations in the asymptote or in the shape of the function (19). In contrast, performance manipulations (for example, weight on the lever, curare, etc.), including changes in motivation (that is, priming), alter the MAX value and the shape of the function (19, 50). However, effects of a manipulation on the LOR that are smaller than 0.2 log-units must be interpreted carefully because studies indicate that the effects of performance manipulations on the LOR can be as high as 0.2 log-units (19).
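The two measures can be illustrated with a short sketch that takes a rate–frequency curve, reports MAX as the highest observed response rate, and estimates the LOR as the log frequency at which the curve first reaches 50% of MAX. Linear interpolation between adjacent points, the function name `locus_of_rise`, and all of the data values are assumptions made for illustration; published procedures may fit the curve differently.

```python
import math


def locus_of_rise(freqs_hz, rates, criterion=0.5):
    """Return (LOR in log10 Hz, MAX) for one rate-frequency curve.

    The LOR is found by linear interpolation (an assumption of this sketch)
    at the point where response rate first reaches criterion * MAX.
    """
    pairs = sorted(zip(freqs_hz, rates))
    max_rate = max(rate for _, rate in pairs)
    target = criterion * max_rate
    for (f_lo, r_lo), (f_hi, r_hi) in zip(pairs, pairs[1:]):
        if r_lo < target <= r_hi:
            frac = (target - r_lo) / (r_hi - r_lo)
            log_lo, log_hi = math.log10(f_lo), math.log10(f_hi)
            return log_lo + frac * (log_hi - log_lo), max_rate
    raise ValueError("response rate never crosses the criterion")


# Hypothetical curves sampled in 0.1 log-unit steps of pulse frequency.
log_freqs = [1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0]
freqs = [10 ** x for x in log_freqs]
baseline_rates = [0, 0, 0, 0, 10, 30, 50, 60, 60]
drug_rates = [0, 10, 30, 50, 60, 60, 60, 60, 60]

lor_base, max_base = locus_of_rise(freqs, baseline_rates)
lor_drug, max_drug = locus_of_rise(freqs, drug_rates)
shift = lor_drug - lor_base  # negative values indicate a leftward shift

# 0.2 log-units is the caution level mentioned in the text for LOR changes.
exceeds_caution_level = abs(shift) > 0.2
print(round(lor_base, 2), round(lor_drug, 2), round(shift, 2), max_base, max_drug)
```

In this made-up example the curve moves laterally while MAX is unchanged, which is the pattern the text associates with a change in reward efficacy and not with a performance effect.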

Discrete-trial Threshold Procedure

The discrete-trial procedure is a modification of the classical psychophysical method of limits and provides a current-intensity-threshold measure (43, 47). This procedure consists of a series of discrete trials in which the subject is expected to emit a single response to receive the electrical stimulus, the current intensity of which is varied between trials. At the start of each trial, rats receive a noncontingent, experimenter-administered electrical stimulus and then have 7.5 sec to turn a wheel manipulandum to obtain a contingent stimulus identical to the previously delivered noncontingent stimulus (positive reinforcer) (see refs. 43 and 47 for details). The threshold value is defined as the midpoint in microamperes between the current intensity level at which the animal makes two or more positive responses out of the three stimulus presentations and the level at which the animal makes fewer than two positive responses at two consecutive intensities. Response latency is defined as the time in seconds that elapses between the delivery of the noncontingent electrical stimulus (end of the stimulus) and the animal's response on the wheel.

Again, lowering the thresholds can be interpreted as an increase in the reward value of the stimulation, whereas increases in threshold reflect decreases in reward value. Increases in response latency can be interpreted as a motor or performance deficit, but decreases in response latencies in the procedure are difficult to induce, because response latencies are already very short (1.5 to 2.0 sec) under control conditions. In general, the discrete-trial threshold procedure is designed to minimize behavioral response requirements. Therefore, the procedure is not expected to be sensitive to manipulations that induce motor and performance deficits (47).
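The threshold definition given above can be sketched for a single descending series of current intensities. This is one possible reading of the definition, not the published algorithm: it assumes three presentations per intensity, treats an intensity as positive when two or more responses occur, and takes the threshold as the midpoint between the last positive intensity and the first of two consecutive negative intensities. The function name `descending_threshold` and the data are hypothetical.

```python
def descending_threshold(series):
    """Estimate a threshold (microamperes) from one descending series.

    series -- list of (intensity_uA, positive_responses_out_of_3), ordered
              from the highest to the lowest intensity.
    Returns None if the stopping rule is never met.
    """
    last_positive = None
    for i, (intensity, positives) in enumerate(series):
        if positives >= 2:
            last_positive = intensity
            continue
        # Negative intensity: is the next (lower) intensity also negative?
        has_next = i + 1 < len(series)
        if has_next and series[i + 1][1] < 2 and last_positive is not None:
            return (last_positive + intensity) / 2
    return None


# Hypothetical series in 5-microampere steps.
clean_series = [(120, 3), (115, 3), (110, 2), (105, 1), (100, 0)]
# An isolated miss at 115 does not end the series because 110 is positive.
noisy_series = [(120, 3), (115, 1), (110, 2), (105, 0), (100, 1)]

clean_threshold = descending_threshold(clean_series)
noisy_threshold = descending_threshold(noisy_series)
print(clean_threshold, noisy_threshold)
```

A lower value returned after drug treatment than under control conditions would be read, as in the text, as an increase in the reward value of the stimulation.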

Place Preference

Place preference, or place conditioning, is a procedure, not explicitly operant, that has been used to assess the reinforcing efficacy of drugs by what is, in effect, Pavlovian conditioning. In a simple version of the place preference paradigm, animals experience two distinct neutral environments that are subsequently paired spatially and temporally with distinct drug states (the unconditioned stimuli, UCS). The animal is later given an opportunity to choose to enter and explore either environment, and the time spent in either environment is considered an index of the reinforcing value of the drug (the UCS). The animal's choice to spend more time in an environment is assumed to be an expression of the positive reinforcing experience within that environment. With a positive reinforcing UCS, the previously neutral stimuli become secondary positive reinforcers. Of course, the opposite can also occur: stimuli paired with an aversive experience become secondary negative reinforcers (see below). Perhaps one of the earliest demonstrations of place preference was the observation by Olds and Milner (52) that rats stimulated through an intracranial electrode would return to the location in which they received the stimulation.

Two-choice Procedures

The simple version of the place-conditioning paradigm involves allowing an animal to freely explore and experience two distinct environments for 10 to 20 min to obtain a pretest preference. Subsequently, the animals are restricted to one of the environments under the drug condition and the other environment under the nondrug or placebo condition. Subsequent posttraining tests are performed in the drug-free state, again with a choice of freely exploring both environments (for details see ref. 69).

A number of critical independent and dependent variables can affect place conditioning dramatically. Dependent variables include the duration of the posttraining testing, the method for calculating preference (difference score, percentage of pretraining, etc.), and the actual measures used (number of entries, mean duration of time per entry). A critical independent variable is the use of a biased or unbiased training schedule. In the biased design, animals are first tested for their baseline preference, and the animals may show a significant preference for a given environment. Drug pairings are then made during training with the less-preferred environment. Clearly, such a design could produce, instead of a true place preference, simply a reversal of a place aversion. In the unbiased design, a manipulation of the stimuli comprising the environments is made such that there is no preference. Other independent variables are numerous and include housing of the animals, age of the animals, familiarity with the training environment, and physical structure of the training and testing environment. Detailed discussion of these issues is beyond the scope of this chapter (for more information see refs. 7, 65, 68 and 69).
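Because the method for calculating preference is itself a variable, it may help to see two of the scores side by side. The sketch below computes a difference score and a percentage-of-pretraining score from times spent in the two environments. The exact formulas, the function name `preference_scores`, and the times are assumptions for illustration, since laboratories define these scores in different ways.

```python
def preference_scores(pre_drug_side_s, post_drug_side_s, post_vehicle_side_s):
    """Return (difference score, percentage of pretraining).

    pre_drug_side_s     -- pretest time (s) in the environment later paired with drug
    post_drug_side_s    -- posttest time (s) in the drug-paired environment
    post_vehicle_side_s -- posttest time (s) in the vehicle-paired environment
    """
    # Difference score: drug-paired minus vehicle-paired time at posttest.
    difference = post_drug_side_s - post_vehicle_side_s
    # Percentage of pretraining: posttest time relative to pretest time
    # in the same (drug-paired) environment.
    percent_of_pre = 100.0 * post_drug_side_s / pre_drug_side_s
    return difference, percent_of_pre


# Hypothetical animal tested in 15-min (900-s) sessions.
difference, percent_of_pre = preference_scores(400, 560, 340)
print(difference, percent_of_pre)
```

Note that the percentage score depends on the pretest value, which is why a biased design (pairing drug with the initially less-preferred side) can inflate it.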

Multiple-choice Procedures

The multiple-choice procedure simply adds additional environmental choices, such as three distinct environments (31), or multiple spatial locations, such as on an elevated radial maze (49). In either case, an additional choice allows for additional controls for nonspecific effects and permits easier balancing between two locations being used for subsequent pairings. (For details of two types of apparatus and procedures, see refs. 49 and 66.)

Drug Discrimination

Drug discrimination in animals is based on the hypothesis that the same components of a drug's action subserve discriminative stimulus effects in animals and subjective effects in humans (32). Even more importantly, the similarity of the subjective effects of a given drug to the subjective effects produced by a known drug of abuse such as amphetamine can predict abuse potential. Drug discrimination procedures developed in animals have provided a powerful tool for identifying the relative similarity of the discriminative stimulus effects of drugs and, by comparison with known drugs of abuse, the generation of hypotheses regarding the abuse potential of these drugs (32).

Drug discrimination typically involves training an animal to produce a particular response in a given drug state for a food reinforcer and to produce a different response in the placebo or drug-free state. The interoceptive cue state (produced by the drug) controls the behavior as a discriminative stimulus or cue that informs the animal to make the appropriate response in order to gain reinforcement. The choice of response that follows administration of an unknown test compound can provide valuable information about the similarity of that drug's interoceptive cue properties to those of the training drug.

Fixed-ratio Operant Procedures

Some of the original drug discrimination procedures utilized a T-shaped maze escape procedure (53). However, high drug doses are required and the T-shaped maze is not easily automated. More commonly, an appetitively motivated operant procedure is used where the rat has access to two levers (11). Responding on one lever (e.g., left lever) is reinforced on a fixed-ratio 10 schedule for food following injection of the training drug; the other lever is reinforced on a fixed-ratio 10 schedule for food in sessions that follow the injection of the drug vehicle (for details see ref. 32). Schedules of reinforcement other than fixed-ratios and species other than rats, such as rhesus and squirrel monkeys, are used in drug discrimination procedures. However, according to recent trends, rats and fixed-ratio schedules are most commonly employed (67).

Tests of generalization to a novel drug can be interspersed among the training sessions once performance has stabilized. Such tests are often conducted once or twice each week. Alternatively, an entire drug generalization function can be generated in a few hours of a single day using a cumulative dosing method where animals are tested in a series of short sessions with a time-out between each short session (3).

In tests of stimulus generalization, data are often collected only until the delivery of the first reinforcer, which eliminates the influence of reinforcement on subsequent choice responding (which would effectively place the animal in a new training situation). Alternatively, the sessions can be conducted as extinction sessions in which no reinforcers are delivered or can be conducted when responses on both levers are reinforced.
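One common way to summarize such a test is the percentage of responses made on the drug-appropriate lever up to the first reinforcer. The sketch below assumes a fixed-ratio 10 requirement on each lever, counts presses until either lever reaches that requirement, and reports the percentage emitted on the drug lever. The function name `percent_drug_lever`, the lever labels, and the press sequences are hypothetical, and the simple (non-resetting) counting rule is an assumption of this sketch.

```python
def percent_drug_lever(presses, ratio=10):
    """Percentage of drug-lever responses up to the first completed ratio.

    presses -- sequence of "drug" or "vehicle" labels in the order emitted
    ratio   -- fixed-ratio requirement on each lever (FR 10 by default)
    Returns None if no presses were made.
    """
    counts = {"drug": 0, "vehicle": 0}
    for lever in presses:
        counts[lever] += 1
        if counts[lever] == ratio:
            break  # first reinforcer would be delivered here; stop counting
    total = counts["drug"] + counts["vehicle"]
    return 100.0 * counts["drug"] / total if total else None


# Hypothetical test session: two vehicle-lever presses, then the drug lever.
test_presses = ["vehicle"] * 2 + ["drug"] * 10 + ["vehicle"] * 5
drug_like = percent_drug_lever(test_presses)

# Hypothetical session in which the animal selects only the vehicle lever.
vehicle_like = percent_drug_lever(["vehicle"] * 10)
print(round(drug_like, 1), vehicle_like)
```

Presses after the first completed ratio are ignored, which mirrors the rationale given above for ending data collection at the first reinforcer.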

Discrete-trials Procedure

An alternative drug discrimination training procedure used extensively by Holtzman and colleagues in both rats and squirrel monkeys (32) involves a discrete-trials procedure using avoidance or escape from shock. Here, animals are trained to lever press on one of two levers to avoid or escape electric shocks that are delivered intermittently to the grid floor of the cage. A trial is signaled by the illumination of a house light. A third lever (called the observing lever) must be pressed before the choice is made to prevent the rat from perseverating on the appropriate choice lever. A major advantage of this aversively maintained responding is that no food restriction is necessary and there is no confound from the anorexic effects of the drug in question (32). Another advantage is that higher doses of the drug can be tested, which may reveal important aspects of the pharmacology of the test drugs (62).

Advantages and Disadvantages of Animal Models for the Positive Reinforcing Properties of Drugs

The advantages of intravenous self-administration and drug discrimination as animal models for the reinforcing effects of drugs are numerous. Drug self-administration has high sensitivity to low doses of drugs, it has potential utility in studying both the positive and negative reinforcing actions of drugs, drug reinforcement can be tested in drug-free conditions, and it allows precise control over the interaction of environmental cues with drug administration. As described above, both procedures have predictive validity and are reliable. Another major advantage of these procedures is that they lend themselves to within-subjects designs, limiting the number of subjects required. Indeed, once an animal is trained, full dose–effect functions can be generated for different drugs, and the animal can be tested for weeks and months. Pharmacological manipulations can be conducted with standard reference compounds to validate any effects. In addition, a rich literature on the experimental analysis of behavior is available for exploring the hypothetical constructs of drug action as well as for modifying drug reinforcement by modifying the history and contingencies of reinforcement.

The advantage of the ICSS paradigm as a model of drug effects on motivation and reward is that by directly stimulating the putative reward systems, one presumably bypasses the input side of the system and eliminates the nonspecific effects of consummatory behaviors, such as feeding, that can complicate data interpretation. Also, the behavioral threshold measure provided by ICSS procedures is easily quantifiable, and ICSS threshold estimates are very stable over periods of several months (for a review, see ref. 63). Another considerable advantage of the ICSS technique is the high reliability with which it predicts the abuse liability of drugs. For example, there has never been a false positive with the discrete-trials threshold technique (43).

The advantages of place conditioning as a model for evaluating drugs of abuse are similar to those of drug self-administration and include (a) a high sensitivity to low doses of drugs, (b) the potential utility in studying both sides of hedonic valence (e.g., both positive and negative reinforcement), (c) the fact that testing for drug reinforcement is done under drug-free conditions, and (d) the allowance for precise control over the interaction of environmental cues with drug administration (7, 69).

The disadvantages of intravenous self-administration and ICSS are largely technical; that is, the procedures require survival surgery as well as reasonably sophisticated testing apparatus. Special skills and procedures are required to implement and maintain a chronic catheter preparation, and success in maintaining viable catheter preparations in rodent studies can be poor, particularly over periods of 6 weeks or more (for details of the technical issues and procedures, see ref. 6).

One disadvantage of drug discrimination is that subjects receive numerous doses of the training drug, and any neuropharmacological changes caused by such phenomena as sensitization cannot be measured; in fact, they may be masked by the training and testing procedure. Another disadvantage is that the predictive validity of the procedure (i.e., its ability to predict abuse potential) is indirect and is based on knowledge of the class to which the previously unknown compound generalizes.

The major disadvantage of place conditioning is the enormous cost, effort, and time required to generate meaningful results. Each dose requires 8 to 12 rats in an independent (between-subjects) design, and each animal must be trained and tested numerous times yet yields only one independent data point. Contributing to these enormous costs are all the control experiments required to address issues such as state dependency, familiarity, hyperactivity, and biased initial preferences. Because only a limited number of animals can be trained at one time, even with automated apparatus, the paradigm becomes time-consuming as well.

ANIMAL MODELS OF THE NEGATIVE REINFORCING PROPERTIES OF DRUG WITHDRAWAL

Drug withdrawal from chronic drug administration is usually characterized by responses opposite to the acute initial actions of the drug. Many of the overt physical signs associated with withdrawal from drugs (e.g., alcohol and opiates) can be easily quantified. However, motivational measures of abstinence have proven to be more sensitive measures of drug withdrawal and powerful tools for exploring the neurobiological bases for the motivational aspects of drug dependence. Animal models for the motivational effects of drug withdrawal have included operant schedules, place aversion, ICSS, the elevated plus maze, and drug discrimination. Although some of these models may reflect more general malaise than others, each of these models can be considered to address a different hypothetical construct associated with a given motivational aspect of withdrawal.

Operant Drug Self-Administration in Drug-dependent Animals

Drug self-administration can easily be conducted in drug-dependent animals, and the procedures are very similar to those discussed above regarding drug self-administration in nondependent rats. However, some evidence suggests that the reinforcing efficacy of a drug can increase with dependence. Monkeys made dependent on morphine showed increases in their progressive ratio performance compared to their performance in the nondependent state (75). Also, baboons in a discrete-trials choice procedure for food and heroin showed significant behavioral elasticity when allowed access to heroin or food periodically in a nondependent state (15). In the dependent state, one would hypothesize that the animals would be much less likely to respond for food, even if the cost of heroin in terms of response requirements was dramatically increased. Thus, the reinforcing value of drugs may change with dependence. The neurobiological basis for such a change is only beginning to be investigated (42), but much evidence has been generated to show that drug dependence itself can produce an aversive or negative motivational state that is manifested by changes in a number of behavioral measures, such as response disruption, changes in reward thresholds, and place aversions.

Operant Schedules for Nondrug Reinforcers in Dependent Animals

Several operant schedules have been used to characterize the response-disruptive effects of drug withdrawal (13, 22, 33, 42), providing a readily quantifiable measure of withdrawal (e.g., response rate). These include schedules that generate high rates of responding, such as fixed ratios (22), and schedules that generate a stable, but low, steady-state rate of responding, such as the differential reinforcement of low rates of responding (DRL) (13). However, response disruption can be caused by any number of variables from motor problems to malaise and decreases in appetite, and thus other measures must be used to rule out nonspecific effects (see below).

Place Aversion

Place aversion has been used to measure the aversive stimulus effects of withdrawal (31, 42, 66). Here, in contrast to the place-preference conditioning discussed above, rats exposed to a particular environment while undergoing withdrawal will spend less time in the withdrawal-paired environment when subsequently presented with a choice between that environment and one or two possible unpaired environments. Naloxone itself will produce a place aversion in non-opiate-dependent rats, but the threshold dose required to produce a place aversion decreases significantly in dependent rats (31) (see Fig. 2). Hypothetically, identical studies could be performed in dependent rats undergoing spontaneous withdrawal. The challenge would be to find a discrete period of sufficient unconditioned stimulus value (aversive state) to be associated with a previously neutral conditioned stimulus.
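One common way to express the outcome of such a test is as a change in the time spent in the withdrawal-paired environment. The sketch below assumes a pre-conditioning and a post-conditioning test of equal length; the function names and the difference-score convention are illustrative assumptions, not the scoring method of the cited studies.

```python
def aversion_score(pre_seconds, post_seconds):
    """Post-conditioning minus pre-conditioning time in the withdrawal-paired
    environment. Negative values indicate a place aversion."""
    return post_seconds - pre_seconds


def mean_score(scores):
    """Group mean of the individual difference scores."""
    return sum(scores) / len(scores)
```

For example, a rat that spent 300 sec in the paired environment before conditioning and 180 sec afterward would receive a score of -120 sec.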

Brain Stimulation Reward

Intracranial self-stimulation (ICSS) thresholds have been used to assess changes in systems mediating reward and reinforcement processes during the course of drug dependence. Although no actual negative reinforcement is measured using this technique, it is included in this section because it constitutes a model of the aversive motivational state associated with the negative reinforcement of drug abstinence in dependent animals. Acute administration of psychostimulant drugs lowers ICSS thresholds (i.e., increases ICSS reward) (for reviews see refs. 43 and 63), and withdrawal from chronic administration of these same drugs elevates ICSS thresholds (i.e., decreases ICSS reward) (39, 45, 46) (see Fig. 3). Similar results have been observed with precipitated withdrawal in opiate-dependent rats (58). Rats trained in the discrete-trials threshold procedure showed dramatic, dose-related increases in ICSS thresholds after naloxone injections, at doses below those at which obvious physical signs of opiate withdrawal were manifest. Naloxone in this dose range had no effect on reward thresholds in nondependent animals.

Drug Discrimination

Drug discrimination can be used to characterize both specific and nonspecific aspects of withdrawal. Generalization to an opiate antagonist provides a more general nonspecific measure of opiate withdrawal intensity and time course (18, 21). A more specific aspect of withdrawal has been examined in ethanol-, diazepam-, and opiate-dependent animals trained to discriminate pentylenetetrazol, an anxiogenic-like substance, from saline. During withdrawal, generalization to the pentylenetetrazol cue has suggested an anxiogenic-like component to the withdrawal syndrome (16, 20).

Advantages and Disadvantages of Animal Models of the Negative Reinforcing Properties of Drugs

These motivational measures of drug withdrawal have most of the same advantages and disadvantages as do the positive reinforcing effects of drugs. To summarize, intravenous drug self-administration is a direct measure of the reinforcing effects of drugs, ICSS threshold procedures have high predictive validity for changes in reward valence, disruption of operant responding during drug abstinence is very sensitive, place aversion implies an aversive unconditioned stimulus, and drug discrimination allows a powerful and sensitive comparison to other drug states.

The disadvantages of each of these dependent variables are also similar to those described above for the positive reinforcing effects of drugs. Intravenous self-administration and ICSS have numerous technical challenges, disruption of operant responding is subject to nonspecific effects and is difficult to interpret in isolation, place aversion is costly because of the large number of subjects that are necessary for between-subject designs, and drug discrimination is weak on predictive validity. Clearly, each of these dependent variables in isolation has weaknesses, but when combined they can provide a powerful insight into the motivational effects of drug abstinence.

ANIMAL MODELS OF THE CONDITIONED REINFORCING PROPERTIES OF DRUGS

Some of the earliest evidence for the ability of drug-paired stimuli to function as conditioned reinforcing stimuli was provided by a study in which an anise-flavored solution of etonitazene (an opiate agonist) was provided as the sole drinking solution to non-opiate-dependent rats (72). Several months later, in a two-bottle choice situation, the animals consumed twice as much anise-flavored water as control rats. Furthermore, opiate-dependent rats that were given access to the anise-flavored etonitazene solution during morphine withdrawal consumed, several months later, twice as much anise-flavored water as rats with similar anise-flavored etonitazene experience that were never opiate-dependent. These results suggest that previously neutral stimuli can acquire reinforcing properties when paired with reinforcing drugs and that the development of dependence and the amelioration of withdrawal can also contribute to the motivational efficacy of such secondary reinforcers.

Extinction with and without Cues Associated with Drug Self-Administration

Extinction procedures can provide measures of the incentive or motivational properties of drugs by assessing the persistence of drug-seeking behavior in the absence of response-contingent drug availability. Extinction testing sessions are identical to training sessions for drug self-administration, except that no drug is delivered after completion of the response requirement.

Measures provided by an extinction paradigm reflect the degree of resistance to extinction and include the duration of extinction responding and the total number of responses emitted during the entire extinction session. The probability of reinitiating responding under extinction conditions with drug-paired stimuli, or even with stimuli previously paired with drug withdrawal, can be explored at a later time after successful extinction of the self-administration behavior.
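The two extinction measures named above can be computed from a list of response times. The sketch below is a hypothetical illustration: the text does not specify how the duration of extinction responding is defined, so the example assumes that responding is considered to have stopped at the first pause of at least `pause_criterion` seconds, a criterion introduced here only for the example.

```python
def extinction_measures(press_times, pause_criterion):
    """press_times: sorted response times (sec from session start).

    Returns the total number of responses in the session and the duration of
    extinction responding, taken (by assumption) as the time of the last
    response before the first pause of at least `pause_criterion` seconds.
    """
    duration = 0.0
    previous = 0.0
    for t in press_times:
        if t - previous >= pause_criterion:
            break  # first long pause: responding is treated as extinguished
        duration = t
        previous = t
    return {"total_responses": len(press_times), "duration": duration}
```

With responses at 10, 20, 35, 400, and 410 sec and a 300-sec pause criterion, the total is 5 responses and the duration of extinction responding is 35 sec.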

In this type of paradigm, both stimulant and opiate self-administration have been consistently reinstated following extinction in animals given systemic or intracerebral noncontingent drug infusions (e.g., priming) (23, 64). In rats, extinguished responding for a conditioned reinforcer can also be reinstated by noncontingent drug infusions (12). The effectiveness of compounds in reinstating drug self-administration decreases as their discriminative stimulus similarity to the training drug decreases (23, 64).

This specificity of drug priming is consistent with the specificity of conditioned responses in human drug users. An experimental study in human drug users indicated that cocaine-related stimuli were effective in eliciting conditioned physiological responses and self-reported cocaine craving in cocaine users, but not in opiate or nondrug users, whereas neutral stimuli were ineffective in eliciting any conditioned responses (14). In addition to general physiological responses, conditioned stimuli associated with drug administration also induce the psychological phenomenon of drug craving in humans, even after a period of abstinence (9).

Positive Reinforcing Properties of Cues Associated with Drug Self-Administration

A conditioned reinforcer can be defined as any neutral stimulus that acquires reinforcing properties through association with a primary reinforcer. In a conditioned reinforcement paradigm, subjects are usually trained in an operant box containing two levers, in which responses on one lever result in presentation of a brief stimulus followed by a drug injection (active lever), whereas responses on the other lever have no consequences throughout the experiment (inactive lever) (12, 61). Previously neutral stimuli can also acquire conditioned reinforcing properties when the drug administration is not contingent on the animal's behavior, as long as the stimulus precedes the drug injection (12). Subsequently, the ability of the previously neutral, drug-paired stimuli to maintain responding in the absence of drug injections provides a measure of the reinforcing value of these stimuli. Psychomotor stimulants also potentiate conditioned reinforcement to nondrug reinforcers (55), and the neural substrate for these effects appears to be the release of dopamine in the nucleus accumbens (38).

Second-order schedules can also be used as a measure of the conditioned reinforcing properties of drugs. As described above, completion of the first component or unit of the schedule, rather than an individual response, produces the terminal event according to another overall schedule (25). In some versions of a second-order schedule, completion of the first component or unit produces a stimulus (2-sec light) and then completion of a fixed number of first components produces the light and drug.
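The logic of the version just described, in which each completed unit produces a brief light and a fixed number of completed units produces the light plus drug, can be sketched as follows. This is a minimal illustration assuming fixed-ratio units; the function and parameter names (`unit_ratio`, `units_per_drug`) are hypothetical and the values are not taken from any cited study.

```python
def second_order_schedule(n_responses, unit_ratio, units_per_drug):
    """Simulate a second-order schedule with fixed-ratio units.

    Every `unit_ratio` responses completes one unit and produces the brief
    stimulus; every `units_per_drug`-th completed unit produces the stimulus
    together with the drug. Returns (response_number, event) pairs.
    """
    events = []
    responses_in_unit = 0
    units_completed = 0
    for response in range(1, n_responses + 1):
        responses_in_unit += 1
        if responses_in_unit == unit_ratio:
            responses_in_unit = 0
            units_completed += 1
            if units_completed == units_per_drug:
                units_completed = 0
                events.append((response, "light+drug"))
            else:
                events.append((response, "light"))
    return events
```

With `unit_ratio=2` and `units_per_drug=3`, twelve responses yield the light alone after responses 2, 4, 8, and 10, and the light plus drug after responses 6 and 12, so long sequences of behavior are maintained by the stimulus between drug deliveries.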

Manipulation of the stimuli that are part of the second-order schedule can alter acquisition, maintenance, resistance to extinction, and recovery from extinction in second-order schedules (25). To assess the effects of conditioned reinforcement, the number of responses with the paired stimulus can be compared to the number of responses with a nonpaired stimulus. For example, substitution of drug-paired stimuli with non-drug-paired stimuli can actually decrease response rates (36). This maintenance of performance in second-order schedules with drug-paired stimuli appears to be analogous to the maintenance and reinstatement of drug seeking in humans with the presentation of drug-paired stimuli (9).

The conditioned place preference paradigm also provides a measure of conditioned reinforcement that is conceptually similar to the measures provided by the operant paradigms. Several extensive reviews have been written on the place preference paradigm (7, 65, 68, 69); also see above.

Conditioned Negative Reinforcing Effects of Withdrawal—Cues Conditioned to the Motivational Effects of Drug Abstinence

The motivation for maintenance of compulsive drug use requires more than just positive reinforcement; negative reinforcement also has been hypothesized to play a role, in that drug self-administration relieves the aversive effects of abstinence from chronic drug administration. These motivational aspects of withdrawal can be conditioned, and conditioned withdrawal has been repeatedly observed in opiate-dependent animals and humans. Patients, even detoxified subjects, report signs and symptoms resembling opiate abstinence when returning to environments similar to those associated with drug experiences. In a more direct demonstration performed in opiate addicts maintained on methadone, naloxone injections were repeatedly paired with a compound stimulus of a tone and a peppermint smell (51). Following these pairings, presentation of only the tone and odor elicited both subjective reports of sickness and discomfort and objective physical signs of withdrawal.

In early animal studies, rats made dependent by gradually increasing daily doses of morphine were exposed to a distinct environment each evening while experiencing the gradual and progressive onset of morphine abstinence. After 6 weeks of such pairings, rats exposed to this same distinct environment showed somatic signs of withdrawal up to 155 days after the last morphine injection (70, 71).

In addition to somatic signs of withdrawal, motivational signs of withdrawal have also been conditioned in animals. Morphine-dependent rhesus monkeys, trained to lever press for food on a fixed-ratio 10 schedule, showed an immediate suppression of food-maintained responding in addition to clear physical signs of withdrawal after injection of the mixed opiate agonist/antagonist nalorphine (27, 28; see also ref. 24). Presentation of the conditioned stimuli resulting from repeated pairings of a light or tone with the nalorphine injection also produced a complete suppression of this food responding (27, 28). Similar results have now been observed with rats (1). Four pairings of a compound stimulus of a tone and smell with an injection of naloxone in morphine-dependent rats trained to lever press for food on a fixed-ratio 15 schedule resulted in a reduction in operant responding in response to the tone and smell alone, and this conditioned response persisted for 1 month, even after pellet removal (see Fig. 4). The animals showed no obvious conditioned physical signs of withdrawal. These results suggest that a conditioned stimulus can acquire aversive stimulus effects that persist even in the absence of opiate occupancy of receptors and that motivational signs of conditioned withdrawal can occur in the absence of withdrawal-like somatic symptoms.

The motivational significance of conditioned withdrawal was demonstrated in a series of elegant operant studies by Goldberg and colleagues (26, 29). After repeated pairings of nalorphine and a light in monkeys self-administering morphine intravenously 24 hr/day, presentation of the light alone (with injection of saline) resulted in a conditioned increase in responding for morphine, presumably to avoid the onset of withdrawal (29). An even more compelling demonstration of the negative reinforcing properties of conditioned withdrawal was provided by a study in which opiate-dependent monkeys were given daily 2-hr sessions in which a green light signaled an intravenous infusion of nalorphine or naloxone (26). Lever-pressing by the monkey terminated the green light and prevented the injections of opiate antagonists for 60 sec. Initially, most responses occurred after the onset of injection, but with repeated pairings of the light cue and the antagonist, most of the responding occurred during the period when the light cue was illuminated, but before the antagonist infusion. These results are a powerful demonstration of the negative reinforcing properties of antagonist-precipitated drug withdrawal.
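The shift described above, from responding after injection onset to responding during the light but before the infusion, can be quantified by classifying the first lever press on each trial. The sketch below is hypothetical: the trial structure, the labels, and the function names are assumptions for illustration, and the interval between light onset and infusion is left as an input because the text does not state it.

```python
def classify_trial(first_press, light_onset, infusion_onset):
    """Label a trial by when the first lever press occurred (times in sec).

    "avoidance": during the light, before the antagonist infusion
    "escape":    at or after infusion onset
    "intertrial": before the light came on
    "no_response": no press recorded (first_press is None)
    """
    if first_press is None:
        return "no_response"
    if first_press < light_onset:
        return "intertrial"
    if first_press < infusion_onset:
        return "avoidance"
    return "escape"


def avoidance_fraction(trials):
    """trials: list of (first_press, light_onset, infusion_onset) tuples."""
    labels = [classify_trial(*trial) for trial in trials]
    return labels.count("avoidance") / len(labels)
```

An increase in the avoidance fraction across sessions would correspond to the pattern reported in the text, in which responding moves into the period signaled by the light.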

Advantages and Disadvantages of Animal Models of the Conditioned Reinforcing Properties of Drugs

The advantage of the extinction paradigm as a model for the conditioned reinforcing effects of drugs is that it can be a reliable indicator of the ability of conditioned stimuli to reinitiate drug-seeking behavior and thus has predictive validity as a measure of drug relapse. The conditioned-reinforcement paradigm has the advantage of assessing the reinforcing value of a drug infusion in the absence of acute effects of the self-administered drug that could influence performance or other processes that interfere with reinforcing functions. For example, nonspecific effects of manipulations administered before the stimulus-drug pairings do not directly affect the assessment of the reinforcing value of the conditioned stimuli because the critical test can be conducted several days after the stimulus-drug pairings. Also, the paradigm contains a built-in control for nonspecific motor effects of a manipulation by its assessment of the number of responses on an inactive lever.

One of the advantages of second-order schedules is that they reliably maintain high rates of responding in a variety of species (thousands of responses per session in monkeys) and extended sequences of behavior before any drug administration (25, 36). Thus, potentially disruptive nonspecific acute drug and treatment effects on response rates can be minimized. High response rates can be maintained even for doses that decrease rates during a session on a regular fixed-ratio schedule, indicating that performance on the second-order schedule can be less affected by those acute effects of the drug that disrupt operant responding (36). However, there can still be disruption in response rates under second-order schedules at high doses per injection, unless all the injections occur at the end of the session. These schedules have predictive validity for drug abuse potential, because performance in second-order schedules is maintained by injections (intravenous, intramuscular, or oral) of a variety of drugs that are abused by humans, with the animals exhibiting similar behavioral patterns in second-order schedules that terminate in drug injections.

Similar advantages can be observed for the models of conditioned negative reinforcing effects of withdrawal. These paradigms are reliable measures of the negative reinforcing effects of drug dependence and may have predictive validity as measures of protracted abstinence. Future studies will be required to establish this relationship. In addition, in all of these procedures the stimuli have taken on secondary reinforcing properties and as such can profit from much of what is known about the experimental analysis of behavior as it relates to primary reinforcers.

The disadvantage of all of these procedures is that they involve extensive training procedures and significant learning. Thus, treatments or manipulations that affect acquisition of new associations will disrupt the development of the conditioned reinforcing effects of drugs. The challenge of future behavioral design will be to explore means of separating these hypothetical constructs, both at the behavioral and the neurobiological levels of analysis.

ANIMAL MODELS OF ACQUISITION OF DRUG-SEEKING BEHAVIOR

The focus of this chapter has been on the animal models for established drug dependence and more specifically on recent studies in rodent models. However, there are numerous models for studying the acquisition of drug-seeking behavior that involve many of the paradigms and procedures used in studying the acquisition of responding for other reinforcers. Significant success has been obtained in studying the acquisition of stimulant self-administration using smaller doses and, in rodents, a nose-poke response that has a high baseline frequency of emission (54).

Acquisition of drug self-administration also recruits variables covered in other chapters of this volume that can contribute to vulnerability for drug-seeking behavior, such as drug tolerance, drug sensitization (57), environmental variables (8), and genetic variables (see Intracellular Messenger Pathways as Mediators of Neural Plasticity, Adaptive Processes Regulating Tolerance to Behavioral Effects of Drugs, and Autism and Pervasive Developmental Disorders, Tic Disorders, Childhood Anxiety Disorders, Cocaine, Caffeine-A Drug of Abuse?, Chronic Amphetamine Use and Abuse, Pathophysiology of Tobacco Dependence, Opioids, Marijuana, and Phencyclidine).

SUMMARY AND CONCLUSIONS
Significance of Animal Models of Addiction

The importance of animal models of drug addiction rests on the assumption that findings observed in the animal models will have relevance to the social problems of drug abuse and addiction in man (60). Certainly, animal models of drug addiction can predict abuse potential (see the next section). However, the value of these animal models goes far beyond just the capacity to predict abuse potential. Animal models of drug addiction can provide a means of studying both the behavioral and biological basis of drug addiction. Factors involved in acquisition, maintenance, extinction, and reinstatement of drug reinforcement can be carefully isolated in laboratory-controlled situations. The neurobiological mechanisms involved in the positive reinforcing effects of drugs and the negative reinforcing effects of drug abstinence can be, and are being, elucidated. Perhaps even more importantly, the environmental, behavioral, and neurobiological factors that contribute to individual differences in vulnerability to drug addiction can be explored with animal models. Finally, studies of the antecedents, mechanisms, and consequences of drug addiction using animal models provide a window on the antecedents, mechanisms, and consequences of other hypothetical constructs, such as emotions, that have long intrigued behavioral biologists (4).

Validity of Existing Animal Models

Most of the animal models discussed above appear to have predictive validity and are reliable. For the positive reinforcing effects of drugs, drug self-administration, ICSS, and conditioned place preference have been shown to have predictive validity for abuse potential of drugs. Drug discrimination has predictive validity for abuse potential of drugs indirectly through generalization to the training drug. Animal models of conditioned drug effects are successful in predicting the potential for conditioned drug effects in humans. In a limited number of examples, the negative reinforcing properties of drug abstinence appear to have predictive validity for the negative reinforcing properties of abstinence in humans. Predictive validity is more problematic for such concepts as craving, largely due to the inadequate formulation of the concept of craving in humans to date (48). Virtually all of the measures described here, with the possible exception of place preference and place aversion, have demonstrated reliability. Consistency and stability of the measures, small within-subject and between-subject variability, and reproducibility of the phenomenon are characteristic of most of the measures employed in animal models of dependence.

Future Research

Most of the experimental work on the aversive motivational effects of drugs has focused on the opiate model, but abstinence from opiates, stimulants, and alcohol results in aversive motivational states characterized by increases in behavioral responsiveness to stressors in animal models of anxiety, increases in reward thresholds, disruptions in motivated behavior, and conditioned place aversions. This (presumably aversive) motivational state serves as a dysregulator of motivational homeostasis and thus provides a mechanism for a negative-reinforcement process in which the organism administers the drug to alleviate the aversive withdrawal state.

Clearly, much remains to be explored about the neurobiological mechanisms of the unconditioned positive and negative motivational state(s), and in particular the conditioned positive and negative motivational state(s) associated with drug use and withdrawal. The study of the changes in the central nervous system that are associated with these homeostatic dysregulations may provide not only the key to drug dependence, but also the key to the etiology of psychopathologies associated with anxiety and affective abnormalities. The animal models described above provide a basis to begin such studies; however, a rich behavioral literature exists from which to draw even more creative and innovative models of the behavioral processes that form the syndrome of drug dependence.

published 2000