In past posts we have discussed the benefits of a balanced approach to training. Today's post provides guidance on the appropriate use of positive punishment, i.e. corrections for non-compliance.
In the last post we discussed the problems associated with Positive Reinforcement only training. The following points identify the two main types of Punishment from a scientific perspective.
Positive Punishment - The giving of something unpleasant to assist in the reduction of an undesired action
Negative Punishment - The removal of something pleasant to assist in the reduction of an undesired action
You will have noticed that one involves the administering of a physical correction or something the dog finds unpleasant or aversive, while the other involves the withholding of something the dog finds pleasant or appetitive.
Dogs are hard-wired to experiment, test and respond to a given stimulus, and to shape their future reaction to that stimulus according to cause and effect: risk versus reward. Quite simply, the way a dog responds to a given stimulus is based on the outcome of its earlier interactions with it. If the outcome was pleasing (appetitive), the behaviour will be reinforced. If the outcome was unpleasant (aversive), the behaviour will not be reinforced and may be reduced, depending on how aversive the outcome was to the dog.
Where most people go wrong is in failing to understand that a dog can find something appetitive even if there is a degree of discomfort involved in pursuing that stimulus. This is heightened where the behaviour around the stimulus is instinctive. (I will leave instinctive responses to stimuli, and their subsequent effect on a dog, for a later post, as the topic is quite detailed and lengthy.) If a dog shows interest in a stimulus, the intensity of its response to that stimulus is critical in determining how and when to administer punishment. Most people who have been conditioned not to administer punishment will be content to use only their voice to discourage the behaviour. While I would agree that this can work with some soft or timid dogs exhibiting low-intensity behaviour, in general it will not work with typical dogs in normal response situations, let alone with dogs of more intense temperament and higher-intensity behaviour.
If a dog finds a certain behaviour appetitive and the motivation for ceasing that behaviour is not strong enough, the behaviour will not be terminated. In fact the dog may show an enhanced response to that stimulus, which is classified as a behavioural escalation.
This is where this post can become quite technical, so I will try to keep it as simple as possible. In order to terminate the behaviour, the dog must find meaning in the aversive aspect of the interaction. For example, if a dog likes honey and has so far not been stung while raiding a beehive, the behaviour is reinforced, because the honey is the appetitive outcome. If one day the dog is stung by one bee, the behaviour may or may not be terminated, as the benefit of taking the honey may be worth dealing with the one sting. However, if the dog is stung by multiple bees at once, the balance may shift so that the dog decides the honey is not worth the stings. In this case the reward is not worth the risk.
This is what is classified as an effective correction. The purpose of an effective correction is to change the behaviour, to a degree, in the desired direction. It happens to dogs every day in the course of their day-to-day lives. If we leave this tool out of our training kit, we are missing one big part of the process.
The intention of punishment is to change the behaviour, if possible with a single action, rather than through multiple lower-level or weaker corrections, which by their nature progressively desensitise the dog. In fact quite the opposite is beneficial: the correction should sensitise the dog in specific situations. Just how this is done depends on the dog, its temperament, the intensity of the behaviour and, most of all, timing.
The way dogs actually balance risk and reward in their daily exploration is by determining what is in it for them, based on previous experience with that situation. The greater the distance between what the dog finds appetitive about a given behaviour and what it finds aversive about that behaviour, the clearer it is to the dog where its advantage lies in that situation. This is a fundamental part of balanced dog training.
Let me give a very simple example of positive punishment that we can all understand.
A dog knows how to drop but elects not to. If we simply say drop again without consequence, the dog not only experiences no negative effect for non-compliance, but has also learned that it is able to ignore us in future. If we administer some form of positive punishment for non-compliance, there is now a direct effect for non-compliance that the dog can relate to. We now have an aversive extreme for the dog. If the dog complies when told to drop, and on compliance is released and played with, with genuine enthusiasm, there is now a direct effect for compliance that the dog can relate to. We now have an appetitive extreme for the dog. With this information in hand, the dog is able to decide whether or not to comply based on risk and reward. Very simple, and very effective.
Let us use an example of where negative punishment can be very useful.
Imagine you are about to take your dog for a walk, and the dog is expecting it. You tell the dog to drop and it doesn't. If you simply say drop again without consequence, the dog not only experiences no negative effect for non-compliance, but has also learned that it is able to ignore you in future. If you use negative punishment, putting the dog back into the back yard and not taking it for a walk, the dog learns that non-compliance has cost it the walk it was expecting. The negative aspect here is the loss of the walk, which the dog found very appetitive. This may need to be repeated after, say, 20 minutes to drive the message home to the dog. If the dog drops, the walk is on offer. In this instance we also have the two extremes for compliance and non-compliance.
As advised previously, a balanced approach to risk and reward for the dog is the optimum way to obtain predictability in response.
Future posts will cover schedules of reinforcement for compliance and non-compliance.