Learning to stop: two types of neurons cooperate to adjust behaviour
Updating what we have learned in order to control our actions is a fundamental aspect of brain function. It eliminates behaviours that use valuable energy for no reward. Here, we show that two interlaced brain cell types allow this change in performance.
We continuously learn about the consequences of our actions and change our behaviour accordingly, so that in future we can repeat a response that produced the desired outcome and hold back learned behaviours that are no longer appropriate. Imagine you are hungry and go to the vending machine to purchase a snack. As usual, you insert enough money, select the product by pressing a button and wait for the machine to dispense it. However, this time nothing comes out! You might initially try inserting more coins or shaking the machine furiously, but eventually you would just stop trying and walk away. More importantly, what are the chances that you will go back to the same machine and engage in the same sequence of actions over the following days? Very low, as hopefully you will remember that your efforts will likely be unsuccessful.
This ordinary situation reveals a naturally occurring process of behaviour change called extinction, by which a learned response can be eliminated when it no longer leads to a reward. Extinction is used in therapies designed to get rid of unhealthy behaviours, such as smoking and overeating. But learned behaviours are not like dinosaurs: when they are extinguished, they are not lost forever. Instead, they remain largely intact in our memory, though silenced. The process by which extinction inhibits unwanted learned behaviours (rather than erasing them) has been extensively studied in experimental psychology, but little is known about the underlying neurobiological mechanism.
To investigate how extinction learning is encoded in the brain, we first trained mice to perform an action (press a lever) to receive an outcome (food pellet) - a laboratory task equivalent to the vending machine example. Over time, mice became experts at obtaining food rewards. To induce extinction learning, we exposed a group of mice to a test session in which food pellets were suddenly no longer delivered following a lever press. As in the frustrating attempt to get a snack from the broken vending machine, mice initially pressed the lever more vigorously, but rapidly stopped working for their rewards. The following day, the mice were no longer interested in pressing the lever, which demonstrated that their initial behaviour had indeed been extinguished.
To understand the neural basis of this extinction, we studied a part of the brain called the striatum, which plays a major role in purpose-driven learning. The striatum is largely composed of two types of principal neurons: D1 and D2. We know that D1 neurons store learned behaviours, because their experimental activation can elicit responses, such as running. However, little is known about the role of D2 neurons in learning, specifically in relation to extinction learning. To test their involvement, we examined the brains of trained mice and located the D1 and D2 neurons across the striatum that had been activated as a result of the animals' learning experience.
We found that, in animals of the control group (which pressed the lever and received a reward), activated neurons were distributed in separate patches of either D1 or D2 type. However, in mice that underwent extinction, activated D1 and D2 neurons occupied the same area within the striatum, thus defining a hotspot for behavioural change. Surprisingly, when D2 neurons were genetically removed from these hotspots, mice were unable to adjust their responses, and kept repeating the action that no longer secured a food reward. This shows that, in order to change a learned behaviour, D2 neurons need to interact with the D1 neurons that encode it. These results demonstrate for the first time that D1 and D2 neurons work cooperatively in specific areas of the striatum to appropriately silence unwanted actions, which is vital to adaptive behaviour.
For decades, D1 and D2 neurons were thought to control striatal function in opposing ways, but our findings add support to the idea that they may instead act together. Through the local interaction between these two types of neurons, new information can be added to pre-existing learning. This D2-to-D1 influence allows important new variables in the environment to be integrated (no snacks here), thus sparing pointless efforts (insert the money, then press the button) and granting access to a whole new set of adaptive actions (the pizzeria might be open).
Dr. Ayala Sela, Associate Editor