A reinforcement learning approach to instrumental contingency degradation in rats - INRIA - Institut National de Recherche en Informatique et en Automatique
Journal article. Journal of Physiology - Paris. Year: 2011

A reinforcement learning approach to instrumental contingency degradation in rats

Abstract

Goal-directed action involves a representation of action consequences. Adapting to changes in action-outcome contingency requires the prefrontal region. Indeed, rats with lesions of the medial prefrontal cortex do not adapt their free operant response when food delivery becomes unrelated to lever-pressing. The present study explores the bases of this deficit through a combined behavioural and computational approach. We show that lesioned rats retain some behavioural flexibility and stop pressing if this action prevents food delivery. We attempt to model this phenomenon in a reinforcement learning framework. The model assumes that distinct action values are learned incrementally in distinct states. It represents states as n-tuples of events, emphasizing sequences of events rather than the continuous passage of time. Probabilities of lever-pressing and of visits to the food magazine observed in the behavioural experiments are first analyzed as a function of these states, to identify the sequences of events that influence action choice. Observed action probabilities appear to be essentially a function of the last event that occurred, with reward delivery significantly facilitating magazine visits and waiting significantly facilitating lever-pressing. Behavioural sequences of normal and lesioned rats are then fed into the model: action values are updated at each event transition according to the SARSA algorithm, and predicted action probabilities are derived through a softmax policy. The model captures the time course of learning, as well as the differential adaptation of normal and prefrontal-lesioned rats to contingency degradation, with the same parameters for both groups. The results suggest that simple temporal difference algorithms with low learning rates can largely account for instrumental learning and performance. Prefrontal-lesioned rats appear to differ from control rats mainly in their low rates of visits to the magazine after a lever press, and in their inability to initially detect weak contingency changes.
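
As an illustration of the modelling step described in the abstract, the following is a minimal Python sketch of SARSA updates over event-tuple states with a softmax policy. It is not the authors' implementation: the action names, the parameter values (`alpha`, `gamma`, `beta`), the `"start"` and `"reward"` event labels, and the `SarsaModel` and `replay` helpers are all assumptions made for the example.

```python
import math
from collections import defaultdict

# Hypothetical action set; the abstract mentions lever-pressing,
# magazine visits and waiting.
ACTIONS = ("press", "visit", "wait")


def softmax(values, beta):
    """Turn action values into probabilities (beta = inverse temperature)."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp(beta * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


class SarsaModel:
    def __init__(self, alpha=0.05, gamma=0.9, beta=3.0, n=1):
        # alpha: learning rate, gamma: discount factor, n: events per state.
        # All default values are assumptions, not taken from the paper.
        self.alpha, self.gamma, self.beta, self.n = alpha, gamma, beta, n
        self.q = defaultdict(float)  # (state, action) -> action value

    def state(self, events):
        """A state is the tuple of the last n events."""
        return tuple(events[-self.n:])

    def action_probs(self, state):
        probs = softmax([self.q[(state, a)] for a in ACTIONS], self.beta)
        return dict(zip(ACTIONS, probs))

    def update(self, s, a, r, s_next, a_next):
        """SARSA: Q(s,a) += alpha * (r + gamma * Q(s',a') - Q(s,a))."""
        td_error = r + self.gamma * self.q[(s_next, a_next)] - self.q[(s, a)]
        self.q[(s, a)] += self.alpha * td_error
        return td_error


def replay(model, steps):
    """Feed an observed sequence of (action, reward) pairs into the model.

    Each transition is updated once its successor action is known, so the
    final step of the sequence is left without an update.
    """
    events = ["start"]
    prev = None  # (state, action, reward) waiting for its successor
    for action, reward in steps:
        s = model.state(events)
        if prev is not None:
            model.update(*prev, s, action)
        prev = (s, action, reward)
        events.append(action)
        if reward > 0:
            events.append("reward")  # reward delivery is itself an event
```

For example, replaying a sequence in which every press is rewarded and followed by a magazine visit, `replay(SarsaModel(alpha=0.1), [("press", 1.0), ("visit", 0.0)] * 50)`, raises the value of pressing in the state `("visit",)`, so that `action_probs(("visit",))` favours `"press"`. Setting `n` above 1 makes the state depend on longer event sequences.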

Dates and versions

hal-00642715 , version 1 (18-11-2011)

Cite

Alain Dutech, Etienne Coutureau, Alain Marchand. A reinforcement learning approach to instrumental contingency degradation in rats. Journal of Physiology - Paris, 2011, Computational Neuroscience: Neurocomp 2010, 105 (1-3), pp.36-44. ⟨10.1016/j.jphysparis.2011.07.017⟩. ⟨hal-00642715⟩