Reinforcement Learning#

Introduction to RL#

Reinforcement learning (RL) is a machine learning approach in which an agent learns to perform a task by interacting with a dynamic environment. Combinations of states and actions are associated with reward (or punishment), and the agent learns to make a series of decisions that maximizes expected reward.

One branch of RL focuses on computer algorithms that learn to solve problems without human intervention and without being explicitly programmed for the task. RL has been used since the 1980s to train agents to perform many kinds of tasks; self-driving cars are a prominent example. At the heart of RL is error-driven learning: the difference between expected and received outcomes (the prediction error) is used to update internal value representations associated with states and actions.
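The error-driven update described above can be sketched in a few lines of Python. This is a minimal illustration, assuming a simple delta rule of the kind used in the Rescorla-Wagner model; the function name, the learning rate of 0.5, and the reward sequence are illustrative choices, not taken from the course materials.

```python
def delta_rule_update(value, reward, learning_rate):
    """Move a value estimate toward the observed reward.

    The step size is a fraction (learning_rate) of the prediction error,
    i.e. the difference between the received and the expected reward.
    """
    prediction_error = reward - value
    return value + learning_rate * prediction_error


# Illustrative run: the agent starts with no expectation (value 0.0)
# and receives a reward of 1.0 on four consecutive trials.
value = 0.0
history = []
for reward in [1.0, 1.0, 1.0, 1.0]:
    value = delta_rule_update(value, reward, learning_rate=0.5)
    history.append(value)

print(history)  # [0.5, 0.75, 0.875, 0.9375]
```

Each trial closes half of the remaining gap between the estimate and the reward, so the prediction error shrinks as the value estimate converges.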

With advances in computing power and in the efficiency of updating multi-layer networks, reinforcement learning is also applied to deep neural networks to train modern AI models. A major advance in large language models (LLMs) came from adding RL to deep neural networks trained on word and sentence prediction, which led to ChatGPT, released in late 2022. Programs trained with reinforcement learning can beat the best human players at games like Go and poker, as well as at video games.

Another branch focuses on modeling how human or nonhuman animals learn and make decisions. This branch is of primary interest in psychology and neuroscience. Examples include:

Modeling the process of reward and punishment learning in classical and operant conditioning (associated with studies of learning and Decision Science)
Estimating parameters associated with hidden processes that are not directly observable (e.g., learning and forgetting rates)
Relating model parameters to clinical characteristics and studying the effects of treatment (associated with the field of Computational Psychiatry and Computational Cognitive Neuroscience)
Relating model parameters and/or estimates of hidden states to measures of brain activity (associated with the fields of Cognitive, Affective and Decision Neuroscience)
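As a rough illustration of estimating a hidden parameter such as a learning rate, the sketch below simulates choices from a simple learner on a two-armed bandit and then recovers the learning rate by grid search over the negative log-likelihood. Everything here is an assumption made for illustration: the softmax choice rule, the reward probabilities, the true parameter values, the grid, and the simplification of holding the inverse temperature fixed at its true value. It is not the method used in the course's live scripts.

```python
import math
import random


def prob_choose_first(q, beta):
    """Softmax probability of choosing arm 0, with inverse temperature beta."""
    return 1.0 / (1.0 + math.exp(-beta * (q[0] - q[1])))


def simulate(alpha, beta, reward_probs, n_trials, rng):
    """Generate choices and rewards from a delta-rule learner."""
    q = [0.0, 0.0]
    choices, rewards = [], []
    for _ in range(n_trials):
        choice = 0 if rng.random() < prob_choose_first(q, beta) else 1
        reward = 1.0 if rng.random() < reward_probs[choice] else 0.0
        q[choice] += alpha * (reward - q[choice])  # error-driven update
        choices.append(choice)
        rewards.append(reward)
    return choices, rewards


def negative_log_likelihood(alpha, beta, choices, rewards):
    """Score how poorly a candidate alpha explains the observed choices."""
    q = [0.0, 0.0]
    nll = 0.0
    for choice, reward in zip(choices, rewards):
        p_first = prob_choose_first(q, beta)
        p_choice = p_first if choice == 0 else 1.0 - p_first
        nll -= math.log(max(p_choice, 1e-12))  # guard against log(0)
        q[choice] += alpha * (reward - q[choice])
    return nll


rng = random.Random(0)  # fixed seed so the simulation is reproducible
choices, rewards = simulate(
    alpha=0.3, beta=5.0, reward_probs=(0.8, 0.2), n_trials=500, rng=rng
)

# Candidate learning rates 0.05, 0.10, ..., 0.95; beta is held at its true value.
grid = [i / 20 for i in range(1, 20)]
fitted_alpha = min(
    grid, key=lambda a: negative_log_likelihood(a, 5.0, choices, rewards)
)
print(fitted_alpha)
```

In practice, all free parameters would be fitted jointly with a numerical optimizer or a hierarchical Bayesian model; the grid search is used here only because it makes the likelihood logic easy to follow.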

For a summary of RL algorithms and the fundamental underlying equations, see also:

https://github.com/FrancescoSaverioZuppichini/Reinforcement-Learning-Cheat-Sheet/blob/master/rl_cheatsheet.pdf

https://roboticsbiz.com/wp-content/uploads/2021/11/Reinforcement-Learning-Cheatsheet.pdf

Guest lecture: Alireza Soltani#

Stay tuned for slides

Hands-on tutorials#

Hands-on 1: Introduction#

Guest instructor: Aryan Yazdanpanah

Download the Matlab live script

Hands-on 2: Q-learning#

Guest lecture: Heejung Jung

Q-learning tutorial using OpenAI’s code:

Here, we walk through a Q-learning tutorial developed by Joy Zhang (linked below). It compares a random agent with a Q-learning agent in a taxi driving environment, and beautifully walks through each component of the Q-learning model.
Note that the tutorial is from 2021, and OpenAI’s gym has changed since then: it is now called Gymnasium. This means Zhang’s tutorial needs slight updates, which I’ve made in my edits.
So check out the original blog, but also follow this Colab, Heejung’s edits to Zhang’s tutorial, which lets you run the code without debugging.
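To give a feel for what a Q-learning agent does before opening the tutorial, here is a minimal tabular sketch. It does not use the Taxi environment or the Gymnasium library: `CorridorEnv` is a hypothetical five-state stand-in written for this example, and the hyperparameters are illustrative. Its `reset()` and `step()` methods only mimic the shape of the Gymnasium interface, where `reset()` returns an observation and an info dict, and `step()` returns five values (observation, reward, terminated, truncated, info); older gym code typically unpacked four values from `step()`.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: states 0..4, start at 0, goal at 4."""

    n_states = 5
    n_actions = 2  # 0 = move left, 1 = move right

    def reset(self):
        self.state = 0
        return self.state, {}

    def step(self, action):
        move = 1 if action == 1 else -1
        # Clip the position so the agent cannot leave the corridor.
        self.state = min(max(self.state + move, 0), self.n_states - 1)
        terminated = self.state == self.n_states - 1
        reward = 10.0 if terminated else -1.0  # step cost, goal bonus
        return self.state, reward, terminated, False, {}


def train(env, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1,
          max_steps=100, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = [[0.0] * env.n_actions for _ in range(env.n_states)]
    for _ in range(episodes):
        state, _ = env.reset()
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = rng.randrange(env.n_actions)  # explore
            else:
                action = max(range(env.n_actions),
                             key=lambda a: q[state][a])  # exploit
            next_state, reward, terminated, truncated, _ = env.step(action)
            # Bootstrapped target: no future value after a terminal state.
            target = reward if terminated else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if terminated or truncated:
                break
    return q


q_table = train(CorridorEnv())
# Greedy action in each non-terminal state (1 means "move right").
greedy_policy = [max(range(2), key=lambda a: q_table[s][a]) for s in range(4)]
print(greedy_policy)
```

With these settings, the learned greedy policy should favor moving right in every state, since that is the shortest path to the goal. The same update rule applies in the Taxi environment, only with a larger table of states and actions.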

Links:

Q-learning tutorial by Joy Zhang: https://www.gocoder.one/blog/rl-tutorial-with-openai-gym/
Heejung’s edits to Zhang’s tutorial
OpenAI’s website (RL intro): https://spinningup.openai.com/en/latest/spinningup/rl_intro.html
Learn more about OpenAI’s Taxi environment: https://www.gymlibrary.dev/environments/toy_text/taxi/

Additional RL resources are hyperlinked in this document or linked here in Heejung’s github

Hands-on 3: Modeling cue effects on pain#

(and effort and vicarious pain) in SpaceTop

Guest instructors: Aryan Yazdanpanah and Heejung Jung

Download the SpaceTop CueRL sample dataset

Download the Matlab live script

More resources#

Video lectures#