Brain Dump

Deep Reinforcement Learning

Tags
adaptive-intelligence

Using deep neural networks in a reinforcement learning problem. This allows us to encode more complex states in our input and therefore learn and respond to more distinct behaviours.

The idea behind [see page 4, deep RL] is to combine deep learning and Temporal Difference learning.

Can be seen as taking supervised learning concepts into reinforcement learning.

Our [see page 6, goal] is to learn the weights that best predict the expected rewards Q, i.e. minimise the error between our current prediction and the received reward (under TD learning, the received reward plus the discounted estimate for the next state).
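As a rough illustration of the error being minimised, here is a minimal sketch of a one-step TD target and the squared error against the current prediction. It assumes a Q-learning-style target (reward plus the discounted best next-state estimate); the slides may use a different TD variant. The function names and the default discount `gamma=0.9` are hypothetical and not taken from the slides.

```python
def td_target(reward, next_q_values, gamma=0.9, terminal=False):
    """One-step TD target: reward plus the discounted best next-state estimate.

    Assumption: a Q-learning-style max over the next state's action values.
    """
    if terminal or not next_q_values:
        # No future reward to bootstrap from at the end of an episode.
        return reward
    return reward + gamma * max(next_q_values)


def td_error(q_prediction, reward, next_q_values, gamma=0.9, terminal=False):
    """Difference between the TD target and the current prediction Q(s, a)."""
    return td_target(reward, next_q_values, gamma, terminal) - q_prediction


def squared_loss(q_prediction, reward, next_q_values, gamma=0.9, terminal=False):
    """Half the squared TD error, the quantity the weights are trained to reduce."""
    delta = td_error(q_prediction, reward, next_q_values, gamma, terminal)
    return 0.5 * delta ** 2
```

In a deep RL setting `q_prediction` and `next_q_values` would both come from the network, and the loss would be reduced by gradient descent on the weights.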

TODO: Include [see page 7, math].

Can be implemented using a deep feed-forward neural network; however, instead of using the network to classify its inputs, we use it to estimate the expected reward of each action (which we then use to choose an action).

We'll probably want a linear activation function on the output layer to make associating actions with results easier: expected rewards are unbounded real values rather than class probabilities.
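To make the network idea concrete, here is a small sketch of a feed-forward pass that outputs one expected-reward estimate per action and then picks the highest-valued action. The layer sizes, the weights in `example_params`, and all function names are made up for illustration, and the ReLU hidden layer with a linear output layer is an assumed layout rather than the one from the slides. Only greedy selection is shown; exploration is left out.

```python
def relu(x):
    """Rectified linear unit used for the hidden layer."""
    return x if x > 0.0 else 0.0


def dense(inputs, weights, biases, activation=None):
    """One fully connected layer; weights[j] holds the weights of unit j."""
    outputs = []
    for unit_weights, bias in zip(weights, biases):
        total = bias + sum(w * x for w, x in zip(unit_weights, inputs))
        # With activation=None the layer is linear: the weighted sum is returned as-is.
        outputs.append(activation(total) if activation else total)
    return outputs


def q_values(state, params):
    """Forward pass: ReLU hidden layer, then a linear output with one Q per action."""
    hidden = dense(state, params["w_hidden"], params["b_hidden"], relu)
    return dense(hidden, params["w_out"], params["b_out"])


def greedy_action(state, params):
    """Index of the action with the highest estimated reward."""
    qs = q_values(state, params)
    return max(range(len(qs)), key=lambda a: qs[a])


# Hypothetical weights: 2 state inputs, 2 hidden units, 3 actions.
example_params = {
    "w_hidden": [[1.0, -1.0], [0.5, 0.5]],
    "b_hidden": [0.0, 0.0],
    "w_out": [[1.0, 0.0], [0.0, 2.0], [-1.0, -1.0]],
    "b_out": [0.0, 0.0, 0.0],
}
```

Because the output layer is linear, the estimates can be negative or larger than 1, which a softmax or sigmoid output would not allow.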

Note: In slide 9 we define H as the ReLU derivative (a step function: 1 for positive inputs, 0 otherwise).
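A quick sketch of what that definition means in practice, assuming H is the usual step function obtained by differentiating ReLU. The value at exactly 0 is a convention (0 is chosen here), and the numerical check is only a sanity test, not something from the slides.

```python
def relu(x):
    """Rectified linear unit."""
    return max(0.0, x)


def H(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise.

    ReLU is not differentiable at 0; returning 0 there is an assumed convention.
    """
    return 1.0 if x > 0.0 else 0.0


def numerical_derivative(f, x, eps=1e-6):
    """Central-difference estimate of f'(x), used to sanity-check H."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)
```

During backpropagation, H acts as a gate: the error signal passes through hidden units whose input was positive and is zeroed for the rest.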