Brain Dump

Hebbian Learning

Tags
adaptive-intelligence

Hebbian learning is a form of unsupervised learning in neural networks that uses one of a class of learning rules known as Hebbian rules.

Hebbian learning works in [see page 15, terms] of the spike rate of a neuron: highly correlated input spikes lead to output spikes. Hebbian learning is therefore chiefly concerned with correlation.
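As a rough illustration, here is a minimal sketch of a rate-based Hebbian update in Python (the learning rate, sizes, and variable names are all illustrative assumptions, not from the source): the weight from each input grows in proportion to the product of pre- and post-synaptic rates, so correlated inputs strengthen their synapses.

```python
import numpy as np

alpha = 0.01                       # assumed learning rate
v_pre = np.array([0.9, 0.1, 0.8])  # firing rates of 3 pre-synaptic neurons
w = np.array([0.5, 0.5, 0.5])      # weights onto one post-synaptic neuron

v_post = w @ v_pre           # linear rate model of the output
w += alpha * v_post * v_pre  # Hebbian rule: dw_j = alpha * v_post * v_pre_j
```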

For a neuron \( A \) with \( N \) pre-synaptic neurons we [see page 16, define]:

\begin{align*}
\vec{X}^i = \begin{pmatrix}
  v^i_1 \\
  v^i_2 \\
  \vdots \\
  v^i_N
\end{pmatrix}
\end{align*}

to be the vector of input potentials to \( A \) from each of its \( N \) pre-synaptic neurons for a time-slice \( i \). These vectors are used in the derivation below.
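One way to form these vectors, sketched under the assumption that the recorded potentials sit in a slices-by-neurons array (all names and sizes here are hypothetical):

```python
import numpy as np

N, T = 4, 3                        # assumed: N pre-synaptic neurons, T time-slices
potentials = np.random.rand(T, N)  # potentials[i, j] plays the role of v^i_{j+1}

# X[i] is the column vector of inputs to A at time-slice i
X = [potentials[i].reshape(N, 1) for i in range(T)]
```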

Neuron Correlation

Consider the minimal Hebbian rule from earlier and substitute in the equation for a neuron's output potential:

\begin{align}
\label{eq:hebb-rule}
\Delta{w_{ij}} &= \alpha v_i^{\text{post}} v_j^{\text{pre}} \\
               &= \alpha \left( \sum_{k} w_{ik} v_k^{\text{pre}} \right) v_j^{\text{pre}}
\end{align}
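Under this linear-output assumption, the substituted rule for one sample is just an outer product of the output and input potentials. A small sketch (sizes, seed, and names are illustrative):

```python
import numpy as np

alpha = 0.01
rng = np.random.default_rng(0)
W = rng.random((2, 5))  # weights: 2 post-synaptic neurons, 5 inputs
v_pre = rng.random(5)   # one sample of pre-synaptic potentials

v_post = W @ v_pre                    # v_i^post = sum_k w_ik v_k^pre
dW = alpha * np.outer(v_post, v_pre)  # dw_ij = alpha * v_i^post * v_j^pre
```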

If we average eq:hebb-rule to find the average change in weights after sending every input sample through the network, we find the average learning rule:

\begin{align}
\langle \Delta{w_{ij}} \rangle &= \frac{1}{P} \sum_{\mu = 1}^{P} \Delta{w_{ij}^{\mu}} \\
    &= \frac{1}{P} \sum_{\mu = 1}^{P} \alpha \left( \sum_{k} w_{ik} v_k^{\mu} \right) v_j^{\mu} \\
    &= \alpha \sum_{k} w_{ik} \frac{1}{P} \sum_{\mu = 1}^{P} v_k^{\mu} v_j^{\mu} \label{eq:hebb-correlation}
\end{align}

Observe that the inner average in the final form above is exactly the correlation formula. This means that weight changes are [see page 10, driven] by correlations in the input, because they depend on the entries \( \frac{1}{P} \sum_{\mu = 1}^{P} v_k^{\mu} v_j^{\mu} \) of the average input correlation matrix over all input neurons.
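This can be checked numerically: averaging the per-sample outer-product updates gives the same matrix as multiplying the weights by the input correlation matrix, labelled \( Q \) here for convenience, with entries \( Q_{kj} = \frac{1}{P} \sum_{\mu} v_k^{\mu} v_j^{\mu} \). A sketch with assumed sizes and names:

```python
import numpy as np

alpha, P = 0.01, 1000
rng = np.random.default_rng(1)
W = rng.random((2, 5))  # current weights
V = rng.random((5, P))  # column mu holds the input sample v^mu

# Average of the per-sample Hebbian updates
avg_dW = alpha * np.mean(
    [np.outer(W @ V[:, mu], V[:, mu]) for mu in range(P)], axis=0)

Q = (V @ V.T) / P  # input correlation matrix: Q_kj = (1/P) sum_mu v_k v_j
assert np.allclose(avg_dW, alpha * W @ Q)
```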

Note that to arrive at eq:hebb-correlation we:

  • Moved \( \alpha \) to the front because it does not depend on either summation index
  • Swapped the position of \( \sum_{\mu = 1}^{P} \) with \( \sum_{k} \)
  • Moved \( w_{ik} \) outside of \( \sum_{\mu = 1}^{P} \) because it does not depend on \( \mu \)

Links to this note