Competitive Neural Network

Tags
adaptive-intelligence

An unsupervised clustering implementation based on neural networks, using the BCM update rule.

Formulation

  1. Assign \( x^{\mu} \longrightarrow w_k \) where \( w_k^T x^{\mu} \geq w_i^T x^{\mu} \;\forall i \). That is to say, assign input sample \( x^{\mu} \) to the closest cluster centroid \( k \), i.e. the cluster associated with the winning neuron \( k \).
  2. Update the weights of only the winning neuron \( k \) by \( \Delta w_k = \alpha (x^{\mu} - w_k) \), where \( \alpha \) is the learning rate.
  3. Repeat this for all input patterns over multiple epochs until the weights stop changing.
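
For example, with a hypothetical learning rate \( \alpha = 0.1 \), winner \( w_k = (0.5, 0.5)^T \), and input \( x^{\mu} = (1, 0)^T \) (values chosen purely for illustration):

\[ \Delta w_k = 0.1 \big( (1, 0)^T - (0.5, 0.5)^T \big) = (0.05, -0.05)^T \]

so the winner moves a fraction \( \alpha \) of the way towards the input.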

The term competitive learning is used for this process because we're trying to maximise the [see page 23, variability] among neurons. Each neuron competes for the right to allocate a data-point to its own cluster over any of the other neurons' clusters, and only the winning neuron has its weights updated.

Note: in this formulation we assume \( w_k^T x^{\mu} \geq w_i^T x^{\mu} \) is equivalent to \( \|w_k - x^{\mu}\| \leq \|w_i - x^{\mu}\| \;\forall i \), which is true when the [see page 16, weights are normalised]; otherwise the neuron with the largest weight would win regardless of similarity.
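
To see why, expand the squared Euclidean distance:

\[ \|w_i - x^{\mu}\|^2 = \|w_i\|^2 - 2\, w_i^T x^{\mu} + \|x^{\mu}\|^2 \]

Since \( \|x^{\mu}\|^2 \) is the same for every neuron, when all the weight norms \( \|w_i\| \) are equal (normalised weights) minimising the distance is the same as maximising the dot product \( w_i^T x^{\mu} \).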

Output Exclusivity

For our neural-network based clustering model we want each input to cause exactly one output neuron to fire: the winning output neuron needs to be the only firing neuron. This can be [see page 22, done] by having inhibitory connections from each output neuron to every other output neuron and an excitatory connection from each output neuron to itself. When a neuron fires it increases its own output and inhibits all of its neighbours' outputs. This is known as a winner-take-all circuit.
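
Below is a minimal numeric sketch of such a circuit; the function name, self-excitation and inhibition strengths are my own illustrative assumptions, not from the note. Each unit repeatedly excites itself and inhibits the others until only the strongest survives:

```python
import numpy as np

def winner_take_all(activations, self_excitation=1.2, inhibition=0.4, steps=50):
    """Sketch of a winner-take-all circuit via lateral inhibition.

    Parameter values are illustrative assumptions, not from the note.
    """
    y = np.array(activations, dtype=float)
    for _ in range(steps):
        total = y.sum()
        # Each unit excites itself and is inhibited by the summed
        # output of every other unit.
        y = self_excitation * y - inhibition * (total - y)
        # Rectify and cap so the surviving unit saturates at 1.
        y = np.clip(y, 0.0, 1.0)
    return y

print(winner_take_all([0.5, 0.9, 0.7]))  # -> [0. 1. 0.]: only the largest fires
```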

Implementation

  1. Generate a set of output neurons with random weights.
  2. Choose a random input pattern and calculate the activation of each output neuron.
  3. Detect the winning neuron and update only its weights.
  4. Repeat from step 2 until the network converges (or a maximum number of iterations is reached). A sketch of this loop follows below.
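
A minimal numpy sketch of the whole loop, assuming unit-normalised weights as discussed above (the function name, default values, and re-normalisation step are my own choices, not from the note):

```python
import numpy as np

def competitive_learning(X, n_clusters, lr=0.1, epochs=100, seed=0):
    """Sketch of the competitive-learning loop described above."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)

    # Step 1: output neurons with random, unit-norm weight vectors.
    W = rng.normal(size=(n_clusters, X.shape[1]))
    W /= np.linalg.norm(W, axis=1, keepdims=True)

    for _ in range(epochs):
        for mu in rng.permutation(len(X)):   # step 2: random input pattern
            activations = W @ X[mu]          # activation of each output neuron
            k = np.argmax(activations)       # step 3: only the winner...
            W[k] += lr * (X[mu] - W[k])      # ...moves towards the input
            W[k] /= np.linalg.norm(W[k])     # keep the normalisation assumption
    return W
```

Dropping the two normalisation lines and picking the winner with the smallest \( \|w_i - x^{\mu}\| \) instead gives the distance-based variant, which connects directly to the K-means comparison below.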

Difference From K-Means

Competitive learning and K-means clustering are equivalent in their general form. The difference lies in how they approach solving the clustering decision problem.

  • K-means approaches it from a geometric point of view (x is closest to cluster prototype y, so assign x to y),
  • Competitive learning does so by choosing the output neuron with the largest output potential.

Competitive learning is clearly a biologically plausible learning algorithm, whereas K-means is not.

It's also worth noting that with competitive learning we're changing the weights of the network, whereas with K-means we're changing the cluster prototypes; the weight vector of each output neuron plays the role of a prototype.
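
This correspondence can be made concrete: an online (one-sample-at-a-time) K-means step moves the nearest prototype towards the sample with exactly the same update as step 2 of the formulation. A minimal sketch, with an illustrative function name not from the note:

```python
import numpy as np

def online_kmeans_step(prototypes, x, lr=0.1):
    """One online K-means step: geometric winner, same update rule.

    Only the winner selection differs from the competitive_learning
    sketch above: argmin of distance instead of argmax of dot product.
    """
    k = np.argmin(np.linalg.norm(prototypes - x, axis=1))
    prototypes[k] += lr * (x - prototypes[k])
    return prototypes
```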

Potential Faults

Looking at the [see page 19, output] units of our clustering network, we find that some are:

| Unit          | Description                                                                  |
|---------------|------------------------------------------------------------------------------|
| Similar Units | Represents two or more similar classes combined (example: a 5 and an 8) [1]  |
| Dead Units    | Don't represent any classes, instead resembling white-noise [2]              |

[1]: The similar units problem is [see page 20, due] to an initial weight vector being sufficiently close to two different classes of vectors. Every time it updates to move towards one cluster, a sample from the other cluster pushes it back in the other direction, so by the end of training the unit detects a mix of both classes.

[2]: The dead-units problem arises when an initial weight vector is too far from every cluster. It is never the winning unit for any input and thus never receives any meaningful weight update.
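
A quick way to spot dead units is to count how often each output neuron wins over the training set (an illustrative check of my own, assuming the W produced by the sketch above):

```python
import numpy as np

def find_dead_units(W, X):
    """Return indices of output neurons that never win a single input."""
    wins = np.zeros(len(W), dtype=int)
    for x in np.asarray(X, dtype=float):
        wins[np.argmax(W @ x)] += 1
    return np.flatnonzero(wins == 0)
```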
