Brain Dump

Adaptive Modelling

Tags
text-processing

An approach to text compression in which the model is built and applied to compression in a single pass over the text.

Method

To Encode:

  1. Begin with a base probability distribution.
  2. Refine the model as symbols are encountered (the text itself redefines the model).

To Decode:

  1. Decoder starts with the same base probability distribution.
  2. Decoder decodes the same symbol sequence, so it can refine the model in exactly the same way and stay in sync with the encoder.
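The encode/decode symmetry above can be sketched as follows. This is a minimal, hypothetical model (symbol counts only, names like `AdaptiveModel` are my own); a real compressor would feed these probabilities into an arithmetic or Huffman coder.

```python
from collections import Counter

class AdaptiveModel:
    """Symbol-frequency model that is refined as symbols are seen."""

    def __init__(self, alphabet):
        # Base distribution: every symbol starts with count 1,
        # so nothing begins at zero probability.
        self.counts = Counter({s: 1 for s in alphabet})
        self.total = len(alphabet)

    def probability(self, symbol):
        return self.counts[symbol] / self.total

    def update(self, symbol):
        # Refine the model after the symbol has been coded.
        self.counts[symbol] += 1
        self.total += 1

# Encoder and decoder start from the same base distribution and
# apply the same updates, so their models never diverge.
encoder = AdaptiveModel("ab")
decoder = AdaptiveModel("ab")
for symbol in "abba":
    p = encoder.probability(symbol)  # encoder codes symbol with p
    encoder.update(symbol)
    decoder.update(symbol)           # decoder sees the same symbol
assert encoder.counts == decoder.counts
```

The key point is that the update happens only after each symbol is coded, so the decoder always has the same information the encoder had when it coded that symbol.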

See [see page 91, example] and practical [see page 87, example].

Issues

  • Must avoid assigning a character zero probability just because it has not been encountered yet (the zero-frequency problem).
  • Cannot support random access into a compressed file, because the model depends on previously decoded data to decode later data.