Brain Dump

Adaptive Modelling

Tags: text-processing

An approach to building a model for text compression in which the model is built and applied to compression in the same pass.

Method

To Encode:

Encoder begins with a base probability distribution.
Each symbol is coded with the current model, then the model is refined with that symbol (the text redefines the model).
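
The encoding steps above can be sketched as follows. This is only an illustration, assuming an order-0 (context-free) frequency model and a uniform base distribution; the function name adaptive_code_length and the choice of starting every count at 1 are assumptions, not something these notes specify. Rather than emitting bits, it sums the ideal cost of each symbol under the model as it stood when that symbol was coded.

```python
import math
from collections import Counter


def adaptive_code_length(text, alphabet):
    """Ideal size in bits of `text` under an adaptive order-0 model.

    Assumes every symbol of `text` appears in `alphabet`.
    """
    # Base distribution (assumed): every symbol starts with a count of 1.
    counts = Counter({sym: 1 for sym in alphabet})
    total = len(alphabet)
    bits = 0.0
    for sym in text:
        p = counts[sym] / total   # probability under the model built so far
        bits += -math.log2(p)     # cost an ideal coder would pay for sym
        counts[sym] += 1          # refine the model with the symbol just coded
        total += 1
    return bits


# Repeated symbols get cheaper as the model adapts to them.
print(adaptive_code_length("aaaa", "ab"))
print(adaptive_code_length("abab", "ab"))
```

The model is updated only after the symbol has been costed, which is what lets a decoder follow along with the same information.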

To Decode:

Decoder starts with the same base probability distribution.
Because the decoder recovers the same symbol sequence, it can refine its model in exactly the same way, so the model never has to be transmitted.
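
A minimal sketch of the encoder and decoder staying in step, assuming a toy arithmetic coder built on exact fractions. The names ALPHABET, encode and decode are hypothetical, the message length is passed to the decoder separately, and a practical coder would emit bits incrementally with fixed-precision arithmetic instead of returning one exact fraction.

```python
from fractions import Fraction

ALPHABET = "abc"  # hypothetical alphabet shared by encoder and decoder


def _intervals(counts):
    """Map each symbol to (start, size) of its slice of [0, 1)."""
    total = sum(counts.values())
    cum = 0
    out = {}
    for sym in sorted(counts):
        out[sym] = (Fraction(cum, total), Fraction(counts[sym], total))
        cum += counts[sym]
    return out


def encode(text):
    counts = {sym: 1 for sym in ALPHABET}   # shared base distribution
    low, width = Fraction(0), Fraction(1)
    for sym in text:
        start, size = _intervals(counts)[sym]
        low += width * start                # narrow the interval to sym's slice
        width *= size
        counts[sym] += 1                    # refine the model after coding
    return low, len(text)                   # any number in the final interval works


def decode(code, length):
    counts = {sym: 1 for sym in ALPHABET}   # same base distribution as the encoder
    low, width = Fraction(0), Fraction(1)
    out = []
    for _ in range(length):
        target = (code - low) / width       # position inside the current interval
        for sym, (start, size) in _intervals(counts).items():
            if start <= target < start + size:
                break
        out.append(sym)
        low += width * start
        width *= size
        counts[sym] += 1                    # identical refinement to the encoder
    return "".join(out)


code, length = encode("abacabcc")
print(decode(code, length))
```

Both functions apply the same count update at the same point, so their models are identical at every step even though only the code value and the length are shared.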

See [see page 91, example] and practical [see page 87, example].

Issues

Must avoid assigning a character zero probability just because it has not been encountered yet (the zero-frequency problem); a symbol with zero probability cannot be coded.
Cannot support random access to a compressed file, because decoding later data depends on the model built from previously decoded data.
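
The zero-probability issue above can be illustrated with a small sketch. It assumes add-one (Laplace) smoothing as the fix, which is one common choice and not necessarily the one the referenced examples use; the helper names probability and code_length and the sample counts are hypothetical.

```python
import math


def probability(sym, counts, alphabet_size, smoothing=1):
    """Estimate P(sym); smoothing=0 gives the raw relative frequency."""
    total = sum(counts.values()) + smoothing * alphabet_size
    return (counts.get(sym, 0) + smoothing) / total


def code_length(p):
    """Ideal code length in bits; infinite when the probability is zero."""
    return math.inf if p == 0 else -math.log2(p)


counts = {"a": 3, "b": 1}   # hypothetical counts so far; "c" has not been seen

raw = probability("c", counts, alphabet_size=3, smoothing=0)
smoothed = probability("c", counts, alphabet_size=3, smoothing=1)

print(raw, code_length(raw))            # unseen symbol cannot be coded
print(smoothed, code_length(smoothed))  # small but non-zero probability
```

Reserving a little probability for unseen symbols costs slightly more bits for the symbols already seen, but guarantees every symbol in the alphabet remains codable.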