Brain Dump

Block Processing

Tags
speech-processing

Block processing is a form of processing where we take overlapping blocks of a signal at different offsets. Each block can share some samples with the previous block and some with the next block.

The blocks are overlapping frames, where each frame is partially included in the previous and the following frame. You can calculate the [see page 2, overlap] from the frame size and frame shift defined below. See [see page 3, here] for some more calculations.

We define:

| Term | Symbol | Meaning |
|---|---|---|
| Frame Size | \(N\) | Number of samples per block |
| Frame Shift | \(R\) | The offset (number of samples) from the start of one block to the start of the next |

Note: Frame size is often expressed as a duration, e.g. \(NT\) seconds (for sample period \(T\)).

Note: Frame shift is often expressed as a frame rate, e.g. \(1/(RT)\) frames per second.
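A small sketch of these quantities, taking \(R\) as the distance between the starts of consecutive blocks, so that neighbouring blocks share \(N - R\) samples. The function names and the sample rate `fs` (with \(T = 1/fs\)) are assumptions made for illustration, as is the choice to drop an incomplete trailing frame.

```python
def frame_stats(N, R, fs):
    """Return (overlap in samples, frame duration in s, frame rate in frames/s).

    fs is an assumed sample rate in Hz, so the sample period is T = 1/fs.
    """
    overlap = N - R          # samples shared by two neighbouring frames
    duration = N / fs        # N*T seconds
    frame_rate = fs / R      # 1/(R*T) frames per second
    return overlap, duration, frame_rate


def split_frames(signal, N, R):
    """Cut `signal` into frames of N samples whose starts are R samples apart.

    An incomplete frame at the end of the signal is dropped (an assumption;
    zero-padding the tail is another common choice).
    """
    return [signal[s:s + N] for s in range(0, len(signal) - N + 1, R)]


# Hypothetical values: 25 ms frames every 10 ms at 16 kHz.
overlap, duration, frame_rate = frame_stats(N=400, R=160, fs=16000)
frames = split_frames(list(range(10)), N=4, R=2)
```

With `N=4` and `R=2`, each frame repeats the last two samples of the frame before it.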

Calculating the Blocks (OLA)

Assume we have a sequence X which we pass through a convolution sum with a filter (impulse response) H to get a sequence Y. We can then express Y as a series of blocks.

Firstly, notice that Y is a sum of convolutions. Because convolution is linear, we can [see page 10, break] this down into the sum of the convolutions of the individual blocks of X with H.

If we place each convolved block at its own offset and sum the values at each sample position, we get our original signal Y back. This process is called the Overlap Add Method (OLA).
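The idea can be checked with a minimal sketch in plain Python. It assumes the input is cut into consecutive blocks of `block_len` samples; the convolved blocks are longer than the input blocks, so they overlap at the output, where they are added. The names `convolve`, `overlap_add` and `block_len` are illustrative, not taken from the referenced material.

```python
def convolve(x, h):
    """Direct convolution sum of two finite sequences."""
    y = [0.0] * max(len(x) + len(h) - 1, 0)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def overlap_add(x, h, block_len):
    """Convolve x with h block by block and add the overlapping outputs."""
    y = [0.0] * max(len(x) + len(h) - 1, 0)
    for start in range(0, len(x), block_len):
        block = x[start:start + block_len]
        y_block = convolve(block, h)      # len(block) + len(h) - 1 samples
        for k, value in enumerate(y_block):
            y[start + k] += value         # tails overlap the next block
    return y


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
h = [1.0, -1.0, 0.5]
y_direct = convolve(x, h)
y_blocks = overlap_add(x, h, block_len=3)
```

Both routes give the same Y, up to floating-point rounding, whatever block length is chosen.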

Weighted Overlap Add (WOLA)

Consider what happens when we [see page 12, convolve] the same block as in the previous example with two different impulse responses. The second is noticeably different to the first, causing the output to show a [see page 13, sudden and rapid] change in amplitude.

Warn: Sudden changes in amplitude are clearly perceivable and should be avoided.

We can [see page 12, weight] the input signal with a windowing function to smooth the output.
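A rough sketch of the smoothing effect, under several assumptions not stated above: a periodic Hann window, a frame shift of half the frame size (so the shifted windows sum to one), and a per-block gain standing in for a per-block impulse response (that is, a one-tap filter). All names are hypothetical.

```python
import math


def hann(N):
    """Periodic Hann window; copies shifted by N/2 sum to 1."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / N) for n in range(N)]


def weighted_overlap_add(x, N, R, gains):
    """Window each frame, scale frame r by gains[r], then overlap-add.

    gains[r] is a stand-in for the filter applied to block r.
    """
    w = hann(N)
    y = [0.0] * len(x)
    for r, start in enumerate(range(0, len(x) - N + 1, R)):
        for n in range(N):
            y[start + n] += gains[r] * w[n] * x[start + n]
    return y


x = [1.0] * 24                       # constant input makes the effect easy to see
same = weighted_overlap_add(x, N=8, R=4, gains=[1, 1, 1, 1, 1])
step = weighted_overlap_add(x, N=8, R=4, gains=[1, 1, 2, 2, 2])
```

When every block gets the same gain, the fully overlapped part of the output matches the input. When the gain jumps from 1 to 2 between blocks, the output ramps between the two levels over the overlap region instead of jumping in a single sample. The first and last half frames are covered by only one window, so they are attenuated.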
