Developers Notebook-RLE

From WxWiki

Jump to: navigation, search

Run Length Encoding (RLE)

Part of Developers_Notebook-RLE-Compression Methods

<CHARACTER, DLE, #OFREPEATS>

Compacts run length sequences - I.E.

"aaabb"

becomes

3a2b

Or in most implementations

|Character|DLE|#Of occurances|Character|DLE|#Of occurances| |a|0x90 |3|b|0x90|2|

The DLE is usually 0x90

...What does DLE mean anyway?:\?

Sometimes, if RLEing structured data, it is best to RLE each field; e.g. to compress a console window (where each character has both value and colour) works best to RLE the colour and RLE the value separately.

There are two styles of storing the encoding: chunked and escaped

  • Escaped means that a special sequence in the input indicates that a RLE block follows; this precludes the escape character being in the input otherwise, unless it is itself escaped!
  • Chunked means that each sequence is prefixed by length and type. There are generally two types: not compressed, and RLE compressed. Other compression schemes can be incorporated, however.
    • For encoding short sequences with these two schemes, you might use the high bit to flag whether the chunk is encoded or otherwise.
      • If RLE encoded, the low 7 bits might indicate the number of repeats, and the next byte might be the character to repeat. In this way, an encoded sequence takes 2 bytes of storage, and can represent a sequence of upto 127 bytes (slightly more if you start at 2 (the length of the chunk etc (SOMEONE work out the extra please)).
      • A non-encoded chunk might use the low 7 bits to say how long the unencoded sequence is, and then this is followed by those bytes.
Personal tools