LZW reads in bytes one at a time in a file. It then learns of patterns in the file and builds a dictionary (hash table) of the patterns. So for example if the word the is used a lot in a file at some point that word will get a around 9 or so bits that can be used to present those 3 bytes.
LZW sees if a pattern is in the dictionary, if it is, it keeps reading in more bytes until a new pattern is found. The inital dictionary contains all the extended ASCII characters.
So for example the string ABBABABAC
At each step, read in a char. | Test String | String left to read | Output String Built | Dictionary |
1 | NULL | ABBABABAC | NULL | |
2 | A | BBABABAC | NULL | A is already in the dictionary |
3 | AB | BABABAC | A | AB is added as 256 |
4 | BB | ABABAC | AB | BB is added to the dictionary as 257 |
5 | BA | BABAC | ABB | BA is added to the dictionary as 258 |
6 | AB | ABAC | ABB | AB is already in the dictioary, so you read another char. |
7 | ABA | BAC | ABB(256) | Add ABA to the dictionary as 259. Make use of 256 for your output of AB. |
8 | AB | AC | ABB(256) | AB is aleady in the dictionary. Read more. |
9 | ABA | C | ABB(256) | AB is aleady in the dictionary. Read more. |
10 | ABAC | NULL | ABB(256)(259)C | Add in to the dictionary ABAC as 260. |
Final Dictionary: 0-255 extended ASCII
256 AB
257 BB
258 BA
259 ABA
260 ABAC