问题 4514 --Lempel–Ziv–Welch

4514: Lempel–Ziv–Welch

时间限制: 1 Sec  内存限制: 128 MB
提交: 0  解决: 0
[提交][状态][讨论版][命题人:]

题目描述

Lempel–Ziv–Welch (LZW) is a universal lossless data compression algorithm created by
Abraham Lempel, Jacob Ziv, and Terry Welch. It was published by Welch in 1984 as an
improved implementation of the LZ78 algorithm published by Lempel and Ziv in 1978. The
algorithm is simple to implement, and has the potential for very high throughput in hardware
implementations.
A high level view of the encoding algorithm is shown here:
1.Initialize the dictionary to contain all strings of length one.
2.Find the longest string W in the dictionary that matches the current input.
3.Emit the dictionary index for W to output and remove W from the input.
4.Add W followed by the next symbol in the input to the dictionary.
5.Go to Step 2.
A dictionary is initialized to contain the single-character strings corresponding to all the
possible input characters (and nothing else). The algorithm works by scanning through the
input string for successively longer substrings until it finds one that is not in the dictionary.
When such a string is found, the index for the string without the last character (i.e., the longest
substring that is in the dictionary) is retrieved from the dictionary and sent to output, and the
new string (including the last character) is added to the dictionary with the next available code.
The last input character is then used as the next starting point to scan for substrings.
In this way, successively longer strings are registered in the dictionary and made available for
subsequent encoding as single output values. The algorithm works best on data with repeated
patterns, so the initial parts of a message will see little compression. As the message grows,
however, the compression ratio tends asymptotically to the maximum.
In this problem we assume that all the impossible character is one of 'a' - 'z', the 26 English
letters in lower case. The dictionary index starts from 1.

输入

Multiple cases. There are one line for each case containing the encoded stream.

输出

One line for each case, the decoded string.

样例输入

1 2 3 4 5 6 7
1 2 27 27
1 2 3 27 29 28

样例输出

abcdefg
ababab
abcabcabc

来源

[提交][状态]