Grammar-based code
Lossless data compression algorithm From Wikipedia, the free encyclopedia
Grammar-based codes, or grammar-based compression, are compression algorithms based on the idea of constructing a context-free grammar (CFG) for the string to be compressed. Examples include universal lossless data compression algorithms.[1] To compress a data sequence, a grammar-based code transforms it into a context-free grammar that generates exactly that sequence. The problem of finding a smallest grammar for an input sequence (the smallest grammar problem) is known to be NP-hard,[2] so many grammar-transform algorithms have been proposed from both theoretical and practical viewpoints. Generally, the produced grammar is further compressed by a statistical encoder such as an arithmetic coder.
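As an illustration of a grammar standing in for a string, the following minimal sketch expands a small hand-written grammar back into its text. The grammar, the rule names `S`, `A`, `B`, and the helper `expand` are hypothetical and chosen only for this example; they do not come from any particular grammar-based code.

```python
def expand(grammar, symbol):
    """Return the terminal string derived from `symbol`.

    `grammar` maps each nonterminal to the list of symbols on its
    right-hand side. Any symbol without a rule is treated as a terminal.
    """
    if symbol not in grammar:
        return symbol
    return "".join(expand(grammar, s) for s in grammar[symbol])


# Hypothetical grammar whose only derivation from S is "abababab".
# Uppercase letters are nonterminals here, so the text itself must not
# contain "S", "A" or "B" (an assumption of this toy encoding).
grammar = {
    "S": ["B", "B"],
    "B": ["A", "A"],
    "A": ["a", "b"],
}

text = expand(grammar, "S")

# One common size measure: total number of right-hand-side symbols.
grammar_size = sum(len(rhs) for rhs in grammar.values())
```

Here the eight-character string is described by rules with six right-hand-side symbols in total, which is the kind of saving a grammar transform looks for before the grammar is passed to a statistical encoder.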

Examples and characteristics
The class of grammar-based codes is very broad. It includes block codes, the multilevel pattern matching (MPM) algorithm,[3] variations of the incremental parsing Lempel-Ziv code,[4] and many other universal lossless compression algorithms. Grammar-based codes are universal in the sense that they asymptotically achieve the entropy rate of any stationary, ergodic source with a finite alphabet.
Practical algorithms
Implementations of the following algorithms are available from the external links.
- Sequitur[5] is a classical grammar compression algorithm that sequentially translates an input text into a CFG, and then the produced CFG is encoded by an arithmetic coder.
- Re-Pair[6] is a greedy algorithm that repeatedly substitutes the most frequent pair of adjacent symbols first. It compresses well, but its main-memory requirement is very large.
- GLZA[7] constructs a grammar that may be reducible, i.e., may contain repeats, when the entropy-coding cost of "spelling out" the repeats is less than the cost of creating and entropy-coding a rule to capture them. (In general, the compression-optimal straight-line grammar (SLG) is not irreducible, and the smallest grammar problem is different from the actual SLG compression problem.)
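The most-frequent-first substitution strategy described for Re-Pair can be sketched as follows. This is a naive illustration under stated assumptions, not the published algorithm: it recounts all pairs on every pass (so it is far slower than a careful implementation), it counts overlapping pairs, and the function names `repair_sketch` and `expand_symbol` as well as the `("R", n)` rule naming are hypothetical.

```python
from collections import Counter


def repair_sketch(text):
    """Greedy pair replacement: returns (final sequence, rules).

    Each rule maps a new nonterminal to the pair of symbols it replaces.
    Nonterminals are tuples like ("R", 0) so they can never collide with
    the one-character string terminals.
    """
    seq = list(text)
    rules = {}
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:  # no pair repeats, nothing left to gain
            break
        name = ("R", len(rules))
        rules[name] = pair
        # Replace non-overlapping occurrences of the pair, left to right.
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(name)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules


def expand_symbol(symbol, rules):
    """Recursively expand a symbol back into the text it stands for."""
    if symbol not in rules:
        return symbol
    left, right = rules[symbol]
    return expand_symbol(left, rules) + expand_symbol(right, rules)


sample = "abracadabra abracadabra"
seq, rules = repair_sketch(sample)
restored = "".join(expand_symbol(s, rules) for s in seq)
```

The final sequence together with the rules forms a grammar for the input; `restored` rebuilds the original text from it. In a full compressor, that grammar would then be handed to an entropy coder, as the list above describes.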
See also
References
External links