We consider the problem of finding the smallest context-free grammar that generates exactly one given string of length n. The size of this grammar is of theoretical interest as an efficiently computable variant of Kolmogorov complexity. The problem is of practical importance in areas such as data compression and pattern extraction. The smallest grammar is known to be hard to approximate to within a constant factor, and an o(log n/log log n) approximation would require progress on a long-standing algebraic problem . Previously, the best proved approximation ratio was O(n1/2) for the BISECTION algorithm . Our main result is an exponential improvement of this ratio; we give an O(log(n/g*)) approximation algorithm, where g* is the size of the smallest grammar. We then consider other computable variants of Kolomogorov complexity. In particular we give an O(log2 n) approximation for the smallest non-deterministic finite automaton with advice that produces a given string. We also apply our techniques to "advice-grammars" and "edit-grammars", two other natural models of string complexity.
|Original language||English (US)|
|Number of pages||10|
|Journal||Conference Proceedings of the Annual ACM Symposium on Theory of Computing|
|State||Published - 2002|
|Event||Proceedings of the 34th Annual ACM Symposium on Theory of Computing - Montreal, Que., Canada|
Duration: May 19 2002 → May 21 2002
ASJC Scopus subject areas