Approximating the smallest grammar: Kolmogorov complexity in natural models

Moses Charikar, Eric Lehman, Ding Liu, Rina Panigrahy, Manoj Prabhakaran, April Rasala, Amit Sahai, Abhi Shelat

Research output: Contribution to journal › Conference article › peer-review


We consider the problem of finding the smallest context-free grammar that generates exactly one given string of length n. The size of this grammar is of theoretical interest as an efficiently computable variant of Kolmogorov complexity. The problem is of practical importance in areas such as data compression and pattern extraction. The smallest grammar is known to be hard to approximate to within a constant factor, and an o(log n/log log n) approximation would require progress on a long-standing algebraic problem [10]. Previously, the best proved approximation ratio was O(n^{1/2}) for the BISECTION algorithm [8]. Our main result is an exponential improvement of this ratio; we give an O(log(n/g*)) approximation algorithm, where g* is the size of the smallest grammar. We then consider other computable variants of Kolmogorov complexity. In particular, we give an O(log² n) approximation for the smallest non-deterministic finite automaton with advice that produces a given string. We also apply our techniques to "advice-grammars" and "edit-grammars", two other natural models of string complexity.
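To make the object of study concrete: a grammar that generates exactly one string is a straight-line grammar, in which every nonterminal has a single production, and its size is the total length of all right-hand sides. The following sketch is illustrative only (it is not from the paper, and the grammar and function names are hypothetical); it shows how such a grammar can be smaller than the string it derives.

```python
# Illustrative sketch (not from the paper): a straight-line grammar maps each
# nonterminal to exactly one right-hand side, so it derives a single string.

def expand(grammar, symbol):
    """Recursively expand a symbol into the unique string it derives."""
    if symbol not in grammar:          # terminal symbol
        return symbol
    return "".join(expand(grammar, s) for s in grammar[symbol])

def grammar_size(grammar):
    """Grammar size = total number of symbols on all right-hand sides."""
    return sum(len(rhs) for rhs in grammar.values())

# Hypothetical grammar for "abababab":  S -> AA,  A -> BB,  B -> ab
g = {"S": ["A", "A"], "A": ["B", "B"], "B": ["a", "b"]}

assert expand(g, "S") == "abababab"
print(grammar_size(g))  # 6 symbols, versus the 8 characters of the raw string
```

The gap between grammar size and string length grows with repetitive structure, which is why the smallest grammar serves as a computable stand-in for Kolmogorov complexity.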

Original language: English (US)
Pages (from-to): 792-801
Number of pages: 10
Journal: Conference Proceedings of the Annual ACM Symposium on Theory of Computing
State: Published - 2002
Externally published: Yes
Event: Proceedings of the 34th Annual ACM Symposium on Theory of Computing - Montreal, Que., Canada
Duration: May 19 2002 – May 21 2002

ASJC Scopus subject areas

  • Software

