### Abstract

We consider the problem of finding the smallest context-free grammar that generates exactly one given string of length n. The size of this grammar is of theoretical interest as an efficiently computable variant of Kolmogorov complexity. The problem is of practical importance in areas such as data compression and pattern extraction. The smallest grammar is known to be hard to approximate to within a constant factor, and an o(log n/log log n) approximation would require progress on a long-standing algebraic problem [10]. Previously, the best proved approximation ratio was O(n^{1/2}) for the BISECTION algorithm [8]. Our main result is an exponential improvement of this ratio: we give an O(log(n/g*)) approximation algorithm, where g* is the size of the smallest grammar. We then consider other computable variants of Kolmogorov complexity. In particular, we give an O(log^{2} n) approximation for the smallest non-deterministic finite automaton with advice that produces a given string. We also apply our techniques to "advice-grammars" and "edit-grammars", two other natural models of string complexity.
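To make the object being approximated concrete, the following is a minimal Python sketch of a grammar that derives exactly one string, built by recursive splitting in the spirit of a bisection-style scheme. It is an illustration under stated assumptions, not the algorithm of the paper or a faithful reproduction of BISECTION from [8]: the function names, the `N0`, `N1`, ... nonterminal naming, and the choice to measure size as the total number of right-hand-side symbols are all assumptions made for this example.

```python
def bisection_grammar(s):
    """Return (start_symbol, rules) for a grammar that derives exactly s.

    Hypothetical sketch: each distinct substring reached by the recursive
    split gets one nonterminal, so repeated substrings share a rule.
    """
    if not s:
        raise ValueError("input string must be non-empty")
    rules = {}   # nonterminal -> tuple of right-hand-side symbols
    names = {}   # substring -> nonterminal already built for it

    def build(sub):
        if len(sub) == 1:
            return sub                # a terminal stands for itself
        if sub in names:
            return names[sub]         # reuse the rule for a repeated substring
        split = 1
        while split * 2 < len(sub):
            split *= 2                # largest power of two strictly below len(sub)
        left = build(sub[:split])
        right = build(sub[split:])
        name = f"N{len(rules)}"       # assumed naming scheme for nonterminals
        rules[name] = (left, right)
        names[sub] = name
        return name

    return build(s), rules


def expand(symbol, rules):
    """Expand a symbol back into the string it derives."""
    if symbol not in rules:
        return symbol                 # terminal
    return "".join(expand(part, rules) for part in rules[symbol])


def grammar_size(rules):
    """Size measured as the total number of right-hand-side symbols (an assumption)."""
    return sum(len(rhs) for rhs in rules.values())


start, rules = bisection_grammar("abababab")
print(rules)                # three rules: ab, abab, abababab
print(grammar_size(rules))  # 6
```

For the string `abababab` the sketch produces three rules with six right-hand-side symbols in total, compared with eight symbols for the string written out directly. The approximation ratios quoted in the abstract compare the size of a grammar found this way against g*, the size of the smallest possible grammar.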

Original language | English (US) |
---|---|
Pages (from-to) | 792-801 |
Number of pages | 10 |
Journal | Conference Proceedings of the Annual ACM Symposium on Theory of Computing |
State | Published - Sep 23 2002 |
Event | Proceedings of the 34th Annual ACM Symposium on Theory of Computing - Montreal, Que., Canada; duration: May 19 2002 → May 21 2002 |

### ASJC Scopus subject areas

- Software

## Cite this

"Approximating the smallest grammar: Kolmogorov complexity in natural models". *Conference Proceedings of the Annual ACM Symposium on Theory of Computing*, 792-801.