TY - JOUR

T1 - Joint universal lossy coding and identification of stationary mixing sources with general alphabets

AU - Raginsky, Maxim

N1 - Funding Information:
Manuscript received December 19, 2006; revised January 08, 2009. Current version published April 22, 2009. This work was supported by the Beckman Institute Fellowship. The material in this paper was presented in part at the IEEE International Symposium on Information Theory (ISIT), Nice, France, June 2007.

PY - 2009

Y1 - 2009

N2 - In this paper, we consider the problem of joint universal variable-rate lossy coding and identification for parametric classes of stationary β-mixing sources with general (Polish) alphabets. Compression performance is measured in terms of Lagrangians, while identification performance is measured by the variational distance between the true source and the estimated source. Provided that the sources are mixing at a sufficiently fast rate and satisfy certain smoothness and Vapnik-Chervonenkis (VC) learnability conditions, it is shown that, for bounded metric distortions, there exist universal schemes for joint lossy compression and identification whose Lagrangian redundancies converge to zero as √(V_n log n / n) as the block length n tends to infinity, where V_n is the VC dimension of a certain class of decision regions defined by the n-dimensional marginal distributions of the sources; furthermore, for each n, the decoder can identify the n-dimensional marginal of the active source up to a ball of radius O(√(V_n log n / n)) in variational distance, eventually with probability one. The results are supplemented by several examples of parametric sources satisfying the regularity conditions.

AB - In this paper, we consider the problem of joint universal variable-rate lossy coding and identification for parametric classes of stationary β-mixing sources with general (Polish) alphabets. Compression performance is measured in terms of Lagrangians, while identification performance is measured by the variational distance between the true source and the estimated source. Provided that the sources are mixing at a sufficiently fast rate and satisfy certain smoothness and Vapnik-Chervonenkis (VC) learnability conditions, it is shown that, for bounded metric distortions, there exist universal schemes for joint lossy compression and identification whose Lagrangian redundancies converge to zero as √(V_n log n / n) as the block length n tends to infinity, where V_n is the VC dimension of a certain class of decision regions defined by the n-dimensional marginal distributions of the sources; furthermore, for each n, the decoder can identify the n-dimensional marginal of the active source up to a ball of radius O(√(V_n log n / n)) in variational distance, eventually with probability one. The results are supplemented by several examples of parametric sources satisfying the regularity conditions.

KW - Learning

KW - Minimum-distance density estimation

KW - Two-stage codes

KW - Universal vector quantization

KW - Vapnik-Chervonenkis (VC) dimension

UR - http://www.scopus.com/inward/record.url?scp=65749119760&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=65749119760&partnerID=8YFLogxK

U2 - 10.1109/TIT.2009.2015987

DO - 10.1109/TIT.2009.2015987

M3 - Article

AN - SCOPUS:65749119760

VL - 55

SP - 1945

EP - 1960

JO - IEEE Transactions on Information Theory

JF - IEEE Transactions on Information Theory

SN - 0018-9448

IS - 5

ER -