TY - JOUR
T1 - Joint fixed-rate universal lossy coding and identification of continuous-alphabet memoryless sources
AU - Raginsky, Maxim
N1 - Funding Information:
Manuscript received December 3, 2005; revised May 17, 2007. This work was supported by the Beckman Institute Fellowship. The material in this paper was presented in part at the IEEE International Symposium on Information Theory, Seattle, WA, July 2006.
PY - 2008/7
Y1 - 2008/7
N2 - The problem of joint universal source coding and density estimation is considered in the setting of fixed-rate lossy coding of continuous-alphabet memoryless sources. For a wide class of bounded distortion measures, it is shown that any compactly parametrized family of ℝ^d-valued independent and identically distributed (i.i.d.) sources with absolutely continuous distributions satisfying appropriate smoothness and Vapnik-Chervonenkis (VC) learnability conditions admits a joint scheme for universal lossy block coding and parameter estimation, such that, as the block length n tends to infinity, the overhead per-letter rate and the distortion redundancy converge to zero as O(n^{-1} log n) and O(√(n^{-1} log n)), respectively. Moreover, the active source can be determined at the decoder up to a ball of radius O(√(n^{-1} log n)) in variational distance, asymptotically almost surely. The system has finite memory length equal to the block length, and can be thought of as a blockwise application of a time-invariant nonlinear filter with initial conditions determined from the previous block. Comparisons are presented with several existing schemes for universal vector quantization, which do not include parameter estimation explicitly, and an extension to unbounded distortion measures is outlined. Finally, finite mixture classes and exponential families are given as explicit examples of parametric sources admitting joint universal compression and modeling schemes of the kind studied here.
AB - The problem of joint universal source coding and density estimation is considered in the setting of fixed-rate lossy coding of continuous-alphabet memoryless sources. For a wide class of bounded distortion measures, it is shown that any compactly parametrized family of ℝ^d-valued independent and identically distributed (i.i.d.) sources with absolutely continuous distributions satisfying appropriate smoothness and Vapnik-Chervonenkis (VC) learnability conditions admits a joint scheme for universal lossy block coding and parameter estimation, such that, as the block length n tends to infinity, the overhead per-letter rate and the distortion redundancy converge to zero as O(n^{-1} log n) and O(√(n^{-1} log n)), respectively. Moreover, the active source can be determined at the decoder up to a ball of radius O(√(n^{-1} log n)) in variational distance, asymptotically almost surely. The system has finite memory length equal to the block length, and can be thought of as a blockwise application of a time-invariant nonlinear filter with initial conditions determined from the previous block. Comparisons are presented with several existing schemes for universal vector quantization, which do not include parameter estimation explicitly, and an extension to unbounded distortion measures is outlined. Finally, finite mixture classes and exponential families are given as explicit examples of parametric sources admitting joint universal compression and modeling schemes of the kind studied here.
KW - Learning
KW - Minimum-distance density estimation
KW - Two-stage codes
KW - Universal vector quantization
KW - Vapnik-Chervonenkis (VC) dimension
UR - http://www.scopus.com/inward/record.url?scp=46849122718&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=46849122718&partnerID=8YFLogxK
U2 - 10.1109/TIT.2008.924669
DO - 10.1109/TIT.2008.924669
M3 - Article
AN - SCOPUS:46849122718
SN - 0018-9448
VL - 54
SP - 3059
EP - 3077
JO - IEEE Transactions on Information Theory
JF - IEEE Transactions on Information Theory
IS - 7
ER -