TY - GEN
T1 - Fast nonparametric conditional density estimation
AU - Holmes, Michael P.
AU - Gray, Alexander G.
AU - Isbell, Charles Lee
PY - 2007
Y1 - 2007
N2 - Conditional density estimation generalizes regression by modeling a full density f(y|x) rather than only the expected value E(y|x). This is important for many tasks, including handling multi-modality and generating prediction intervals. Though fundamental and widely applicable, nonparametric conditional density estimators have received relatively little attention from statisticians and little or none from the machine learning community. None of that work has been applied to greater than bivariate data, presumably due to the computational difficulty of data-driven bandwidth selection. We describe the double kernel conditional density estimator and derive fast dual-tree-based algorithms for bandwidth selection using a maximum likelihood criterion. These techniques give speedups of up to 3.8 million in our experiments, and enable the first applications to previously intractable large multivariate datasets, including a redshift prediction problem from the Sloan Digital Sky Survey.
UR - https://www.scopus.com/pages/publications/57349197333
M3 - Conference contribution
AN - SCOPUS:57349197333
SN - 0974903930
SN - 9780974903934
T3 - Proceedings of the 23rd Conference on Uncertainty in Artificial Intelligence, UAI 2007
SP - 175
EP - 182
BT - Proceedings of the 23rd Conference on Uncertainty in Artificial Intelligence, UAI 2007
T2 - 23rd Conference on Uncertainty in Artificial Intelligence, UAI 2007
Y2 - 19 July 2007 through 22 July 2007
ER -