TY - CONF
T1 - Model-augmented conditional mutual information estimation for feature selection
AU - Yang, Alan
AU - Ghassami, Amir Emad
AU - Raginsky, Maxim
AU - Kiyavash, Negar
AU - Rosenbaum, Elyse
N1 - Funding Information:
This material is based on work supported by NSF CNS 16-24811 - CAEML and its industry members, NSF CCF 1704970, ONR grant W911NF-15-1-0479, and DARPA under the LwLL program.
Publisher Copyright:
© Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence, UAI 2020. All rights reserved.
PY - 2020
Y1 - 2020
N2 - Markov blanket feature selection, while theoretically optimal, is generally challenging to implement. This is due to the shortcomings of existing approaches to conditional independence (CI) testing, which tend to struggle either with the curse of dimensionality or computational complexity. We propose a novel two-step approach which facilitates Markov blanket feature selection in high dimensions. First, neural networks are used to map features to low-dimensional representations. In the second step, CI testing is performed by applying the k-NN conditional mutual information estimator to the learned feature maps. The mappings are designed to ensure that mapped samples both preserve information and share similar information about the target variable if and only if they are close in Euclidean distance. We show that these properties boost the performance of the k-NN estimator in the second step. The performance of the proposed method is evaluated on both synthetic and real data.
UR - http://www.scopus.com/inward/record.url?scp=85101645442&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85101645442&partnerID=8YFLogxK
M3 - Paper
AN - SCOPUS:85101645442
SP - 1139
EP - 1148
T2 - 36th Conference on Uncertainty in Artificial Intelligence, UAI 2020
Y2 - 3 August 2020 through 6 August 2020
ER -