TY - GEN
T1 - A multi-layer Naïve Bayes model for approximate identity matching
AU - Wang, G. Alan
AU - Chen, Hsinchun
AU - Atabakhsh, Homa
PY - 2006
Y1 - 2006
N2 - Identity management is critical to various governmental practices ranging from providing citizen services to enforcing homeland security. The task of searching for a specific identity is difficult because multiple identity representations may exist due to unintentional errors and intentional deception. We propose a Naïve Bayes identity matching model that improves existing techniques in terms of effectiveness. Experiments show that our proposed model performs significantly better than the exact-match-based technique and achieves higher precision than the record comparison technique. In addition, our model greatly reduces the effort of manually labeling training instances by employing a semi-supervised learning approach. This training method outperforms both fully supervised and unsupervised learning. With a training dataset that contains only 30% labeled instances, our model achieves performance comparable to that of fully supervised learning.
AB - Identity management is critical to various governmental practices ranging from providing citizen services to enforcing homeland security. The task of searching for a specific identity is difficult because multiple identity representations may exist due to unintentional errors and intentional deception. We propose a Naïve Bayes identity matching model that improves existing techniques in terms of effectiveness. Experiments show that our proposed model performs significantly better than the exact-match-based technique and achieves higher precision than the record comparison technique. In addition, our model greatly reduces the effort of manually labeling training instances by employing a semi-supervised learning approach. This training method outperforms both fully supervised and unsupervised learning. With a training dataset that contains only 30% labeled instances, our model achieves performance comparable to that of fully supervised learning.
UR - http://www.scopus.com/inward/record.url?scp=33745875307&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33745875307&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:33745875307
SN - 3540344780
SN - 9783540344780
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 479
EP - 484
BT - Intelligence and Security Informatics - IEEE International Conference on Intelligence and Security Informatics, ISI 2006, Proceedings
PB - Springer
T2 - IEEE International Conference on Intelligence and Security Informatics, ISI 2006
Y2 - 23 May 2006 through 24 May 2006
ER -