Abstract
Identity management is critical to governmental practices ranging from providing citizen services to enforcing homeland security. Searching for a specific identity is difficult because one identity may have multiple representations, arising from both unintentional errors and intentional deception. We propose a probabilistic Naïve Bayes model that is more effective than existing identity matching techniques. Experiments show that the proposed model performs significantly better than both the exact-match technique and the approximate-match record comparison algorithm. In addition, the model greatly reduces the effort of manually labeling training instances by employing a semi-supervised learning approach. This training method outperforms both fully supervised and unsupervised learning. With a training dataset in which only 10% of the instances are labeled, our model achieves performance comparable to that of fully supervised learning.
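The abstract does not specify the features or the training procedure, so the following is only a minimal sketch of how a semi-supervised Naïve Bayes matcher could be trained. It assumes each pair of records is reduced to a tuple of binary agreement features (for example, whether the names agree or the dates of birth agree), and it assumes expectation-maximization (EM) as the way to use unlabeled pairs. The function names `m_step`, `posterior_match`, and `fit_semi_supervised_nb`, the feature layout, and the toy data are all hypothetical and are not taken from the paper.

```python
import math


def m_step(xs, weights, n_feat, alpha=1.0):
    """Re-estimate the class prior and per-feature agreement probabilities.

    weights[i] is the (possibly soft) probability that pair i is a match.
    alpha is a Laplace smoothing constant that keeps every estimate in (0, 1).
    """
    total = [0.0, 0.0]                         # soft count of pairs per class
    agree = [[0.0] * n_feat, [0.0] * n_feat]   # soft count of agreeing features
    for x, w in zip(xs, weights):
        for c, wc in ((0, 1.0 - w), (1, w)):
            total[c] += wc
            for j, xj in enumerate(x):
                agree[c][j] += wc * xj
    n = len(xs)
    prior = [(total[c] + alpha) / (n + 2 * alpha) for c in (0, 1)]
    theta = [
        [(agree[c][j] + alpha) / (total[c] + 2 * alpha) for j in range(n_feat)]
        for c in (0, 1)
    ]
    return prior, theta


def posterior_match(model, x):
    """Return P(match | x) under the Naive Bayes independence assumption."""
    prior, theta = model
    log_p = []
    for c in (0, 1):
        lp = math.log(prior[c])
        for j, xj in enumerate(x):
            p = theta[c][j]
            lp += math.log(p if xj else 1.0 - p)
        log_p.append(lp)
    top = max(log_p)                           # subtract the max for stability
    e0, e1 = math.exp(log_p[0] - top), math.exp(log_p[1] - top)
    return e1 / (e0 + e1)


def fit_semi_supervised_nb(labeled, unlabeled, n_iter=20, alpha=1.0):
    """Fit on labeled pairs, then refine with EM over the unlabeled pairs.

    labeled:   list of (features, label), label 1 = same identity, 0 = different
    unlabeled: list of feature tuples
    """
    xs_lab = [x for x, _ in labeled]
    w_lab = [float(y) for _, y in labeled]     # labeled weights stay fixed
    n_feat = len(xs_lab[0])
    model = m_step(xs_lab, w_lab, n_feat, alpha)   # supervised initialization
    for _ in range(n_iter):
        # E-step: soft-label every unlabeled pair with the current model.
        w_unl = [posterior_match(model, x) for x in unlabeled]
        # M-step: re-estimate from labeled (hard) and unlabeled (soft) pairs.
        model = m_step(xs_lab + list(unlabeled), w_lab + w_unl, n_feat, alpha)
    return model


# Toy data: features are (name agrees, date of birth agrees, address agrees).
labeled = [((1, 1, 1), 1), ((1, 1, 0), 1), ((0, 0, 0), 0), ((0, 1, 0), 0)]
unlabeled = [(1, 1, 1), (1, 0, 1), (0, 0, 0), (0, 0, 1), (1, 1, 1), (0, 0, 0)]

model = fit_semi_supervised_nb(labeled, unlabeled)
print(posterior_match(model, (1, 1, 1)))   # probability that the pair matches
print(posterior_match(model, (0, 0, 0)))
```

In this sketch the labeled pairs keep their hard labels throughout, which anchors the meaning of the "match" class while the unlabeled pairs contribute only soft counts. A real system would likely use graded similarity scores rather than binary agreement, which would change the per-feature likelihood.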
Original language | English (US) |
---|---|
Pages | 462-463 |
Number of pages | 2 |
State | Published - 2006 |
Externally published | Yes |
Event | 7th Annual International Conference on Digital Government Research, Dg.o 2006 - San Diego, CA, United States (Duration: May 21, 2006 → May 24, 2006) |
Other
Other | 7th Annual International Conference on Digital Government Research, Dg.o 2006 |
---|---|
Country/Territory | United States |
City | San Diego, CA |
Period | 5/21/06 → 5/24/06 |
Keywords
- Identity matching
- Naïve Bayes model
- Semi-supervised learning
ASJC Scopus subject areas
- Software
- Human-Computer Interaction
- Computer Vision and Pattern Recognition
- Computer Networks and Communications