TY - GEN
T1 - Dynamic Truth Discovery on Numerical Data
AU - Zhi, Shi
AU - Yang, Fan
AU - Zhu, Zheyi
AU - Li, Qi
AU - Wang, Zhaoran
AU - Han, Jiawei
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2018/12/27
Y1 - 2018/12/27
N2 - Truth discovery aims at obtaining the most credible information from multiple sources that provide noisy and conflicting values. Due to the ubiquitous existence of data conflict in practice, truth discovery has been attracting a lot of research attention recently. Unfortunately, existing truth discovery models all miss an important issue of truth discovery - the truth evolution problem. In many real-life scenarios, the latent true value often keeps changing dynamically over time instead of staying static. We study the dynamic truth discovery problem in the space of numerical truth discovery. This problem cannot be addressed by existing models because of the new challenges of capturing time-evolving source dependency in a continuous space as well as handling missing data on the fly. We propose a model named EvolvT for dynamic truth discovery on numerical data. With the hidden Markov framework, EvolvT captures three key aspects of dynamic truth discovery with a unified model: truth transition regularity, source quality, and source dependency. The most distinguishable feature of the modeling part of EvolvT is that it employs Kalman filtering to model truth evolution. As such, EvolvT not only can principally infer source dependency in a continuous space, but also can handle missing data in a natural way. We establish an expectation-maximization (EM) algorithm for parameter inference of EvolvT and present an efficient online version for the parameter inference procedure. Our experiments on real-world applications demonstrate its advantages over the state-of-the-art truth discovery approaches.
KW - Kalman filtering
KW - Streaming data
KW - Truth discovery
UR - http://www.scopus.com/inward/record.url?scp=85061366554&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85061366554&partnerID=8YFLogxK
U2 - 10.1109/ICDM.2018.00097
DO - 10.1109/ICDM.2018.00097
M3 - Conference contribution
AN - SCOPUS:85061366554
T3 - Proceedings - IEEE International Conference on Data Mining, ICDM
SP - 817
EP - 826
BT - 2018 IEEE International Conference on Data Mining, ICDM 2018
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 18th IEEE International Conference on Data Mining, ICDM 2018
Y2 - 17 November 2018 through 20 November 2018
ER -