TY - GEN
T1 - Latent Credibility Analysis
AU - Pasternack, Jeff
AU - Roth, Dan
PY - 2013/12/1
Y1 - 2013/12/1
AB - A frequent problem when dealing with data gathered from multiple sources on the web (ranging from booksellers to Wikipedia pages to stock analyst predictions) is that these sources disagree, and we must decide which of their (often mutually exclusive) claims we should accept. Current state-of-the-art information credibility algorithms known as "fact-finders" are transitive voting systems with rules specifying how votes iteratively flow from sources to claims and then back to sources. While this is quite tractable and often effective, fact-finders also suffer from substantial limitations; in particular, a lack of transparency obfuscates their credibility decisions and makes them difficult to adapt and analyze: knowing the mechanics of how votes are calculated does not readily tell us what those votes mean, and finding, for example, that a source has a score of 6 is not informative. We introduce a new approach to information credibility, Latent Credibility Analysis (LCA), constructing strongly principled, probabilistic models where the truth of each claim is a latent variable and the credibility of a source is captured by a set of model parameters. This gives LCA models clear semantics and modularity that make extending them to capture additional observed and latent credibility factors straightforward. Experiments over four real-world datasets demonstrate that LCA models can outperform the best fact-finders in both unsupervised and semi-supervised settings.
KW - Credibility
KW - Graphical Models
KW - Trust
KW - Veracity
UR - http://www.scopus.com/inward/record.url?scp=84893040152&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84893040152&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84893040152
SN - 9781450320351
T3 - WWW 2013 - Proceedings of the 22nd International Conference on World Wide Web
SP - 1009
EP - 1019
BT - WWW 2013 - Proceedings of the 22nd International Conference on World Wide Web
T2 - 22nd International Conference on World Wide Web, WWW 2013
Y2 - 13 May 2013 through 17 May 2013
ER -