TY - GEN
T1 - Mining compressed commodity workflows from massive RFID data sets
AU - Gonzalez, Hector
AU - Han, Jiawei
AU - Li, Xiaolei
PY - 2006
Y1 - 2006
N2 - Radio Frequency Identification (RFID) technology is fast becoming a prevalent tool in tracking commodities in supply chain management applications. The movement of commodities through the supply chain forms a gigantic workflow that can be mined for the discovery of trends, flow correlations and outlier paths, that in turn can be valuable in understanding and optimizing business processes.In this paper, we propose a method to construct compressed probabilistic workflows that capture the movement trends and significant exceptions of the overall data sets, but with a size that is substantially smaller than that of the complete RFID workflow. Compression is achieved based on the following observations: (1) only a relatively small minority of items deviate from the general trend, (2)only truly non-redundant deviations, ie, those that substantially deviate from the previously recorded ones, are interesting, and (3) although RFID data is registered at the primitive level, data analysis usually takes place at a higher abstraction level. Techniques for workflow compression based on non-redundant transition and emission probabilities are derived; and an algorithm for computing approximate path probabilities is developed. Our experiments demonstrate the utility and feasibility of our design, data structure, and algorithms.
AB - Radio Frequency Identification (RFID) technology is fast becoming a prevalent tool in tracking commodities in supply chain management applications. The movement of commodities through the supply chain forms a gigantic workflow that can be mined for the discovery of trends, flow correlations and outlier paths, that in turn can be valuable in understanding and optimizing business processes.In this paper, we propose a method to construct compressed probabilistic workflows that capture the movement trends and significant exceptions of the overall data sets, but with a size that is substantially smaller than that of the complete RFID workflow. Compression is achieved based on the following observations: (1) only a relatively small minority of items deviate from the general trend, (2)only truly non-redundant deviations, ie, those that substantially deviate from the previously recorded ones, are interesting, and (3) although RFID data is registered at the primitive level, data analysis usually takes place at a higher abstraction level. Techniques for workflow compression based on non-redundant transition and emission probabilities are derived; and an algorithm for computing approximate path probabilities is developed. Our experiments demonstrate the utility and feasibility of our design, data structure, and algorithms.
KW - RFID
KW - Workflow induction
UR - http://www.scopus.com/inward/record.url?scp=34547617685&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=34547617685&partnerID=8YFLogxK
U2 - 10.1145/1183614.1183641
DO - 10.1145/1183614.1183641
M3 - Conference contribution
AN - SCOPUS:34547617685
SN - 1595934332
SN - 9781595934338
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 162
EP - 171
BT - Proceedings of the 15th ACM Conference on Information and Knowledge Management, CIKM 2006
T2 - 15th ACM Conference on Information and Knowledge Management, CIKM 2006
Y2 - 6 November 2006 through 11 November 2006
ER -