TY - GEN
T1 - Detecting multiple periods and periodic patterns in event time sequences
AU - Yuan, Quan
AU - Shang, Jingbo
AU - Cao, Xin
AU - Zhang, Chao
AU - Geng, Xinhe
AU - Han, Jiawei
N1 - Funding Information:
7 ACKNOWLEDGEMENTS is work was sponsored by the U.S. Army Research Lab. under Cooperative Agreement No. W911NF-09-2-0053 (NSCTA), National Science Foundation IIS-1017362, IIS-1320617, and IIS-1354329, HDTRA1-10-1-0120, and Grant 1U54GM114838 awarded by NIGMS through funds provided by the trans-NIH Big Data to Knowledge (BD2K) initiative (www.bd2k.nih.gov), and MIAS, a DHS-IDS Center for Multimodal Information Access and Synthesis at UIUC. e views and conclusions contained in this document are those of the authors and should not be interpreted as representing any funding agencies. REFERENCES [1] Tide, from wikipedia. hps://en.wikipedia.org/wiki/Tide. [2] M. Ahdesmäki, H. Lähdesmäki, A. Gracey, and O. Yli-Harja. Robust regression for periodicity detection in non-uniformly sampled time-course gene expression data. BMC bioinformatics, 8(1):233, 2007. [3] J. Barkley Rosser and L. Schoenfeld. Approximate formulas for some functions of prime numbers. Illinois Journal of Mathematics, 6:6z–9z, 1962. [4] C. Berberidis, W. G. Aref, M. Atallah, I. Vlahavas, and A. K. Elmagarmid. Multiple and partial periodicity mining in time series databases. In ECAI, volume 2, pages 370–374, 2002. [5] E. Cho, S. A. Myers, and J. Leskovec. Friendship and mobility: user movement in location-based social networks. In Proc. KDD, pages 1082–1090. ACM, 2011. [6] R. B. Cleveland, W. S. Cleveland, and I. Terpenning. Stl: A seasonal-trend decomposition procedure based on loess. Journal of Ocial Statistics, 6(1):3, 1990. [7] C. Ding, D. Pei, and A. Salomaa. Chinese remainder theorem: applications in computing, coding, cryptography. World Scientic, 1996.
Publisher Copyright:
© 2017 ACM.
PY - 2017/11/6
Y1 - 2017/11/6
N2 - Periodicity is prevalent in physical world, and many events involve more than one periods, e.g., individual's mobility, tide paern, and massive transportation utilization. Knowing the true periods of events can benefit a number of applications, such as traffic prediction, time-aware recommendation and advertisement, and anomaly detection. However, detecting multiple periods is a very challenging task due to not only the interwoven periodic patterns but also the low quality of event tracking records. In this paper, we study the problem of discovering all true periods and the corresponded occurring patterns of an event from a noisy and incomplete observation sequence. We devise a novel scoring function, by maximizing which we can identify the true periodic patterns involved in the sequence. We prove that, however, optimizing the objective function is an NP-hard problem. To address this challenge, we develop a heuristic algorithm named Timeslot Coverage Model (TiCom), for identifying the periods and periodic patterns approximately. The results of extensive experiments on both synthetic and reallife datasets show that our model outperforms the state-of-the-art baselines significantly in various tasks, including period detection, periodic pattern identification, and anomaly detection.
AB - Periodicity is prevalent in physical world, and many events involve more than one periods, e.g., individual's mobility, tide paern, and massive transportation utilization. Knowing the true periods of events can benefit a number of applications, such as traffic prediction, time-aware recommendation and advertisement, and anomaly detection. However, detecting multiple periods is a very challenging task due to not only the interwoven periodic patterns but also the low quality of event tracking records. In this paper, we study the problem of discovering all true periods and the corresponded occurring patterns of an event from a noisy and incomplete observation sequence. We devise a novel scoring function, by maximizing which we can identify the true periodic patterns involved in the sequence. We prove that, however, optimizing the objective function is an NP-hard problem. To address this challenge, we develop a heuristic algorithm named Timeslot Coverage Model (TiCom), for identifying the periods and periodic patterns approximately. The results of extensive experiments on both synthetic and reallife datasets show that our model outperforms the state-of-the-art baselines significantly in various tasks, including period detection, periodic pattern identification, and anomaly detection.
UR - http://www.scopus.com/inward/record.url?scp=85037333803&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85037333803&partnerID=8YFLogxK
U2 - 10.1145/3132847.3133027
DO - 10.1145/3132847.3133027
M3 - Conference contribution
AN - SCOPUS:85037333803
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 617
EP - 626
BT - CIKM 2017 - Proceedings of the 2017 ACM Conference on Information and Knowledge Management
PB - Association for Computing Machinery
T2 - 26th ACM International Conference on Information and Knowledge Management, CIKM 2017
Y2 - 6 November 2017 through 10 November 2017
ER -