A study of feature construction for text-based forecasting of time series variables

Yiren Wang, Dominic Seyler, Shubhra Kanti Karmaker Santu, Cheng Xiang Zhai

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Time series are ubiquitous in the world since they are used to measure various phenomena (e.g., temperature, spread of a virus, sales, etc.). Forecasting of time series is highly beneficial (and necessary) for optimizing decisions, yet is a very challenging problem; using only the historical values of the time series is often insufficient. In this paper, we study how to construct effective additional features based on related text data for time series forecasting. Besides the commonly used n-gram features, we propose a general strategy for constructing multiple topical features based on the topics discovered by a topic model. We evaluate feature effectiveness using a data set for predicting stock price changes where we constructed additional features from news text articles for stock market prediction. We found that: 1) Text-based features outperform time series-based features, suggesting the great promise of leveraging text data for improving time series forecasting. 2) Topic-based features are not very effective stand-alone, but they can further improve performance when added on top of n-gram features. 3) The best topic-based feature appears to be a long-term aggregation of topics over time with high weights on recent topics.

Original languageEnglish (US)
Title of host publicationCIKM 2017 - Proceedings of the 2017 ACM Conference on Information and Knowledge Management
PublisherAssociation for Computing Machinery
Pages2347-2350
Number of pages4
ISBN (Electronic)9781450349185
DOIs
StatePublished - Nov 6 2017
Event26th ACM International Conference on Information and Knowledge Management, CIKM 2017 - Singapore, Singapore
Duration: Nov 6 2017Nov 10 2017

Publication series

NameInternational Conference on Information and Knowledge Management, Proceedings
VolumePart F131841

Other

Other26th ACM International Conference on Information and Knowledge Management, CIKM 2017
Country/TerritorySingapore
CitySingapore
Period11/6/1711/10/17

ASJC Scopus subject areas

  • Decision Sciences(all)
  • Business, Management and Accounting(all)

Fingerprint

Dive into the research topics of 'A study of feature construction for text-based forecasting of time series variables'. Together they form a unique fingerprint.

Cite this