Many novel applications have been built by analyzing tweets about specific topics. While these applications provide different kinds of analysis, they share a common task: monitoring "target" tweets from the Twitter stream for a topic. The current solution for this task tracks a set of manually selected keywords with Twitter APIs. This manual approach has many limitations. In this paper, we propose a data platform to automatically monitor target tweets from the Twitter stream for any given topic. To monitor target tweets in an optimal and continuous way, we design the Automatic Topic-focused Monitor (ATM), which iteratively 1) samples tweets from the stream and 2) selects keywords to track based on the samples. To realize ATM, we develop a tweet sampling algorithm that collects a sufficient number of unbiased tweets with the available Twitter APIs, and a keyword selection algorithm that efficiently selects keywords with near-optimal coverage of target tweets under cost constraints. We conduct extensive experiments to show the effectiveness of ATM. For example, ATM covers 90% of target tweets for a topic and improves on the manual approach by 49%.
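The keyword selection step can be illustrated with a minimal sketch. The abstract does not give the algorithm's details, so the code below is not ATM's method; it is a generic greedy heuristic for coverage under a budget. The function name `select_keywords`, the cost model (cost is the number of sampled tweets the chosen keywords would collect), and the `budget` and `max_keywords` parameters are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Set


def select_keywords(
    sample: List[Set[str]],
    is_target: List[bool],
    budget: int,
    max_keywords: int,
) -> List[str]:
    """Greedily pick keywords that cover many target tweets in the sample.

    Hypothetical cost model: the cost of a keyword set is the number of
    sampled tweets (target or not) that match at least one chosen keyword.
    """
    # Inverted index: keyword -> ids of sampled tweets containing it.
    postings: Dict[str, Set[int]] = {}
    for tweet_id, words in enumerate(sample):
        for word in words:
            postings.setdefault(word, set()).add(tweet_id)

    targets = {i for i, flag in enumerate(is_target) if flag}
    chosen: List[str] = []
    collected: Set[int] = set()

    while len(chosen) < max_keywords:
        best: Optional[str] = None
        best_gain = 0
        # Sorted iteration makes tie-breaking deterministic.
        for word in sorted(postings):
            if word in chosen:
                continue
            new_tweets = postings[word] - collected
            # Skip keywords that would push the cost over the budget.
            if len(collected) + len(new_tweets) > budget:
                continue
            gain = len(new_tweets & targets)
            if gain > best_gain:
                best, best_gain = word, gain
        if best is None:
            break  # No affordable keyword adds any target tweet.
        chosen.append(best)
        collected |= postings[best]

    return chosen
```

In an iterative monitor of the kind the abstract describes, a routine like this would be re-run on each fresh sample so that the tracked keywords follow changes in the topic. How the target labels in `is_target` are obtained is outside the scope of this sketch.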
ASJC Scopus subject areas
- Computer Science (miscellaneous)
- Computer Science (all)