FeatureSelector: An XSEDE-enabled tool for massive game log analysis

Y. Dora Cai, Bettina Cassandra Riedl, Rabindra Robby Ratan, Cuihua Shen, Arnold Picot

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Because online game data collections are both very large and highly complex, selecting essential features for the analysis of massive game logs is necessary but challenging. This study develops and implements a new XSEDE-enabled tool, FeatureSelector, which uses parallel processing techniques on high performance computers to perform feature selection. By calculating probability distance measures based on Kullback-Leibler (K-L) divergence, the tool quantifies the distance between variables in data sets and provides guidance for feature selection in massive game log analysis. It has helped researchers choose high-quality, discriminative features from over 300 variables, and select the country pairs with the greatest differences from 231 country pairs in a 500 GB game log data set. Our study shows that (1) K-L divergence is a good measure for correctly and efficiently selecting important features, and (2) the high performance computing platform supported by XSEDE accelerated the feature selection process more than 30-fold. Besides demonstrating the effectiveness of FeatureSelector in a cross-country analysis using high performance computing, this study also highlights lessons learned about feature selection in social science research and experience in applying parallel processing techniques to data-intensive analysis.
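
To make the K-L divergence step concrete, the following is a minimal Python sketch of how one feature's distribution could be compared across groups and the group pairs ranked by distance. It is not the FeatureSelector implementation: the abstract does not describe the code, so the function names, the symmetrized form of the divergence, the smoothing constant eps, and the toy histograms for groups "A", "B", and "C" are all assumptions made for illustration.

```python
import math
from itertools import combinations


def kl_divergence(p, q, eps=1e-12):
    """D(P || Q) = sum_i p_i * log(p_i / q_i) for discrete distributions.

    eps is an assumed floor on q_i that avoids division by zero.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:  # terms with p_i = 0 contribute nothing
            total += pi * math.log(pi / max(qi, eps))
    return total


def symmetric_kl(p, q):
    """Symmetrized divergence (an assumption; the abstract does not say which form is used)."""
    return kl_divergence(p, q) + kl_divergence(q, p)


def normalize(counts):
    """Turn raw bin counts into a probability distribution."""
    s = sum(counts)
    return [c / s for c in counts]


def rank_pairs(hists):
    """Score every unordered pair of groups and sort by decreasing distance."""
    scores = {
        (a, b): symmetric_kl(hists[a], hists[b])
        for a, b in combinations(sorted(hists), 2)
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical binned counts of a single feature for three groups.
histograms = {
    "A": normalize([50, 30, 20]),
    "B": normalize([48, 32, 20]),
    "C": normalize([10, 20, 70]),
}

ranked = rank_pairs(histograms)
for pair, score in ranked:
    print(pair, round(score, 4))
```

In this toy data, groups A and B have nearly identical histograms and so score close to zero, while pairs involving C score much higher. Because each pair is scored independently of the others, the pairwise loop is the kind of work that can be distributed across many processors.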

Original language: English (US)
Title of host publication: Proceedings of the XSEDE 2014 Conference
Subtitle of host publication: Engaging Communities
Publisher: Association for Computing Machinery (ACM)
ISBN (Print): 9781450328937
State: Published - Jul 1 2014
Event: 2014 Annual Conference on Extreme Science and Engineering Discovery Environment, XSEDE 2014 - Atlanta, GA, United States
Duration: Jul 13 2014 to Jul 18 2014

Publication series

Name: ACM International Conference Proceeding Series


Other: 2014 Annual Conference on Extreme Science and Engineering Discovery Environment, XSEDE 2014
Country: United States
City: Atlanta, GA


Keywords

  • Feature selection
  • Game log analysis
  • K-L divergence
  • Massively multiplayer online games

ASJC Scopus subject areas

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications
