Recent attacks show that threats to cyber infrastructure are not only increasing in volume but also becoming more sophisticated. Such attacks may comprise multiple actions that are hard to differentiate from benign activity, so common detection techniques must contend with high false positive rates. Because automated detection techniques perform imperfectly, responses to such attacks depend heavily on human-driven decision-making processes. While game theory has been applied to many problems that require rational decision making, we find limitations in applying such methods to security games when the defender has limited information about the opponent's strategies and payoffs. In this work, we propose Q-Learning as a way to react automatically to the adversarial behavior of a suspicious user in order to secure the system. We compare variations of Q-Learning with a traditional stochastic game. Simulation results indicate that Naive Q-Learning is feasible despite restricted information about opponents.
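For readers unfamiliar with the technique named above, the following is a minimal sketch of the standard tabular Q-Learning update with epsilon-greedy action selection. It is not the authors' implementation. It assumes that "naive" means the defender learns values over its own actions only and does not model the opponent's action; the state names, action names, reward values, and the parameters `alpha`, `gamma`, and `epsilon` are all hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """Apply one standard tabular Q-Learning update and return the new value.

    q maps (state, action) pairs to estimated values. The opponent's action
    is not an input: under the assumption stated above, its effect shows up
    only through the observed reward and next state.
    """
    # Value of the best action available in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Temporal-difference target: immediate reward plus discounted future value.
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


def choose_action(q, state, actions, epsilon=0.1, rng=random):
    """Epsilon-greedy selection: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


# Hypothetical defender actions and states, for illustration only.
ACTIONS = ["monitor", "block"]
q_table = defaultdict(float)  # unseen (state, action) pairs start at 0.0
```

A single call such as `q_update(q_table, "suspicious", "monitor", 1.0, "escalated", ACTIONS)` adjusts one table entry. In a simulation, this call would sit inside a loop that observes the reward and next state after each defender action, with `choose_action` picking the action to take.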