This study offers a first step toward understanding how well cyber security incidents, which may be of many types, can be predicted by applying machine learning techniques to externally observed malicious activities associated with network entities. These activities include spamming, phishing, and scanning, each of which may or may not bear directly on a specific attack mechanism or incident type. Our hypothesis is that, when viewed collectively, malicious activities originating from a network indicate its general cleanliness and how well it is run, and that collectively they exhibit fairly stable, and thus predictive, behavior over time. To test this hypothesis, we use two datasets: (1) a collection of commonly used IP address-based/host reputation blacklists (RBLs) collected over more than a year, and (2) a set of security incident reports collected over roughly the same period. We first aggregate the RBL data at the prefix level and then introduce a set of features that capture the dynamics of this aggregated temporal process. Comparing the distributions of these feature values taken from the incident dataset and from the general population of prefixes reveals distinct differences, which suggests that the features are useful for distinguishing between the two and highlights the importance of capturing dynamic behavior (second-order statistics) in the malicious activities. These features are then used to train a support vector machine (SVM) for prediction. Our preliminary results show that we can achieve reasonably good prediction performance over a forecasting window of a few months.
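
The prefix-level aggregation and dynamics features described above can be illustrated with a minimal Python sketch. It is not the study's actual pipeline: the /24 prefix length, the function names `aggregate_by_prefix` and `dynamics_features`, the specific feature set, and the toy daily listings are all assumptions made for illustration.

```python
import ipaddress
from collections import defaultdict
from statistics import mean, pvariance


def aggregate_by_prefix(daily_listings, prefix_len=24):
    """Count distinct blacklisted IPs per prefix per day.

    daily_listings: one iterable of IPv4 address strings per day.
    prefix_len: aggregation granularity (assumed /24 here; hypothetical).
    Returns {prefix_string: [count_day0, count_day1, ...]}.
    """
    n_days = len(daily_listings)
    series = defaultdict(lambda: [0] * n_days)
    for day, ips in enumerate(daily_listings):
        for ip in set(ips):  # ignore duplicate listings within a day
            net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
            series[str(net)][day] += 1
    return dict(series)


def dynamics_features(counts):
    """Illustrative first- and second-order statistics of one prefix's series."""
    diffs = [b - a for a, b in zip(counts, counts[1:])]
    abs_diffs = [abs(d) for d in diffs]
    return {
        "mean_listed": mean(counts),          # average level of activity
        "var_listed": pvariance(counts),      # second-order: variability
        "mean_abs_change": mean(abs_diffs) if abs_diffs else 0.0,
        "days_listed_frac": sum(c > 0 for c in counts) / len(counts),
    }


# Toy data: three days of blacklist snapshots (documentation address ranges).
days = [
    ["192.0.2.1", "192.0.2.7", "198.51.100.3"],
    ["192.0.2.1"],
    ["192.0.2.1", "192.0.2.7", "192.0.2.9"],
]
series = aggregate_by_prefix(days)
features = {prefix: dynamics_features(c) for prefix, c in series.items()}
```

Each prefix then maps to a fixed-length feature vector. Labeling those vectors by whether the prefix appears in an incident report would give a training set that could be passed to any SVM implementation.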