TY - GEN
T1 - Cloudy with a chance of breach
T2 - 24th USENIX Security Symposium
AU - Liu, Yang
AU - Sarabi, Armin
AU - Zhang, Jing
AU - Naghizadeh, Parinaz
AU - Karir, Manish
AU - Bailey, Michael
AU - Liu, Mingyan
N1 - Funding Information:
This work is partially supported by the NSF under grants CNS 1422211, CNS 1409758, and CNS 1111699, and by the DHS under contract number HSHQDC-13-C-B0015.
Publisher Copyright:
© 2015 Proceedings of the 24th USENIX Security Symposium. All rights reserved.
PY - 2015
Y1 - 2015
N2 - In this study we characterize the extent to which cyber security incidents, such as those referenced by Verizon in its annual Data Breach Investigations Reports (DBIR), can be predicted based on externally observable properties of an organization’s network. We seek to proactively forecast an organization’s breaches and to do so without cooperation of the organization itself. To accomplish this goal, we collect 258 externally measurable features about an organization’s network from two main categories: mismanagement symptoms, such as misconfigured DNS or BGP within a network, and malicious activity time series, which include spam, phishing, and scanning activity sourced from these organizations. Using these features we train and test a Random Forest (RF) classifier against more than 1,000 incident reports taken from the VERIS community database, Hackmageddon, and the Web Hacking Incidents Database that cover events from mid-2013 to the end of 2014. The resulting classifier is able to achieve a 90% True Positive (TP) rate, a 10% False Positive (FP) rate, and an overall 90% accuracy.
UR - http://www.scopus.com/inward/record.url?scp=85076274623&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85076274623&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85076274623
T3 - Proceedings of the 24th USENIX Security Symposium
SP - 1009
EP - 1024
BT - Proceedings of the 24th USENIX Security Symposium
PB - USENIX Association
Y2 - 12 August 2015 through 14 August 2015
ER -