TY - GEN
T1 - Automated classification and analysis of Internet malware
AU - Bailey, Michael
AU - Oberheide, Jon
AU - Andersen, Jon
AU - Mao, Z. Morley
AU - Jahanian, Farnam
AU - Nazario, Jose
PY - 2007
Y1 - 2007
N2 - Numerous attacks, such as worms, phishing, and botnets, threaten the availability of the Internet, the integrity of its hosts, and the privacy of its users. A core element of defense against these attacks is anti-virus (AV) software, a service that detects, removes, and characterizes these threats. The ability of these products to successfully characterize these threats has far-reaching effects, from facilitating sharing across organizations, to detecting the emergence of new threats, to assessing risk in quarantine and cleanup. In this paper, we examine the ability of existing host-based anti-virus products to provide semantically meaningful information about the malicious software and tools (or malware) used by attackers. Using a large, recent collection of malware that spans a variety of attack vectors (e.g., spyware, worms, spam), we show that different AV products characterize malware in ways that are inconsistent across AV products, incomplete across malware, and that fail to be concise in their semantics. To address these limitations, we propose a new classification technique that describes malware behavior in terms of system state changes (e.g., files written, processes created) rather than in sequences or patterns of system calls. To address the sheer volume of malware and diversity of its behavior, we provide a method for automatically categorizing these profiles of malware into groups that reflect similar classes of behaviors and demonstrate how behavior-based clustering provides a more direct and effective way of classifying and analyzing Internet malware.
AB - Numerous attacks, such as worms, phishing, and botnets, threaten the availability of the Internet, the integrity of its hosts, and the privacy of its users. A core element of defense against these attacks is anti-virus (AV) software, a service that detects, removes, and characterizes these threats. The ability of these products to successfully characterize these threats has far-reaching effects, from facilitating sharing across organizations, to detecting the emergence of new threats, to assessing risk in quarantine and cleanup. In this paper, we examine the ability of existing host-based anti-virus products to provide semantically meaningful information about the malicious software and tools (or malware) used by attackers. Using a large, recent collection of malware that spans a variety of attack vectors (e.g., spyware, worms, spam), we show that different AV products characterize malware in ways that are inconsistent across AV products, incomplete across malware, and that fail to be concise in their semantics. To address these limitations, we propose a new classification technique that describes malware behavior in terms of system state changes (e.g., files written, processes created) rather than in sequences or patterns of system calls. To address the sheer volume of malware and diversity of its behavior, we provide a method for automatically categorizing these profiles of malware into groups that reflect similar classes of behaviors and demonstrate how behavior-based clustering provides a more direct and effective way of classifying and analyzing Internet malware.
UR - http://www.scopus.com/inward/record.url?scp=38149089416&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=38149089416&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-74320-0_10
DO - 10.1007/978-3-540-74320-0_10
M3 - Conference contribution
AN - SCOPUS:38149089416
SN - 9783540743194
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 178
EP - 197
BT - Recent Advances in Intrusion Detection - 10th International Symposium, RAID 2007, Proceedings
PB - Springer
T2 - 10th Symposium on Recent Advances in Intrusion Detection, RAID 2007
Y2 - 5 September 2007 through 7 September 2007
ER -