Context-sensitive malicious spelling error correction

Hongyu Gong, Yuchen Li, Suma Bhat, Pramod Viswanath

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Misspelled words of the malicious kind work by changing specific keywords and are intended to thwart existing automated applications for cyber-environment control such as harassing content detection on the Internet and email spam detection. In this paper, we focus on malicious spelling correction, which requires an approach that relies on the context and the surface forms of targeted keywords. In the context of two applications, profanity detection and email spam detection, we show that malicious misspellings seriously degrade their performance. We then propose a context-sensitive approach for malicious spelling correction using word embeddings and demonstrate its superior performance compared to state-of-the-art spell checkers.
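
The abstract names two signals, the surface form of a targeted keyword and its context, without giving the scoring details. The following is a toy sketch of how such signals might be combined, not the authors' method: the three-dimensional embeddings, the keyword list, the `correct` function, and the `alpha` weight are all invented for illustration, and `difflib.SequenceMatcher` stands in for whatever surface-form measure the paper uses.

```python
import math
from difflib import SequenceMatcher

# Hypothetical 3-d word embeddings, hand-made for this illustration only.
EMBEDDINGS = {
    "free": [0.9, 0.1, 0.0],
    "tree": [0.0, 0.1, 0.9],
    "click": [0.8, 0.2, 0.1],
    "offer": [0.9, 0.2, 0.0],
    "now": [0.7, 0.3, 0.1],
    "forest": [0.1, 0.1, 0.9],
    "leaf": [0.0, 0.2, 0.8],
}

# Hypothetical list of keywords an attacker might obfuscate.
TARGET_KEYWORDS = ["free", "tree"]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def context_vector(words):
    """Average the embeddings of the known context words, or None if there are none."""
    vectors = [EMBEDDINGS[w] for w in words if w in EMBEDDINGS]
    if not vectors:
        return None
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def surface_similarity(a, b):
    """Character-level similarity in [0, 1] (a stand-in for an edit-distance score)."""
    return SequenceMatcher(None, a, b).ratio()


def correct(token, context, alpha=0.5):
    """Return the keyword that best matches token in both spelling and context."""
    if token in EMBEDDINGS:
        return token  # in-vocabulary words are left unchanged
    ctx = context_vector(context)

    def score(candidate):
        surface = surface_similarity(token, candidate)
        contextual = cosine(ctx, EMBEDDINGS[candidate]) if ctx else 0.0
        # alpha weights spelling against context; 0.5 is an arbitrary choice.
        return alpha * surface + (1 - alpha) * contextual

    return max(TARGET_KEYWORDS, key=score)


print(correct("xree", ["click", "offer", "now"]))
print(correct("xree", ["forest", "leaf"]))
```

In this toy setup "xree" is equally close in spelling to "free" and "tree", so the surrounding words alone decide the correction. A context-free spell checker has no such signal to break the tie.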

Original language: English (US)
Title of host publication: The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019
Publisher: Association for Computing Machinery, Inc
Pages: 2771-2777
Number of pages: 7
ISBN (Electronic): 9781450366748
DOI: https://doi.org/10.1145/3308558.3313431
State: Published - May 13 2019
Event: 2019 World Wide Web Conference, WWW 2019 - San Francisco, United States
Duration: May 13 2019 to May 17 2019

Publication series

Name: The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019

Conference

Conference: 2019 World Wide Web Conference, WWW 2019
Country: United States
City: San Francisco
Period: 5/13/19 to 5/17/19

Fingerprint

Electronic mail
Error correction
Internet

Keywords

  • Cyberbullying
  • Machine learning
  • Malicious spelling correction

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Software

Cite this

Gong, H., Li, Y., Bhat, S., & Viswanath, P. (2019). Context-sensitive malicious spelling error correction. In The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019 (pp. 2771-2777). (The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019). Association for Computing Machinery, Inc. https://doi.org/10.1145/3308558.3313431

@inproceedings{a1db5f97e88d4269b680455c8b021490,
title = "Context-sensitive malicious spelling error correction",
abstract = "Misspelled words of the malicious kind work by changing specific keywords and are intended to thwart existing automated applications for cyber-environment control such as harassing content detection on the Internet and email spam detection. In this paper, we focus on malicious spelling correction, which requires an approach that relies on the context and the surface forms of targeted keywords. In the context of two applications, profanity detection and email spam detection, we show that malicious misspellings seriously degrade their performance. We then propose a context-sensitive approach for malicious spelling correction using word embeddings and demonstrate its superior performance compared to state-of-the-art spell checkers.",
keywords = "Cyberbullying, Machine learning, Malicious spelling correction",
author = "Hongyu Gong and Yuchen Li and Suma Bhat and Pramod Viswanath",
year = "2019",
month = "5",
day = "13",
doi = "10.1145/3308558.3313431",
language = "English (US)",
series = "The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019",
publisher = "Association for Computing Machinery, Inc",
pages = "2771--2777",
booktitle = "The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019",

}

TY - GEN

T1 - Context-sensitive malicious spelling error correction

AU - Gong, Hongyu

AU - Li, Yuchen

AU - Bhat, Suma

AU - Viswanath, Pramod

PY - 2019/5/13

Y1 - 2019/5/13

AB - Misspelled words of the malicious kind work by changing specific keywords and are intended to thwart existing automated applications for cyber-environment control such as harassing content detection on the Internet and email spam detection. In this paper, we focus on malicious spelling correction, which requires an approach that relies on the context and the surface forms of targeted keywords. In the context of two applications, profanity detection and email spam detection, we show that malicious misspellings seriously degrade their performance. We then propose a context-sensitive approach for malicious spelling correction using word embeddings and demonstrate its superior performance compared to state-of-the-art spell checkers.

KW - Cyberbullying

KW - Machine learning

KW - Malicious spelling correction

UR - http://www.scopus.com/inward/record.url?scp=85066899983&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85066899983&partnerID=8YFLogxK

U2 - 10.1145/3308558.3313431

DO - 10.1145/3308558.3313431

M3 - Conference contribution

AN - SCOPUS:85066899983

T3 - The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019

SP - 2771

EP - 2777

BT - The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019

PB - Association for Computing Machinery, Inc

ER -