TY - GEN
T1 - Equivocal URLs
T2 - 27th European Symposium on Research in Computer Security, ESORICS 2022
AU - Reynolds, Joshua
AU - Bates, Adam
AU - Bailey, Michael Donald
N1 - Funding Information:
Acknowledgements. This work was partially supported by the NSF under grants GR0005987 and CNS 1955228. We thank our anonymous peer reviewers as well as Zane Ma, Joshua Mason, Kent Seamons, Jay Misra, Kaylia M. Reynolds, Deepak Kumar, and Paul Murley for their feedback and suggestions.
Publisher Copyright:
© 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2022
Y1 - 2022
N2 - Uniform Resource Locators (URLs) are integral to the Web and have existed for nearly three decades. Yet URL parsing differs subtly among parser implementations, leading to ambiguity that can be abused by attackers. We measure agreement between widely used URL parsers and find that each has made design decisions that deviate from parsing standards, creating a fractured implementation space where assumptions of uniform interpretation are unreliable. In some cases, deviations are severe enough that clients using different parsers will make requests to different hosts based on a single, “equivocal” URL. We systematize the thousands of differences we observed into seven pitfalls in URL parsing that application developers should beware of. We demonstrate that this ambiguity can be weaponized through misdirection attacks that evade the Google Safe Browsing and VirusTotal URL classifiers. URL parsing libraries have made a tradeoff to favor permissiveness over strict standards adherence. We hope this work will motivate the systemic adoption of a more unified URL parsing standard, enabling a more secure Web.
KW - Parsing ambiguity
KW - URL
KW - Web security
UR - http://www.scopus.com/inward/record.url?scp=85140740323&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85140740323&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-17143-7_9
DO - 10.1007/978-3-031-17143-7_9
M3 - Conference contribution
AN - SCOPUS:85140740323
SN - 9783031171420
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 166
EP - 185
BT - Computer Security – ESORICS 2022 - 27th European Symposium on Research in Computer Security, Proceedings
A2 - Atluri, Vijayalakshmi
A2 - Di Pietro, Roberto
A2 - Jensen, Christian D.
A2 - Meng, Weizhi
PB - Springer
Y2 - 26 September 2022 through 30 September 2022
ER -