Equivocal URLs: Understanding the Fragmented Space of URL Parser Implementations

Joshua Reynolds, Adam Bates, Michael Bailey

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Uniform Resource Locators (URLs) are integral to the Web and have existed for nearly three decades. Yet URL parsing differs subtly among parser implementations, leading to ambiguity that can be abused by attackers. We measure agreement between widely used URL parsers and find that each has made design decisions that deviate from parsing standards, creating a fractured implementation space where assumptions of uniform interpretation are unreliable. In some cases, deviations are severe enough that clients using different parsers will make requests to different hosts based on a single, “equivocal” URL. We systematize the thousands of differences we observed into seven pitfalls in URL parsing that application developers should beware of. We demonstrate that this ambiguity can be weaponized through misdirection attacks that evade the Google Safe Browsing and VirusTotal URL classifiers. URL parsing libraries have made a tradeoff to favor permissiveness over strict standards adherence. We hope this work will motivate the systemic adoption of a more unified URL parsing standard, enabling a more secure Web.
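
As a rough illustration of what an "equivocal" URL can look like, the sketch below feeds one string to two host-extraction routines. The URL, the function names, and the backslash-rewriting shortcut are assumptions made for this example and are not taken from the paper; the second routine only approximates the way browser-oriented parsers treat a backslash as a path separator for http and https URLs.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical URL for illustration only; it does not come from the paper.
EQUIVOCAL_URL = "http://example.com\\@evil.test/"


def host_stdlib(url: str) -> Optional[str]:
    """Host as seen by Python's urllib.parse.

    The backslash is kept as ordinary data, so everything before '@'
    is read as userinfo and the host is whatever follows it.
    """
    return urlsplit(url).hostname


def host_backslash_as_slash(url: str) -> Optional[str]:
    """Crude stand-in (an assumption, not a real browser parser).

    Rewrites each backslash to a forward slash before parsing, so the
    authority ends at the first backslash and '@evil.test/' becomes path.
    """
    return urlsplit(url.replace("\\", "/")).hostname


if __name__ == "__main__":
    print(host_stdlib(EQUIVOCAL_URL))              # evil.test
    print(host_backslash_as_slash(EQUIVOCAL_URL))  # example.com
```

A client built on the first routine would contact one host while a client built on the second would contact another, which is the kind of disagreement the abstract describes.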

Original language: English (US)
Title of host publication: Computer Security – ESORICS 2022 - 27th European Symposium on Research in Computer Security, Proceedings
Editors: Vijayalakshmi Atluri, Roberto Di Pietro, Christian D. Jensen, Weizhi Meng
Publisher: Springer
Pages: 166-185
Number of pages: 20
ISBN (Print): 9783031171420
State: Published - 2022
Event: 27th European Symposium on Research in Computer Security, ESORICS 2022 - Virtual, Online
Duration: Sep 26 2022 to Sep 30 2022

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13556 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 27th European Symposium on Research in Computer Security, ESORICS 2022
City: Virtual, Online
Period: 9/26/22 to 9/30/22

Keywords

  • Parsing ambiguity
  • URL
  • Web security

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
