SMURF: Efficient and Scalable Metadata Access for Distributed Applications from Edge to the Cloud

Bing Zhang, Tevfik Kosar

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution

Abstract

In parallel with big data processing and analysis dominating the usage of distributed and cloud infrastructures, the demand for distributed metadata access and transfer has increased. In many application domains, the volume of data generated exceeds petabytes, while the corresponding metadata amounts to terabytes or even more. In this paper, we propose a novel solution for efficient and scalable metadata access for distributed applications across wide-area networks, dubbed SMURF. Our solution combines novel pipelining and concurrent transfer mechanisms with reliability, provides distributed continuum caching and prefetching strategies to sidestep fetching latency, and achieves scalable and high-performance metadata fetch/prefetch services in the cloud. We also study the phenomenon of semantic locality in real trace logs, which is not well utilized in metadata access prediction. We implement our predictor based on this observation and compare it with three existing state-of-the-art prefetch schemes on Yahoo! Hadoop audit traces. By effectively caching and prefetching metadata based on the access patterns, our continuum caching and prefetching mechanism greatly improves the local cache hit rate and reduces the average fetching latency. We replayed approximately 20 million metadata access operations from real audit traces; our system achieved 80% accuracy in prefetch prediction and reduced the average fetch latency by 50% compared to the state-of-the-art mechanisms.

Original language: English (US)
Title of host publication: Proceedings - 2019 IEEE International Conference on Edge Computing, EDGE 2019 - Part of the 2019 IEEE World Congress on Services
Editors: Elisa Bertino, Carl K. Chang, Peter Chen, Ernesto Damiani, Michael Goul, Katsunori Oyama
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 102-106
Number of pages: 5
ISBN (Electronic): 9781728127088
State: Published - Jul 2019
Event: 3rd IEEE International Conference on Edge Computing, EDGE 2019 - Milan, Italy
Duration: Jul 8 2019 to Jul 13 2019

Publication series

Name: Proceedings - 2019 IEEE International Conference on Edge Computing, EDGE 2019 - Part of the 2019 IEEE World Congress on Services

Conference

Conference: 3rd IEEE International Conference on Edge Computing, EDGE 2019
Country/Territory: Italy
City: Milan
Period: 7/8/19 to 7/13/19

Keywords

  • Metadata access
  • continuum caching
  • efficiency
  • prefetching
  • scalability
  • semantic locality

ASJC Scopus subject areas

  • Information Systems and Management
  • Computer Networks and Communications
  • Hardware and Architecture
