CouchFS: A high-performance file system for large data sets

Fangzhou Yao, Roy H. Campbell

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Numerous file systems have been implemented to meet the needs of today's big data era; however, many of them require specific configurations or frameworks for data processing. This paper presents CouchFS, a POSIX-compliant distributed file system for large data sets. We build CouchFS on top of CouchDB, which gives us the flexibility to handle semistructured data. Since a database behaves similarly to a file system, and CouchDB provides a highly customizable MapReduce view for indexing, CouchFS is able to achieve high-performance searching for both text and supported binary objects. This work compares searches of Wikipedia data using CouchDB, PostgreSQL, and Spotlight on the HFS+ file system. We show our design of CouchFS and discuss future approaches to improve this file system.
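
The abstract's point about MapReduce views can be illustrated with a small sketch. This is not the authors' implementation: real CouchDB views are JavaScript map functions stored in design documents, whereas the code below only simulates the idea in plain Python. The document layout (an `_id` holding a file path and a `text` field) and the helper names `map_words`, `build_view`, and `reduce_count` are assumptions made for illustration.

```python
from collections import defaultdict


def map_words(doc):
    """Map step: emit one (word, document id) pair per distinct word."""
    for word in set(doc["text"].lower().split()):
        yield word, doc["_id"]


def build_view(docs, map_fn):
    """Apply map_fn to every document and group the emitted values by key."""
    index = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            index[key].append(value)
    # Sort the ids under each key so lookups return a stable order.
    return {key: sorted(values) for key, values in index.items()}


def reduce_count(values):
    """Reduce step: count how many documents were emitted for a key."""
    return len(values)


# Hypothetical file documents; paths and contents are made up.
docs = [
    {"_id": "/wiki/alaska.txt", "text": "Anchorage is a city in Alaska"},
    {"_id": "/wiki/couchdb.txt", "text": "CouchDB stores JSON documents"},
    {"_id": "/wiki/city.txt", "text": "A city stores many people"},
]

view = build_view(docs, map_words)
print(view["city"])
print(reduce_count(view["stores"]))
```

Once the view is built, a search is a key lookup rather than a scan of every file, which is the property the abstract relies on for fast text search.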

Original language: English (US)
Title of host publication: Proceedings - 2014 IEEE International Congress on Big Data, BigData Congress 2014
Editors: Peter Chen, Hemant Jain
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 784-785
Number of pages: 2
ISBN (Electronic): 9781479950577
State: Published - Sep 22 2014
Event: 3rd IEEE International Congress on Big Data, BigData Congress 2014 - Anchorage, United States
Duration: Jun 27 2014 → Jul 2 2014

Publication series

Name: Proceedings - 2014 IEEE International Congress on Big Data, BigData Congress 2014

Other

Other: 3rd IEEE International Congress on Big Data, BigData Congress 2014
Country/Territory: United States
City: Anchorage
Period: 6/27/14 → 7/2/14

ASJC Scopus subject areas

  • Computer Science Applications
