Ambry: LinkedIn's scalable geo-distributed object store

Shadi A. Noghabi, Sriram Subramanian, Priyesh Narayanan, Sivabalan Narayanan, Gopalakrishna Holla, Mammad Zadeh, Tianwei Li, Indranil Gupta, Roy H. Campbell

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The infrastructure beneath a worldwide social network has to continually serve billions of variable-sized media objects such as photos, videos, and audio clips. These objects must be stored and served with low latency and high throughput by a system that is geo-distributed, highly scalable, and load-balanced. Existing file systems and object stores face several challenges when serving such large objects. We present Ambry, a production-quality system for storing large immutable data (called blobs). Ambry is designed in a decentralized way and leverages techniques such as logical blob grouping, asynchronous replication, rebalancing mechanisms, zero-cost failure detection, and OS caching. Ambry has been running in LinkedIn's production environment for the past 2 years, serving up to 10K requests per second across more than 400 million users. Our experimental evaluation reveals that Ambry offers high efficiency (utilizing up to 88% of the network bandwidth), low latency (less than 50 ms latency for a 1 MB object), and load balancing (improving imbalance of request rate among disks by 8x-10x).

Original language: English (US)
Title of host publication: SIGMOD 2016 - Proceedings of the 2016 International Conference on Management of Data
Publisher: Association for Computing Machinery
Pages: 253-265
Number of pages: 13
ISBN (Electronic): 9781450335317
DOIs: https://doi.org/10.1145/2882903.2903738
State: Published - Jun 26 2016
Event: 2016 ACM SIGMOD International Conference on Management of Data, SIGMOD 2016 - San Francisco, United States
Duration: Jun 26 2016 → Jul 1 2016

Publication series

Name: Proceedings of the ACM SIGMOD International Conference on Management of Data
Volume: 26-June-2016
ISSN (Print): 0730-8078

Other

Other: 2016 ACM SIGMOD International Conference on Management of Data, SIGMOD 2016
Country: United States
City: San Francisco
Period: 6/26/16 → 7/1/16

Fingerprint

Resource allocation
Bandwidth
Costs

Keywords

  • Geographically distributed
  • Load balancing
  • Object store
  • Scalable

ASJC Scopus subject areas

  • Software
  • Information Systems

Cite this

Noghabi, S. A., Subramanian, S., Narayanan, P., Narayanan, S., Holla, G., Zadeh, M., ... Campbell, R. H. (2016). Ambry: LinkedIn's scalable geo-distributed object store. In SIGMOD 2016 - Proceedings of the 2016 International Conference on Management of Data (pp. 253-265). (Proceedings of the ACM SIGMOD International Conference on Management of Data; Vol. 26-June-2016). Association for Computing Machinery. https://doi.org/10.1145/2882903.2903738

Ambry: LinkedIn's scalable geo-distributed object store. / Noghabi, Shadi A.; Subramanian, Sriram; Narayanan, Priyesh; Narayanan, Sivabalan; Holla, Gopalakrishna; Zadeh, Mammad; Li, Tianwei; Gupta, Indranil; Campbell, Roy H.

SIGMOD 2016 - Proceedings of the 2016 International Conference on Management of Data. Association for Computing Machinery, 2016. p. 253-265 (Proceedings of the ACM SIGMOD International Conference on Management of Data; Vol. 26-June-2016).

Noghabi, SA, Subramanian, S, Narayanan, P, Narayanan, S, Holla, G, Zadeh, M, Li, T, Gupta, I & Campbell, RH 2016, Ambry: LinkedIn's scalable geo-distributed object store. in SIGMOD 2016 - Proceedings of the 2016 International Conference on Management of Data. Proceedings of the ACM SIGMOD International Conference on Management of Data, vol. 26-June-2016, Association for Computing Machinery, pp. 253-265, 2016 ACM SIGMOD International Conference on Management of Data, SIGMOD 2016, San Francisco, United States, 6/26/16. https://doi.org/10.1145/2882903.2903738

Noghabi SA, Subramanian S, Narayanan P, Narayanan S, Holla G, Zadeh M et al. Ambry: LinkedIn's scalable geo-distributed object store. In SIGMOD 2016 - Proceedings of the 2016 International Conference on Management of Data. Association for Computing Machinery. 2016. p. 253-265. (Proceedings of the ACM SIGMOD International Conference on Management of Data). https://doi.org/10.1145/2882903.2903738

Noghabi, Shadi A.; Subramanian, Sriram; Narayanan, Priyesh; Narayanan, Sivabalan; Holla, Gopalakrishna; Zadeh, Mammad; Li, Tianwei; Gupta, Indranil; Campbell, Roy H. / Ambry: LinkedIn's scalable geo-distributed object store. SIGMOD 2016 - Proceedings of the 2016 International Conference on Management of Data. Association for Computing Machinery, 2016. pp. 253-265 (Proceedings of the ACM SIGMOD International Conference on Management of Data).

@inproceedings{b0fd2738a48e4dc9a78ee5693943bf32,
title = "Ambry: LinkedIn's scalable geo-distributed object store",
abstract = "The infrastructure beneath a worldwide social network has to continually serve billions of variable-sized media objects such as photos, videos, and audio clips. These objects must be stored and served with low latency and high throughput by a system that is geo-distributed, highly scalable, and load-balanced. Existing file systems and object stores face several challenges when serving such large objects. We present Ambry, a production-quality system for storing large immutable data (called blobs). Ambry is designed in a decentralized way and leverages techniques such as logical blob grouping, asynchronous replication, rebalancing mechanisms, zero-cost failure detection, and OS caching. Ambry has been running in LinkedIn's production environment for the past 2 years, serving up to 10K requests per second across more than 400 million users. Our experimental evaluation reveals that Ambry offers high efficiency (utilizing up to 88{\%} of the network bandwidth), low latency (less than 50 ms latency for a 1 MB object), and load balancing (improving imbalance of request rate among disks by 8x-10x).",
keywords = "Geographically distributed, Load balancing, Object store, Scalable",
author = "Noghabi, {Shadi A.} and Sriram Subramanian and Priyesh Narayanan and Sivabalan Narayanan and Gopalakrishna Holla and Mammad Zadeh and Tianwei Li and Indranil Gupta and Campbell, {Roy H.}",
year = "2016",
month = "6",
day = "26",
doi = "10.1145/2882903.2903738",
language = "English (US)",
series = "Proceedings of the ACM SIGMOD International Conference on Management of Data",
publisher = "Association for Computing Machinery",
pages = "253--265",
booktitle = "SIGMOD 2016 - Proceedings of the 2016 International Conference on Management of Data",
}

TY - GEN

T1 - Ambry

T2 - LinkedIn's scalable geo-distributed object store

AU - Noghabi, Shadi A.

AU - Subramanian, Sriram

AU - Narayanan, Priyesh

AU - Narayanan, Sivabalan

AU - Holla, Gopalakrishna

AU - Zadeh, Mammad

AU - Li, Tianwei

AU - Gupta, Indranil

AU - Campbell, Roy H.

PY - 2016/6/26

Y1 - 2016/6/26

N2 - The infrastructure beneath a worldwide social network has to continually serve billions of variable-sized media objects such as photos, videos, and audio clips. These objects must be stored and served with low latency and high throughput by a system that is geo-distributed, highly scalable, and load-balanced. Existing file systems and object stores face several challenges when serving such large objects. We present Ambry, a production-quality system for storing large immutable data (called blobs). Ambry is designed in a decentralized way and leverages techniques such as logical blob grouping, asynchronous replication, rebalancing mechanisms, zero-cost failure detection, and OS caching. Ambry has been running in LinkedIn's production environment for the past 2 years, serving up to 10K requests per second across more than 400 million users. Our experimental evaluation reveals that Ambry offers high efficiency (utilizing up to 88% of the network bandwidth), low latency (less than 50 ms latency for a 1 MB object), and load balancing (improving imbalance of request rate among disks by 8x-10x).

AB - The infrastructure beneath a worldwide social network has to continually serve billions of variable-sized media objects such as photos, videos, and audio clips. These objects must be stored and served with low latency and high throughput by a system that is geo-distributed, highly scalable, and load-balanced. Existing file systems and object stores face several challenges when serving such large objects. We present Ambry, a production-quality system for storing large immutable data (called blobs). Ambry is designed in a decentralized way and leverages techniques such as logical blob grouping, asynchronous replication, rebalancing mechanisms, zero-cost failure detection, and OS caching. Ambry has been running in LinkedIn's production environment for the past 2 years, serving up to 10K requests per second across more than 400 million users. Our experimental evaluation reveals that Ambry offers high efficiency (utilizing up to 88% of the network bandwidth), low latency (less than 50 ms latency for a 1 MB object), and load balancing (improving imbalance of request rate among disks by 8x-10x).

KW - Geographically distributed

KW - Load balancing

KW - Object store

KW - Scalable

UR - http://www.scopus.com/inward/record.url?scp=84979681342&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84979681342&partnerID=8YFLogxK

U2 - 10.1145/2882903.2903738

DO - 10.1145/2882903.2903738

M3 - Conference contribution

AN - SCOPUS:84979681342

T3 - Proceedings of the ACM SIGMOD International Conference on Management of Data

SP - 253

EP - 265

BT - SIGMOD 2016 - Proceedings of the 2016 International Conference on Management of Data

PB - Association for Computing Machinery

ER -