Private DNA Sequencing: Hiding Information in Discrete Noise

Kayvon Mazooji, Roy Dong, Ilan Shomorony

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

When an individual’s DNA is sequenced, sensitive medical information becomes available to the sequencing laboratory. A recently proposed way to hide an individual’s genetic information is to mix in DNA samples of other individuals. We assume these samples are known to the individual but unknown to the sequencing laboratory. Thus, these DNA samples act as “noise” to the sequencing laboratory, but still allow the individual to recover their own DNA samples afterward. Motivated by this idea, we study the problem of hiding a binary random variable X (a genetic marker) with the additive noise provided by mixing DNA samples, using mutual information as a privacy metric. This is equivalent to the problem of finding a worst-case noise distribution for recovering X from the noisy observation among a set of feasible discrete distributions. We characterize upper and lower bounds to the solution of this problem, which are empirically shown to be very close. The lower bound is obtained through a convex relaxation of the original discrete optimization problem, and yields a closed-form expression. The upper bound is computed via a greedy algorithm for selecting the mixing proportions.
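As a rough illustration of the privacy metric described above, the following sketch computes the mutual information I(X; Y) between a binary marker X and a noisy observation Y = X + Z. The function name `mutual_information`, the Bernoulli prior `p_x1`, and the representation of the noise as a probability list over {0, 1, ..., K} are assumptions made for this example. It does not reproduce the paper's feasible set of noise distributions, its convex relaxation, or its greedy algorithm.

```python
import math


def mutual_information(p_x1, noise_pmf):
    """Return I(X; Y) in bits for Y = X + Z.

    Assumptions (illustrative, not taken from the paper):
      - X ~ Bernoulli(p_x1)
      - Z is independent of X, with noise_pmf[k] = P(Z = k) for k = 0..K
    """
    n = len(noise_pmf)

    def p_z(k):
        # P(Z = k), zero outside the support {0, ..., K}
        return noise_pmf[k] if 0 <= k < n else 0.0

    p_x = [1.0 - p_x1, p_x1]
    mi = 0.0
    # Y = X + Z takes values in {0, ..., K + 1}
    for y in range(n + 1):
        # Marginal P(Y = y) = sum_x P(X = x) * P(Z = y - x)
        p_y = sum(p_x[x] * p_z(y - x) for x in (0, 1))
        for x in (0, 1):
            joint = p_x[x] * p_z(y - x)
            if joint > 0.0:
                # joint * log2( P(Y = y | X = x) / P(Y = y) )
                mi += joint * math.log2(p_z(y - x) / p_y)
    return mi


if __name__ == "__main__":
    # Compare the leakage under no noise and under two uniform noise choices.
    for pmf in ([1.0], [0.5, 0.5], [0.25, 0.25, 0.25, 0.25]):
        print(pmf, mutual_information(0.5, pmf))
```

With a uniform prior on X, the no-noise case leaks the full 1 bit. Spreading the noise uniformly over two values leaves about 0.5 bits, and over four values about 0.25 bits. This matches the intuition that a noise distribution is "worse" for the laboratory when it lowers I(X; Y).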

Original language: English (US)
Title of host publication: 2020 IEEE Information Theory Workshop, ITW 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728159621
State: Published - Apr 11 2021
Event: 2020 IEEE Information Theory Workshop, ITW 2020 - Virtual, Riva del Garda, Italy
Duration: Apr 11 2021 → Apr 15 2021

Publication series

Name: 2020 IEEE Information Theory Workshop, ITW 2020

Conference

Conference: 2020 IEEE Information Theory Workshop, ITW 2020
Country/Territory: Italy
City: Virtual, Riva del Garda
Period: 4/11/21 → 4/15/21

Keywords

  • Additive discrete noise
  • DNA sequencing
  • Genetic privacy
  • Worst-case noise distribution

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Information Systems
  • Signal Processing
  • Software
  • Theoretical Computer Science
