TY - GEN
T1 - Hiding sensitive itemsets in shared transactional databases
T2 - 40th International Conference on Information Systems, ICIS 2019
AU - Menon, Syam
AU - Ghoshal, Abhijeet
AU - Sarkar, Sumit
PY - 2020/1/1
Y1 - 2020/1/1
N2 - Firms have been sharing transactional data with business partners for decades. The benefits of sharing notwithstanding, a significant number of firms are still reluctant to share, for fear of sensitive information getting into the wrong hands. Consequently, effective ways to hide sensitive information before sharing data have become an important consideration. As most versions of the underlying problem are NP-hard, and as transactional databases involve millions of transactions, research on this topic has also been ongoing for many years. In this paper, we explore hiding sensitive itemsets while minimizing the number of items removed from the database to achieve it. We present an integer programming formulation, and show that while the general problem is NP-hard, there is underlying structure in the problem that allows it to be decomposed and solved more efficiently. We illustrate various aspects of the problem using examples, and present a solution procedure based on propositions developed in the paper. This is research in progress, and we intend to implement and illustrate the effectiveness of the proposed procedure on large, real and synthetic databases involving millions of transactions.
AB - Firms have been sharing transactional data with business partners for decades. The benefits of sharing notwithstanding, a significant number of firms are still reluctant to share, for fear of sensitive information getting into the wrong hands. Consequently, effective ways to hide sensitive information before sharing data have become an important consideration. As most versions of the underlying problem are NP-hard, and as transactional databases involve millions of transactions, research on this topic has also been ongoing for many years. In this paper, we explore hiding sensitive itemsets while minimizing the number of items removed from the database to achieve it. We present an integer programming formulation, and show that while the general problem is NP-hard, there is underlying structure in the problem that allows it to be decomposed and solved more efficiently. We illustrate various aspects of the problem using examples, and present a solution procedure based on propositions developed in the paper. This is research in progress, and we intend to implement and illustrate the effectiveness of the proposed procedure on large, real and synthetic databases involving millions of transactions.
KW - Accuracy
KW - Integer programming
KW - Privacy
UR - http://www.scopus.com/inward/record.url?scp=85082299667&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85082299667&partnerID=8YFLogxK
M3 - Conference contribution
T3 - 40th International Conference on Information Systems, ICIS 2019
BT - 40th International Conference on Information Systems, ICIS 2019
PB - Association for Information Systems
Y2 - 15 December 2019 through 18 December 2019
ER -