TY - GEN
T1 - Disability-First Design and Creation of A Dataset Showing Private Visual Information Collected With People Who Are Blind
AU - Sharma, Tanusree
AU - Stangl, Abigale
AU - Zhang, Lotus
AU - Tseng, Yu Yun
AU - Xu, Inan
AU - Findlater, Leah
AU - Gurari, Danna
AU - Wang, Yang
N1 - Publisher Copyright: © 2023 ACM.
PY - 2023/4/19
Y1 - 2023/4/19
N2 - We present the design and creation of a disability-first dataset, "BIV-Priv," which contains 728 images and 728 videos of 14 private categories captured by 26 blind participants to support downstream development of artificial intelligence (AI) models. While best practices in dataset creation typically attempt to eliminate private content, some applications require such content for model development. We describe our approach in creating this dataset with private content in an ethical way, including using props rather than participants' own private objects and balancing multi-disciplinary perspectives (e.g., accessibility, privacy, computer vision) to meet the tangible metrics (e.g., diversity, category, amount of content) to support AI innovations. We observed challenges that our participants encountered during the data collection, including accessibility issues (e.g., understanding foreground vs. background object placement) and issues due to the sensitive nature of the content (e.g., discomfort in capturing some props such as condoms around family members).
AB - We present the design and creation of a disability-first dataset, "BIV-Priv," which contains 728 images and 728 videos of 14 private categories captured by 26 blind participants to support downstream development of artificial intelligence (AI) models. While best practices in dataset creation typically attempt to eliminate private content, some applications require such content for model development. We describe our approach in creating this dataset with private content in an ethical way, including using props rather than participants' own private objects and balancing multi-disciplinary perspectives (e.g., accessibility, privacy, computer vision) to meet the tangible metrics (e.g., diversity, category, amount of content) to support AI innovations. We observed challenges that our participants encountered during the data collection, including accessibility issues (e.g., understanding foreground vs. background object placement) and issues due to the sensitive nature of the content (e.g., discomfort in capturing some props such as condoms around family members).
KW - accessibility
KW - blind
KW - computer vision
KW - dataset
KW - image description
KW - personal visual data
KW - privacy
KW - private visual content
KW - visual assistance
KW - visual impairments
KW - visual interpretation
UR - http://www.scopus.com/inward/record.url?scp=85160012668&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85160012668&partnerID=8YFLogxK
U2 - 10.1145/3544548.3580922
DO - 10.1145/3544548.3580922
M3 - Conference contribution
AN - SCOPUS:85160012668
T3 - Conference on Human Factors in Computing Systems - Proceedings
BT - CHI 2023 - Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems
PB - Association for Computing Machinery
T2 - 2023 CHI Conference on Human Factors in Computing Systems, CHI 2023
Y2 - 23 April 2023 through 28 April 2023
ER -