TY - GEN
T1 - Beyond User Experience
T2 - 2024 ACM International Joint Conference on Pervasive and Ubiquitous Computing, UbiComp Companion 2024
AU - Tang, Yiliu
AU - Situ, Jason
AU - Huang, Yun
N1 - Publisher Copyright:
© 2024 Copyright held by the owner/author(s).
PY - 2024/10/5
Y1 - 2024/10/5
N2 - Spatial Computing involves interacting with the physical world through spatial data manipulation, closely linked with Extended Reality (XR), which includes Virtual Reality (VR), Augmented Reality (AR), and Mixed Reality (MR). Large Language Models (LLMs) significantly enhance XR applications by improving user interactions through natural language understanding and content generation. Typical evaluations of these applications focus on user experience (UX) metrics, such as task performance, user satisfaction, and psychological assessments, but often neglect the technical performance of the LLMs themselves. This paper identifies significant gaps in current evaluation practices for LLMs within XR environments, attributing them to the novelty of the field, the complexity of spatial contexts, and the multimodal nature of interactions in XR. To address these gaps, the paper proposes specific metrics tailored to evaluate LLM performance in XR contexts, including spatial contextual awareness, coherence, proactivity, multimodal integration, hallucination, and question-answering accuracy. These proposed metrics aim to complement existing UX evaluations, providing a comprehensive assessment framework that captures both the technical and user-centric aspects of LLM performance in XR applications. The conclusion underscores the necessity for a dual-focused approach that combines technical and UX metrics to ensure effective and user-friendly LLM-integrated XR systems.
KW - Augmented Reality
KW - Evaluation Metrics
KW - Extended Reality
KW - Large Language Models
KW - Mixed Reality
KW - Spatial Computing
KW - User Experience
KW - Virtual Reality
UR - http://www.scopus.com/inward/record.url?scp=85206203437&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85206203437&partnerID=8YFLogxK
U2 - 10.1145/3675094.3678995
DO - 10.1145/3675094.3678995
M3 - Conference contribution
AN - SCOPUS:85206203437
T3 - UbiComp Companion 2024 - Companion of the 2024 ACM International Joint Conference on Pervasive and Ubiquitous Computing
SP - 640
EP - 643
BT - UbiComp Companion 2024 - Companion of the 2024 ACM International Joint Conference on Pervasive and Ubiquitous Computing
PB - Association for Computing Machinery
Y2 - 5 October 2024 through 9 October 2024
ER -