AUTOMATED SUB-FEATURE LABELING USING PROMPT-BASED PRETRAINED LANGUAGE MODEL

Seyoung Park, Yilan Jiang, Harrison Kim

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Many studies have used online user-generated data to draw product design implications through supervised and unsupervised approaches. Although supervised learning methods typically perform better, they require extensive data labeling, which consumes significant time and effort. This study proposes a framework that automatically labels online user data to address this limitation. The proposed framework consists of two pseudo-labeling mechanisms: keyword detection and prompting a Pretrained Language Model (PLM). The first stage defines keywords for the target topic and then labels the data by checking whether each item contains these keywords. The second stage employs the PLM and labels the data based on context. Specifically, prompting the PLM appends a task-specific template to the end of the given text (a review) and predicts the masked token (the label). This PLM-based approach is a promising labeling candidate because it can make predictions without additional training data from the target domain. The suggested method was tested in a case study with real-world datasets. The study validates the effectiveness of this novel framework by comparing the pseudo-labeled results on smartphone sub-features to manual ground truths. The results demonstrate that the new framework achieves F1 scores 28% and 14% higher than a baseline for screen and battery, respectively.
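
The two pseudo-labeling mechanisms described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the keyword lists, the template wording, the `[MASK]` token, and the `predict_mask` callable (a stand-in for a real masked language model that returns a score per candidate token) are all assumptions made for illustration, and the abstract does not specify how the two stages are combined.

```python
import re
from typing import Callable, Dict, Iterable, List

# Hypothetical keyword lists; the keywords used in the study are not given here.
KEYWORDS: Dict[str, set] = {
    "screen": {"screen", "display"},
    "battery": {"battery", "charge", "charging"},
}


def keyword_label(review: str, keywords: Iterable[str]) -> bool:
    """Stage 1 (sketch): label a review positive if it contains any keyword."""
    tokens = set(re.findall(r"[a-z]+", review.lower()))
    return any(k in tokens for k in keywords)


def build_prompt(review: str, mask_token: str = "[MASK]") -> str:
    """Stage 2 input (sketch): append a template with a masked slot to the review.

    The template wording is an assumption, not the one used in the paper.
    """
    return f"{review} This review is about the {mask_token}."


def plm_label(
    review: str,
    candidates: List[str],
    predict_mask: Callable[[str], Dict[str, float]],
) -> str:
    """Return the candidate label scored highest for the masked slot.

    predict_mask is a placeholder for a real masked language model; it is
    assumed to map a prompt to {token: score}.
    """
    scores = predict_mask(build_prompt(review))
    return max(candidates, key=lambda c: scores.get(c, 0.0))
```

In practice `predict_mask` would wrap a pretrained masked language model. Passing it in as a callable keeps the sketch free of any specific library and makes the labeling logic testable with a stub.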

Original language: English (US)
Title of host publication: 50th Design Automation Conference (DAC)
Publisher: American Society of Mechanical Engineers (ASME)
ISBN (Electronic): 9780791888360
State: Published - 2024
Externally published: Yes
Event: ASME 2024 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, IDETC-CIE 2024 - Washington, United States
Duration: Aug 25 2024 to Aug 28 2024

Publication series

Name: Proceedings of the ASME Design Engineering Technical Conference
Volume: 3A-2024

Conference

Conference: ASME 2024 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, IDETC-CIE 2024
Country/Territory: United States
City: Washington
Period: 8/25/24 to 8/28/24

ASJC Scopus subject areas

  • Mechanical Engineering
  • Computer Graphics and Computer-Aided Design
  • Computer Science Applications
  • Modeling and Simulation
