TY - GEN
T1 - tailwiz
T2 - 8th Workshop on Data Management for End-to-End Machine Learning, DEEM 2024
AU - Dai, Timothy
AU - Peters, Austin
AU - Gelbach, Jonah B.
AU - Engstrom, David Freeman
AU - Kang, Daniel
N1 - Publisher Copyright: © 2024 Owner/Author.
PY - 2024/6/9
Y1 - 2024/6/9
N2 - Experts outside the field of machine learning (ML) are interested in using ML techniques to analyze their textual data, but they are inhibited by a lack of convenient natural language processing (NLP) tools. To address this issue, we present tailwiz, an easy-to-use Python tool, powered by supervised fine-tuning of NLP models. tailwiz caters to domain experts by abstracting away technical ML knowledge and running conveniently on personal computers, the preferred mode of computation among domain experts. We show that tailwiz outperforms domain experts' current textual analysis techniques on a majority of real-world tasks, up to a 384.8% F1 increase (46.18% absolute increase). tailwiz consistently outperforms GPT-3.5-Turbo on such tasks, showing the need for fine-tuned NLP models to perform domain-specific tasks that meet the analytical demands of domain experts.
UR - http://www.scopus.com/inward/record.url?scp=85196641173&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85196641173&partnerID=8YFLogxK
U2 - 10.1145/3650203.3663328
DO - 10.1145/3650203.3663328
M3 - Conference contribution
AN - SCOPUS:85196641173
T3 - Proceedings of the 8th Workshop on Data Management for End-to-End Machine Learning, DEEM 2024 - In conjunction with the 2024 ACM SIGMOD/PODS Conference
SP - 12
EP - 22
BT - Proceedings of the 8th Workshop on Data Management for End-to-End Machine Learning, DEEM 2024 - In conjunction with the 2024 ACM SIGMOD/PODS Conference
PB - Association for Computing Machinery
Y2 - 9 June 2024 through 9 June 2024
ER -