An LLM-Based Framework for Simulating, Classifying, and Correcting Students' Programming Knowledge with the SOLO Taxonomy

Shan Zhang, Pragati Shuddhodhan Meshram, Priyadharshini Ganapathy Prasad, Maya Israel, Suma Bhat

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Novice programmers often face challenges in designing computational artifacts and fixing code errors, which can lead to task abandonment and over-reliance on external support. While research has explored effective meta-cognitive strategies to scaffold novice programmers' learning, it is essential to first understand and assess students' conceptual, procedural, and strategic/conditional programming knowledge at scale. To address this issue, we propose a three-model framework that leverages Large Language Models (LLMs) to simulate, classify, and correct student responses to programming questions based on the SOLO Taxonomy. The SOLO Taxonomy provides a structured approach for categorizing student understanding into four levels: Pre-structural, Uni-structural, Multi-structural, and Relational. Our results showed that GPT-4o achieved high accuracy in generating and classifying responses for the Relational category, with moderate accuracy in the Uni-structural and Pre-structural categories, but struggled with the Multi-structural category. The model successfully corrected responses to the Relational level. Although further refinement is needed, these findings suggest that LLMs hold significant potential for supporting computer science education by assessing programming knowledge and guiding students toward deeper cognitive engagement.
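The simulate, classify, and correct stages described in the abstract can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the authors' implementation: the prompt wording, the function names (`simulate`, `classify`, `correct`, `parse_level`), and the `LLM` callable type are all hypothetical, and the model is passed in as a plain function so that no particular vendor API is assumed.

```python
from enum import IntEnum
from typing import Callable


class SoloLevel(IntEnum):
    """The four SOLO levels named in the abstract, ordered by depth."""
    PRE_STRUCTURAL = 1
    UNI_STRUCTURAL = 2
    MULTI_STRUCTURAL = 3
    RELATIONAL = 4


LABELS = {
    SoloLevel.PRE_STRUCTURAL: "Pre-structural",
    SoloLevel.UNI_STRUCTURAL: "Uni-structural",
    SoloLevel.MULTI_STRUCTURAL: "Multi-structural",
    SoloLevel.RELATIONAL: "Relational",
}

# Hypothetical stand-in for a model call: takes a prompt, returns text.
LLM = Callable[[str], str]


def simulate(llm: LLM, question: str, level: SoloLevel) -> str:
    """Stage 1 (assumed prompt): generate a student-like answer at a level."""
    prompt = (
        f"Simulate a student answer at the {LABELS[level]} SOLO level.\n"
        f"Question: {question}"
    )
    return llm(prompt)


def parse_level(text: str) -> SoloLevel:
    """Map free-text model output onto one of the four level labels."""
    normalized = text.strip().lower()
    for level, label in LABELS.items():
        if label.lower() in normalized:
            return level
    raise ValueError(f"no SOLO level found in model output: {text!r}")


def classify(llm: LLM, question: str, response: str) -> SoloLevel:
    """Stage 2 (assumed prompt): label a response with a SOLO level."""
    prompt = (
        "Classify the student response as one of: "
        + ", ".join(LABELS.values())
        + f".\nQuestion: {question}\nResponse: {response}"
    )
    return parse_level(llm(prompt))


def correct(llm: LLM, question: str, response: str, level: SoloLevel) -> str:
    """Stage 3 (assumed prompt): rewrite a response toward the Relational level."""
    if level is SoloLevel.RELATIONAL:
        return response  # already at the target level, nothing to correct
    prompt = (
        f"The response below is at the {LABELS[level]} SOLO level. "
        "Rewrite it so that it reaches the Relational level.\n"
        f"Question: {question}\nResponse: {response}"
    )
    return llm(prompt)
```

Passing the model in as a callable keeps the three stages independently testable with a stub, which is a design choice of this sketch rather than something the abstract specifies.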

Original language: English (US)
Title of host publication: SIGCSE TS 2025 - Proceedings of the 56th ACM Technical Symposium on Computer Science Education
Publisher: Association for Computing Machinery
Pages: 1681-1682
Number of pages: 2
ISBN (Electronic): 9798400705328
State: Published - Feb 18 2025
Event: 56th Annual SIGCSE Technical Symposium on Computer Science Education, SIGCSE TS 2025 - Pittsburgh, United States
Duration: Feb 26 2025 to Mar 1 2025

Publication series

Name: SIGCSE TS 2025 - Proceedings of the 56th ACM Technical Symposium on Computer Science Education
Volume: 2

Conference

Conference: 56th Annual SIGCSE Technical Symposium on Computer Science Education, SIGCSE TS 2025
Country/Territory: United States
City: Pittsburgh
Period: 2/26/25 to 3/1/25

Keywords

  • Computer Science Education
  • Large Language Model
  • SOLO Taxonomy

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Education
