Analyzing Student SQL Solutions via Hierarchical Clustering and Sequence Alignment Scores

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Structured Query Language (SQL), the de facto standard language for managing relational database systems, is a vital skill for the wide array of users, developers, and researchers who interact with databases. Because people acquire SQL in many different ways, and because semantically equivalent SQL queries can be written in many forms, understanding how students learn SQL as they work on homework assignment questions is both a challenge and an opportunity. In this paper, we analyze students' SQL submissions to the homework assignment problems of the Database Systems course offered to upper-level undergraduate and graduate students at the University of Illinois at Urbana-Champaign. For each student, we compute the sequence alignment score between every submission and that student's final submission to understand how the student reached the final solution and whether there were any obstacles in the learning process. We also use hierarchical clustering to create a class-wide aggregate view and determine the number of different approaches used by students in the course. The resulting dendrogram visualization is computed from students' final attempts at a homework problem. Our system gives instructors more visibility into interesting learning patterns and approaches. These findings aim to help instructors target their future instruction at difficult SQL areas so that students may learn SQL more effectively.
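The two analyses described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not state the alignment algorithm, scoring scheme, tokenizer, or linkage method, so the sketch assumes a Needleman-Wunsch-style global alignment over whitespace tokens (match +1, mismatch -1, gap -1), a normalized distance derived from that score, and average-linkage merging cut at a fixed threshold. All function names and sample queries are hypothetical.

```python
from itertools import combinations

def tokenize(sql):
    """Lowercase and split on whitespace (assumed tokenizer; real SQL needs more care)."""
    return sql.lower().replace(",", " , ").split()

def alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global (Needleman-Wunsch) alignment score between two token lists."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]

def progress_toward_final(submissions):
    """Score every submission of one student against that student's final submission."""
    final = tokenize(submissions[-1])
    return [alignment_score(tokenize(s), final) for s in submissions]

def distance(sql_a, sql_b):
    """Assumed distance: 0 for identical token lists, larger for worse alignments."""
    a, b = tokenize(sql_a), tokenize(sql_b)
    return 1 - alignment_score(a, b) / max(len(a), len(b), 1)

def agglomerate(items, threshold):
    """Average-linkage agglomerative clustering, stopped once the closest pair exceeds threshold."""
    clusters = [[i] for i in range(len(items))]

    def linkage(c1, c2):
        total = sum(distance(items[i], items[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        x, y = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        if linkage(clusters[x], clusters[y]) > threshold:
            break
        clusters[x] = clusters[x] + clusters[y]  # merge the closest pair
        del clusters[y]                          # y > x, so index x stays valid
    return clusters

# Hypothetical submission history for one student (last entry is the final attempt).
attempts = [
    "SELECT name FROM students",
    "SELECT name FROM students WHERE gpa > 3",
    "SELECT name FROM students WHERE gpa > 3.5",
]
print(progress_toward_final(attempts))

# Hypothetical final attempts from four students.
finals = [
    "SELECT name FROM students WHERE gpa > 3.5",
    "SELECT name FROM students WHERE gpa >= 3.5",
    "SELECT name FROM students s JOIN honors h ON s.id = h.id",
    "SELECT name FROM students s JOIN honors h ON h.id = s.id",
]
print(agglomerate(finals, threshold=0.5))
```

A rising score sequence suggests steady progress toward the final query, while a dip or plateau may point to an obstacle. The threshold cut here stands in for reading clusters off a dendrogram; a library such as SciPy (`scipy.cluster.hierarchy.linkage` and `dendrogram`) could build and draw the full tree from the same pairwise distances.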

Original language: English (US)
Title of host publication: Proceedings of the 1st ACM SIGMOD International Workshop on Data Systems Education
Subtitle of host publication: Bridging Education Practice with Education Research, DataEd 2022
Editors: Efthimia Aivaloglou, George Fletcher, Daphne Miedema
Publisher: Association for Computing Machinery
Pages: 10-15
Number of pages: 6
ISBN (Electronic): 9781450393508
DOIs
State: Published - Jun 12 2022
Externally published: Yes
Event: 1st ACM SIGMOD International Workshop on Data Systems Education: Bridging Education Practice with Education Research, DataEd 2022, co-located with the ACM SIGMOD Conference - Virtual, Online, United States
Duration: Jun 17 2022 → …

Publication series

Name: Proceedings of the 1st ACM SIGMOD International Workshop on Data Systems Education: Bridging Education Practice with Education Research, DataEd 2022

Conference

Conference: 1st ACM SIGMOD International Workshop on Data Systems Education: Bridging Education Practice with Education Research, DataEd 2022, co-located with the ACM SIGMOD Conference
Country/Territory: United States
City: Virtual, Online
Period: 6/17/22 → …

Keywords

  • SQL
  • database education
  • hierarchical clustering
  • online assessment
  • sequence alignment

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
