TY - GEN
T1 - Analyzing Student SQL Solutions via Hierarchical Clustering and Sequence Alignment Scores
AU - Yang, Sophia
AU - Herman, Geoffrey Lindsay
AU - Alawini, Abdussalam
N1 - Funding Information:
This work is funded by the National Science Foundation (NSF) award number 2021499.
Publisher Copyright:
© 2022 Owner/Author.
PY - 2022/6/12
Y1 - 2022/6/12
N2 - Structured Query Language (SQL), the de facto standard language for managing relational database systems, is a vital skill for the wide array of users, developers, and researchers who interact with databases. Because people acquire SQL skills in many different ways, and because semantically equivalent SQL queries can be written in many ways, understanding how students learn SQL as they work on homework questions is both a challenge and an opportunity. In this paper, we analyze students' SQL submissions to homework problems in the Database Systems course offered to upper-level undergraduate and graduate students at the University of Illinois at Urbana-Champaign. For each student, we compute the sequence alignment score between each submission and that student's final submission to understand how the student reached the final solution and whether there were obstacles in the learning process. We also use hierarchical clustering to create a class-wide aggregate view and determine the number of different approaches used by students in the course. The resulting dendrogram visualization is computed from students' final attempts at a homework problem. Our system gives instructors more visibility into student work, helping them identify interesting learning patterns and approaches. These findings aim to help instructors target their instruction at difficult SQL areas in the future so that students may learn SQL more effectively.
KW - database education
KW - hierarchical clustering
KW - online assessment
KW - sequence alignment
KW - SQL
UR - http://www.scopus.com/inward/record.url?scp=85133170909&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85133170909&partnerID=8YFLogxK
U2 - 10.1145/3531072.3535319
DO - 10.1145/3531072.3535319
M3 - Conference contribution
AN - SCOPUS:85133170909
T3 - Proceedings of the 1st ACM SIGMOD International Workshop on Data Systems Education: Bridging Education Practice with Education Research, DataEd 2022
SP - 10
EP - 15
BT - Proceedings of the 1st ACM SIGMOD International Workshop on Data Systems Education
A2 - Aivaloglou, Efthimia
A2 - Fletcher, George
A2 - Miedema, Daphne
PB - Association for Computing Machinery, Inc
T2 - 1st ACM SIGMOD International Workshop on Data Systems Education: Bridging Education Practice with Education Research, DataEd 2022, co-located with the ACM SIGMOD Conference
Y2 - 17 June 2022
ER -