Enrollment in introductory programming (CS1) courses continues to surge and hundreds of CS1 students can produce thousands of submissions for a single problem, all requiring timely and accurate grading. One way that instructors can efficiently grade is to construct a custom instructor test suite that compares a student submission to a reference solution over randomly generated or hand-crafted inputs. However, such test suite is often insufficient, causing incorrect submissions to be marked as correct. To address this issue, we propose the Grasa (GRAding-based test Suite Augmentation) approach consisting of two techniques. Grasa first detects and clusters incorrect submissions by approximating their behavioral equivalence to each other. To augment the existing instructor test suite, Grasa generates a minimal set of additional tests that help detect the incorrect submissions. We evaluate our Grasa approach on a dataset of CS1 student submissions for three programming problems. Our preliminary results show that Grasa can effectively identify incorrect student submissions and minimally augment the instructor test suite.