An improved grade point average, with applications to CS undergraduate education analytics

Research output: Contribution to journal › Article › peer-review


We present a methodological improvement for calculating Grade Point Averages (GPAs). Heterogeneity in grading between courses systematically biases the observed GPAs of individual students: a student's observed GPA depends on which courses they take. We show how a logistic model can account for course selection by simulating how every student in a sample would perform if they took all available courses, yielding a new "modeled GPA." Using 10 years of grade data from a large university, we then demonstrate that this modeled GPA predicts student performance in individual courses more accurately than the observed GPA. Taking Computer Science (CS) as an example learning analytics application, we find that required CS courses award significantly lower grades than the average course. This depresses the recorded GPAs of CS majors: their modeled GPAs are 0.25 points higher than their observed GPAs. The modeled GPA also correlates far more closely with standardized test scores than the observed GPA does: the correlation with Math ACT scores is 0.37 for the modeled GPA versus 0.20 for the observed GPA. This implies that standardized test scores are much better predictors of student performance than might otherwise be assumed.
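As a rough illustration of the modeled-GPA idea, the sketch below fits an additive logit-link model with one ability parameter per student and one difficulty parameter per course, then "simulates" every student taking every course and averages the predicted grades. This is one plausible reading of the abstract, not the paper's exact estimator: the function name `modeled_gpa`, the least-squares fit on logit-transformed grades, and the grade rescaling are all illustrative assumptions.

```python
# Sketch of a "modeled GPA": fit ability + difficulty effects on the
# logit scale, predict every student's grade in every course, average.
# Illustrative only; the paper's actual model may differ in detail.
import numpy as np

def modeled_gpa(students, courses, grades, max_grade=4.0, eps=0.01):
    """students, courses: parallel label arrays; grades: observed grades in [0, max_grade]."""
    s_ids, s_idx = np.unique(students, return_inverse=True)
    c_ids, c_idx = np.unique(courses, return_inverse=True)
    n_s, n_c, n_obs = len(s_ids), len(c_ids), len(grades)

    # Squash grades into (0, 1) so the logit transform is defined.
    p = np.clip(np.asarray(grades, dtype=float) / max_grade, eps, 1 - eps)
    y = np.log(p / (1 - p))  # logit of the rescaled grade

    # Design matrix: one indicator column per student, one per course.
    X = np.zeros((n_obs, n_s + n_c))
    X[np.arange(n_obs), s_idx] = 1.0
    X[np.arange(n_obs), n_s + c_idx] = 1.0

    # Min-norm least squares handles the rank deficiency of the design.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    ability, difficulty = beta[:n_s], beta[n_s:]

    # Simulate every student taking every available course,
    # then average the predicted grades per student.
    logits = ability[:, None] + difficulty[None, :]
    predicted = max_grade / (1.0 + np.exp(-logits))  # inverse logit
    return dict(zip(s_ids, predicted.mean(axis=1)))

# Toy usage with hypothetical data:
students = ["s1", "s1", "s2", "s2", "s3"]
courses  = ["calc", "cs101", "calc", "art", "cs101"]
grades   = [3.0, 2.7, 3.3, 4.0, 3.7]
print(modeled_gpa(students, courses, grades))
```

Because every student is scored against the same full set of courses, course selection no longer influences the resulting average, which is the bias the modeled GPA is designed to remove.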

Original language: English (US)
Article number: 17
Journal: ACM Transactions on Computing Education
Issue number: 4
State: Published - Sep 2018

Keywords
  • GPA
  • Gender disparity
  • Learning analytics
  • Women in computing

ASJC Scopus subject areas

  • General Computer Science
  • Education

