Unsupervised Approach for Modeling Content Structures of MOOCs

Fareedah ALSaad, Abdussalam Alawini

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution


With the increased number of MOOC offerings, it is unclear how these courses are related. Previous work has focused on capturing the prerequisite relationships between courses, lectures, and concepts. However, it is also essential to model the content structure of MOOC courses. Constructing a precedence graph that models the similarities and variations of learning paths followed by similar MOOCs would help both students and instructors. Students can personalize their learning by choosing the desired learning path and lectures across several courses, guided by the precedence graph. Similarly, by examining the precedence graph, instructors can 1) identify knowledge gaps in their MOOC offerings, and 2) find alternative course plans. In this paper, we propose an unsupervised approach to build the precedence graph of similar MOOCs, where nodes are clusters of lectures with similar content, and edges depict alternative precedence relationships. Our approach clusters similar lectures using the PCK-Means clustering algorithm, which incorporates pairwise constraints (Must-Link and Cannot-Link) into the standard K-Means algorithm. To build the precedence graph, we link the clusters according to the precedence relations mined from current MOOCs. Experiments over real-world MOOC data show that PCK-Means with our proposed pairwise constraints outperforms the K-Means algorithm in both Adjusted Mutual Information (AMI) and Fowlkes-Mallows (FMI) scores.
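The graph-building step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes lectures have already been assigned to clusters, and it assumes that only consecutive lectures within a course define a precedence relation. The function name `build_precedence_graph`, the course names, and the cluster labels are all hypothetical.

```python
from collections import Counter


def build_precedence_graph(courses, cluster_of):
    """Count cluster-to-cluster precedence edges from course lecture orders.

    courses: dict mapping course name -> ordered list of lecture ids
    cluster_of: dict mapping lecture id -> cluster label
    Returns a Counter keyed by (source_cluster, target_cluster).
    """
    edges = Counter()
    for lectures in courses.values():
        # Assumption: only consecutive lectures define a precedence relation.
        for prev, nxt in zip(lectures, lectures[1:]):
            src, dst = cluster_of[prev], cluster_of[nxt]
            if src != dst:  # skip self-loops inside a single cluster
                edges[(src, dst)] += 1
    return edges


# Hypothetical toy data: two courses covering similar topics in different orders.
courses = {
    "course_a": ["a1", "a2", "a3"],
    "course_b": ["b1", "b2", "b3"],
}
cluster_of = {
    "a1": "intro", "a2": "regression", "a3": "clustering",
    "b1": "intro", "b2": "clustering", "b3": "regression",
}
graph = build_precedence_graph(courses, cluster_of)
```

In this toy example the two courses diverge after the shared "intro" cluster, so the resulting edges represent two alternative learning paths through the same set of clusters.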

Original language: English (US)
Title of host publication: Proceedings of the 13th International Conference on Educational Data Mining, EDM 2020
Editors: Anna N. Rafferty, Jacob Whitehill, Cristobal Romero, Violetta Cavalli-Sforza
Publisher: International Educational Data Mining Society
Number of pages: 11
ISBN (Electronic): 9781733673617
State: Published - 2020
Event: 13th International Conference on Educational Data Mining, EDM 2020 - Virtual, Online
Duration: Jul 10 2020 to Jul 13 2020

Publication series

Name: Proceedings of the 13th International Conference on Educational Data Mining, EDM 2020


Conference: 13th International Conference on Educational Data Mining, EDM 2020
City: Virtual, Online


  • Alternative Learning Paths
  • Clustering
  • Common Learning Path
  • Pairwise Constraints
  • Precedence Graph
  • Precedence Relations

ASJC Scopus subject areas

  • Computer Science Applications
  • Information Systems

