TY - GEN
T1 - Prob2Vec
T2 - 2020 American Control Conference, ACC 2020
AU - Su, Du
AU - Yekkehkhany, Ali
AU - Lu, Yi
AU - Lu, Wenmiao
N1 - Publisher Copyright: © 2020 AACC.
PY - 2020/7
Y1 - 2020/7
N2 - We propose a novel mathematical semantic embedding for problem retrieval in adaptive tutoring. The goal is to retrieve problems with similar mathematical concepts. There are two challenges. First, problems conducive to tutoring are never exactly the same in terms of underlying concepts; those problems often mix concepts in innovative ways. Second, it is difficult for humans to determine a consistent similarity score across a large enough training set. To address these two challenges, we develop a hierarchical problem embedding algorithm, Prob2Vec, which consists of abstraction and embedding steps. Prob2Vec is able to distinguish very fine-grained differences among problems, an ability humans need time and effort to acquire. In addition, the associated concept labeling is a multi-label problem with an imbalanced training data set suffering from dimensionality explosion. Robust concept labeling is achieved with a novel negative pre-training algorithm that dramatically reduces false negative and false positive ratios for classification. Experimental results show that Prob2Vec achieves 96.88% accuracy on a problem similarity test, in contrast to 75% from directly applying state-of-the-art sentence embedding methods.
AB - We propose a novel mathematical semantic embedding for problem retrieval in adaptive tutoring. The goal is to retrieve problems with similar mathematical concepts. There are two challenges. First, problems conducive to tutoring are never exactly the same in terms of underlying concepts; those problems often mix concepts in innovative ways. Second, it is difficult for humans to determine a consistent similarity score across a large enough training set. To address these two challenges, we develop a hierarchical problem embedding algorithm, Prob2Vec, which consists of abstraction and embedding steps. Prob2Vec is able to distinguish very fine-grained differences among problems, an ability humans need time and effort to acquire. In addition, the associated concept labeling is a multi-label problem with an imbalanced training data set suffering from dimensionality explosion. Robust concept labeling is achieved with a novel negative pre-training algorithm that dramatically reduces false negative and false positive ratios for classification. Experimental results show that Prob2Vec achieves 96.88% accuracy on a problem similarity test, in contrast to 75% from directly applying state-of-the-art sentence embedding methods.
UR - http://www.scopus.com/inward/record.url?scp=85089576696&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85089576696&partnerID=8YFLogxK
U2 - 10.23919/ACC45564.2020.9147767
DO - 10.23919/ACC45564.2020.9147767
M3 - Conference contribution
AN - SCOPUS:85089576696
T3 - Proceedings of the American Control Conference
SP - 2490
EP - 2495
BT - 2020 American Control Conference, ACC 2020
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 1 July 2020 through 3 July 2020
ER -