Question answering via integer programming over semi-structured knowledge

Daniel Khashabi, Tushar Khot, Ashish Sabharwal, Peter Clark, Oren Etzioni, Dan Roth

Research output: Contribution to journal › Conference article

Abstract

Answering science questions posed in natural language is an important AI challenge. Answering such questions often requires non-trivial inference and knowledge that goes beyond factoid retrieval. Yet, most systems for this task are based on relatively shallow Information Retrieval (IR) and statistical correlation techniques operating on large unstructured corpora. We propose a structured inference system for this task, formulated as an Integer Linear Program (ILP), that answers natural language questions using a semi-structured knowledge base derived from text, including questions requiring multi-step inference and a combination of multiple facts. On a dataset of real, unseen science questions, our system significantly outperforms (+14%) the best previous attempt at structured reasoning for this task, which used Markov Logic Networks (MLNs). It also improves upon a previous ILP formulation by 17.7%. When combined with unstructured inference methods, the ILP system significantly boosts overall performance (+10%). Finally, we show our approach is substantially more robust to a simple answer perturbation compared to statistical correlation methods.
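To make the idea of answering by integer programming concrete, here is a minimal toy sketch. It is not the formulation from the paper: the facts, options, alignment scores, and the `MAX_FACTS` cap are all invented for illustration, and the search enumerates every 0/1 assignment instead of calling an ILP solver. It only shows the general shape of such a model: binary variables select supporting facts and one answer option, constraints keep the selection well-formed, and a linear score is maximized.

```python
from itertools import product

# Hypothetical toy inputs (all names and scores are invented for illustration).
FACTS = ["f0", "f1", "f2"]            # candidate knowledge-base facts
OPTIONS = ["A", "B"]                  # answer options
Q_FACT = [0.9, 0.2, 0.6]              # question-to-fact alignment scores
FACT_OPT = [[0.8, 0.1],               # fact-to-option alignment scores
            [0.1, 0.7],
            [0.5, 0.3]]
MAX_FACTS = 2                         # assumed cap on facts used together


def solve():
    """Enumerate every 0/1 assignment and keep the best feasible one."""
    n = len(FACTS)
    best = (float("-inf"), None, None)
    for bits in product((0, 1), repeat=n + len(OPTIONS)):
        x, y = bits[:n], bits[n:]     # x: fact is used, y: option is chosen
        if sum(y) != 1:               # constraint: exactly one answer
            continue
        if not 1 <= sum(x) <= MAX_FACTS:  # constraint: 1..MAX_FACTS facts
            continue
        a = y.index(1)
        # Objective: total alignment of the selected facts with the question
        # and with the chosen option. A real ILP would linearize the x*y
        # product with auxiliary edge variables.
        score = sum(x[i] * (Q_FACT[i] + FACT_OPT[i][a]) for i in range(n))
        if score > best[0]:
            best = (score, OPTIONS[a], [FACTS[i] for i in range(n) if x[i]])
    return best


score, answer, chosen = solve()
print(answer, chosen)
```

With these made-up scores the sketch selects option A, supported by facts f0 and f2. Combining two facts to support one answer loosely mirrors the multi-fact inference the abstract describes; a practical system would hand the same variables and constraints to an ILP solver instead of enumerating assignments.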

Original language: English (US)
Pages (from-to): 1145-1152
Number of pages: 8
Journal: IJCAI International Joint Conference on Artificial Intelligence
Volume: 2016-January
State: Published - Jan 1 2016
Event: 25th International Joint Conference on Artificial Intelligence, IJCAI 2016 - New York, United States
Duration: Jul 9 2016 → Jul 15 2016

Fingerprint

Correlation methods
Integer programming
Information retrieval

ASJC Scopus subject areas

  • Artificial Intelligence

Cite this

Khashabi, D., Khot, T., Sabharwal, A., Clark, P., Etzioni, O., & Roth, D. (2016). Question answering via integer programming over semi-structured knowledge. IJCAI International Joint Conference on Artificial Intelligence, 2016-January, 1145-1152.

@article{f3e8fbad9474459aa2e822d5e4638905,
title = "Question answering via integer programming over semi-structured knowledge",
author = "Daniel Khashabi and Tushar Khot and Ashish Sabharwal and Peter Clark and Oren Etzioni and Dan Roth",
year = "2016",
month = "1",
day = "1",
language = "English (US)",
volume = "2016-January",
pages = "1145--1152",
journal = "IJCAI International Joint Conference on Artificial Intelligence",
issn = "1045-0823",

}

TY - JOUR

T1 - Question answering via integer programming over semi-structured knowledge

AU - Khashabi, Daniel

AU - Khot, Tushar

AU - Sabharwal, Ashish

AU - Clark, Peter

AU - Etzioni, Oren

AU - Roth, Dan

PY - 2016/1/1

Y1 - 2016/1/1

UR - http://www.scopus.com/inward/record.url?scp=85006110347&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85006110347&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:85006110347

VL - 2016-January

SP - 1145

EP - 1152

JO - IJCAI International Joint Conference on Artificial Intelligence

JF - IJCAI International Joint Conference on Artificial Intelligence

SN - 1045-0823

ER -