Finding contextually consistent information units in legal text

Dominic Seyler, Paul Bruin, Pavan Bayyapu, Cheng Xiang Zhai

Research output: Contribution to journal › Conference article › peer-review


Terms in the laws of a legislature can be highly contextual. This is especially true for corpora of codified laws and regulations, where the reader has to be aware of the correct context when the corpus lacks a single level of hierarchy. The goal of this work is to assist professionals reading legal text within a codified corpus by finding contextually consistent information units. To achieve this, we combine NLP and data mining techniques to develop a novel methodology that finds these information units in an unsupervised manner. Our method draws on expert experience and is modeled to emulate the "contextualization process" of experienced readers of legal content. We evaluate our method experimentally against multiple expert-annotated datasets and find that it achieves near-perfect performance on four state corpora and high precision on one federal corpus.

Original language: English (US)
Pages (from-to): 48-51
Number of pages: 4
Journal: CEUR Workshop Proceedings
State: Published - 2020
Event: SAE 2020 Automotive Technical Papers, WONLYAUTO 2020 - Warrendale, United States
Duration: Jan 1 2020 → …


  • Information units
  • Legal text mining
  • Logical document organization

ASJC Scopus subject areas

  • Computer Science(all)

