Natural language processing for automated regulatory and contractual document analysis

D. M. Salama, Nora El-Gohary

Research output: Contribution to conference › Paper › peer-review

Abstract

Automated regulatory and contractual compliance checking in the construction domain requires complex automated processing, interpretation, and analysis of laws, regulations, and contractual terms, which are commonly expressed in textual documents. The first step in automating text analysis is automating text classification. For contractual documents, for example, automated text classification assigns contractual clauses (or parts of clauses) to groups such as payment, material specifications, equipment specifications, green construction requirements, and conflict resolution. This paper studies methods of text classification, a subfield of natural language processing (NLP). Text classification is the process of identifying the group to which a piece of text belongs. The process can be either supervised (human-guided) or unsupervised (completely automated). The paper discusses the applicability of both approaches. Several text classification methods, including naïve Bayes classifiers, support vector machines, classification trees, and maximum entropy, are presented and analyzed in the context of construction contract text classification. The paper also introduces a prototype for classifying construction contractual clauses (and sub-clauses).
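
As an illustration of the supervised approach named in the abstract, the following is a minimal sketch of a multinomial naïve Bayes clause classifier with add-one smoothing. It is not the prototype described in the paper. The training clauses, the class name `NaiveBayesClauseClassifier`, and the `tokenize` helper are all hypothetical and chosen only for this example; the labels reuse three of the groups the abstract mentions.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesClauseClassifier:
    """Multinomial naive Bayes with additive (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # smoothing strength
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.class_counts = Counter()            # label -> number of clauses
        self.vocab = set()

    def fit(self, clauses, labels):
        for text, label in zip(clauses, labels):
            tokens = tokenize(text)
            self.class_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def log_scores(self, text):
        # Tokens never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_clauses = sum(self.class_counts.values())
        scores = {}
        for label, n_clauses in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            denom = total_words + self.alpha * len(self.vocab)
            score = math.log(n_clauses / total_clauses)  # log prior
            for t in tokens:
                count = self.word_counts[label][t]
                score += math.log((count + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Hypothetical labeled clauses, invented for this sketch.
TRAIN = [
    ("The owner shall pay the contractor within thirty days of each invoice.",
     "payment"),
    ("Progress payments are due monthly upon approval of the invoice.",
     "payment"),
    ("All concrete shall have a minimum compressive strength as specified.",
     "material specifications"),
    ("Steel reinforcement shall conform to the specified grade.",
     "material specifications"),
    ("Any dispute shall be referred to arbitration before litigation.",
     "conflict resolution"),
    ("The parties shall attempt mediation to resolve any dispute.",
     "conflict resolution"),
]

clf = NaiveBayesClauseClassifier().fit(
    [text for text, _ in TRAIN], [label for _, label in TRAIN]
)
print(clf.predict("Payment of the final invoice is due within thirty days."))
```

A real system would need a far larger labeled corpus and a held-out evaluation set; six clauses are enough only to show the mechanics of training and prediction.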

Original language: English (US)
Pages: 2897-2906
Number of pages: 10
State: Published - Dec 1 2011
Event: Annual Conference of the Canadian Society for Civil Engineering 2011, CSCE 2011 - Ottawa, ON, Canada
Duration: Jun 14 2011 to Jun 17 2011

ASJC Scopus subject areas

  • Engineering(all)
