Unsupervised Machine Learning for Augmented Data Analytics of Building Codes

Ruichuan Zhang, Nora El-Gohary

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Existing automated code checking methods and tools cannot automatically analyze and represent all types of requirements (e.g., requirements that are too complex or that require human judgement). Recent efforts in augmented data analytics have proposed using templates to facilitate the analysis of text. However, most of these efforts constructed the templates manually, which is labor-intensive. More importantly, manually developed templates have difficulty capturing the linguistic variations in building codes. More research is thus needed to automate the generation of templates that support the tagging and extraction of information from building codes. To address this need, this paper proposes an unsupervised machine learning-based method to extract sentence templates that describe syntactic and semantic features and patterns from building codes. The proposed method is composed of four main steps: (1) data preprocessing; (2) identifying the different groups of sentence fragments using clustering; (3) identifying the fixed parts and the slots in the templates based on the syntactic and semantic patterns of the sentence fragment groups; and (4) evaluating the extracted templates. The proposed method was implemented and tested on a corpus of text from the International Building Code, where it achieved an accuracy of 0.76.
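
The four steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the features, the clustering algorithm, or the accuracy definition, so the sketch assumes token-overlap (Jaccard) similarity with greedy clustering, token alignment via the standard-library difflib module to separate fixed parts from slots, and exact-match accuracy against a reference set. The function names, the 0.5 threshold, the <SLOT> marker, and the example sentences are all hypothetical; the sentences are invented for illustration and are not quoted from the International Building Code.

```python
import re
from difflib import SequenceMatcher

SLOT = "<SLOT>"  # hypothetical marker for the variable part of a template


def preprocess(fragment):
    """Step 1 (simplified): lowercase the fragment and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", fragment.lower())


def jaccard(a, b):
    """Token-set overlap; an assumed stand-in for syntactic/semantic similarity."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if (sa or sb) else 0.0


def cluster(token_lists, threshold=0.5):
    """Step 2 (simplified): greedy clustering against each group's first member."""
    groups = []
    for tokens in token_lists:
        for group in groups:
            if jaccard(tokens, group[0]) >= threshold:
                group.append(tokens)
                break
        else:  # no existing group was similar enough
            groups.append([tokens])
    return groups


def merge_template(template, tokens):
    """Align two token lists; matching runs stay fixed, differences become one slot."""
    matcher = SequenceMatcher(a=template, b=tokens, autojunk=False)
    merged = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag == "equal":
            merged.extend(template[i1:i2])
        elif not merged or merged[-1] != SLOT:
            merged.append(SLOT)
    return merged


def extract_template(group):
    """Step 3 (simplified): fold every fragment of a group into one template."""
    template = group[0]
    for tokens in group[1:]:
        template = merge_template(template, tokens)
    return template


def accuracy(extracted, reference):
    """Step 4 (assumed metric): share of extracted templates found in a reference set."""
    return sum(t in reference for t in extracted) / len(extracted)


# Invented example fragments (not taken from any building code).
fragments = [
    "The door shall be not less than 32 inches in width",
    "The corridor shall be not less than 44 inches in width",
    "The stairway shall be not less than 36 inches in width",
    "Exit signs shall be illuminated at all times",
]

groups = cluster([preprocess(f) for f in fragments])
templates = [" ".join(extract_template(g)) for g in groups]
reference = {"the <SLOT> shall be not less than <SLOT> inches in width"}

for t in templates:
    print(t)
print("accuracy:", accuracy(templates, reference))
```

Greedy clustering keeps the sketch dependency-free. A fuller treatment would likely use richer features (e.g., part-of-speech tags or word embeddings) and a standard clustering algorithm, which the abstract implies but does not detail.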

Original language: English (US)
Title of host publication: Computing in Civil Engineering 2019
Subtitle of host publication: Data, Sensing, and Analytics - Selected Papers from the ASCE International Conference on Computing in Civil Engineering 2019
Editors: Yong K. Cho, Fernanda Leite, Amir Behzadan, Chao Wang
Publisher: American Society of Civil Engineers
Pages: 74-81
Number of pages: 8
ISBN (Electronic): 9780784482438
State: Published - 2019
Event: ASCE International Conference on Computing in Civil Engineering 2019: Data, Sensing, and Analytics, i3CE 2019 - Atlanta, United States
Duration: Jun 17 2019 to Jun 19 2019

Publication series

Name: Computing in Civil Engineering 2019: Data, Sensing, and Analytics - Selected Papers from the ASCE International Conference on Computing in Civil Engineering 2019

Conference

Conference: ASCE International Conference on Computing in Civil Engineering 2019: Data, Sensing, and Analytics, i3CE 2019
Country/Territory: United States
City: Atlanta
Period: 6/17/19 to 6/19/19

ASJC Scopus subject areas

  • Computer Science(all)
  • Civil and Structural Engineering
