TY - JOUR
T1 - Clustering-Based Approach for Building Code Computability Analysis
AU - Zhang, Ruichuan
AU - El-Gohary, Nora
N1 - Funding Information:
The authors would like to thank the National Science Foundation (NSF). This material is based on work supported by the NSF under Grant No. 1827733. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF.
Publisher Copyright:
© 2021 American Society of Civil Engineers.
PY - 2021/11/1
Y1 - 2021/11/1
N2 - One common limitation of all automated code compliance-checking methods and tools is their inability to deal with all types of building-code requirements. More research is needed to better identify the different types of requirements, in terms of their syntactic and semantic structures and complexities, to gain more insights about the capabilities and limitations of existing methods and tools (i.e., which requirements they can automatically process, represent, or check, and which not). To address this need, this paper proposes a new set of syntactic and semantic features and complexity and computability metrics for code computability analysis. A clustering-based approach was used to identify the different types of code sentences, in terms of their computability, using the proposed features and metrics. The approach was implemented and tested on a corpus of 6,608 sentences from the International Building Code and its amendments. The sentence clusters and identified sentence types were evaluated using intrinsic and extrinsic evaluation methods. The evaluation results indicated good clustering performance, perfect alignment between the human- and computer-identified types, and good agreement in the assignment of sentences to the types.
KW - Buildings
KW - Code checking
KW - Computability
KW - Hierarchical clustering
KW - Text analytics
UR - http://www.scopus.com/inward/record.url?scp=85113802792&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85113802792&partnerID=8YFLogxK
U2 - 10.1061/(ASCE)CP.1943-5487.0000967
DO - 10.1061/(ASCE)CP.1943-5487.0000967
M3 - Article
AN - SCOPUS:85113802792
SN - 0887-3801
VL - 35
JO - Journal of Computing in Civil Engineering
JF - Journal of Computing in Civil Engineering
IS - 6
M1 - 04021021
ER -