Automated environmental compliance checking requires automated extraction of rules from environmental regulatory documents, such as energy conservation codes and U.S. Environmental Protection Agency (EPA) regulations. Automated rule extraction in turn requires complex text processing and analysis to extract information and then formalize it into computer-processable rules. In our automated compliance checking (ACC) approach, we first classify the text into predefined categories to filter out irrelevant text, which improves the efficiency of subsequent semantic information extraction and compliance reasoning. The categories are predefined in a semantic text classification (TC) topic hierarchy. In this paper, we present our machine-learning-based TC algorithm for classifying clauses in environmental regulatory documents according to the TC topic hierarchy. In developing the algorithm, we tested and evaluated different text preprocessing techniques, machine learning algorithms, and performance improvement strategies. The final TC algorithm was tested on 10 regulatory documents, including the 2012 International Energy Conservation Code, and evaluated in terms of precision and recall. It achieved approximately 96% recall and 85% precision on the testing data.
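For readers less familiar with the two evaluation metrics named above, the following minimal sketch shows how precision and recall can be computed for a single category. The function name, the boolean-list representation of clause labels, and the toy labels are illustrative assumptions; this is not the evaluation code or data used in this work.

```python
def precision_recall(predicted, actual):
    """Compute precision and recall for one category.

    predicted, actual: equal-length lists of booleans, one entry per clause,
    where True means the clause is labeled as belonging to the category.
    (Hypothetical representation chosen for illustration.)
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")

    # Count true positives, false positives, and false negatives.
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)

    # Precision: share of clauses assigned to the category that truly belong.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of clauses truly in the category that were found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels for five clauses (made up for illustration only).
predicted = [True, True, True, True, False]
actual = [True, True, True, False, False]
precision, recall = precision_recall(predicted, actual)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

In this toy case every relevant clause is retrieved, but one irrelevant clause is also kept, so recall is higher than precision.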