Data mining to inform total coliform monitoring plan design

W. J. Dawsey, Barbara S Minsker

Research output: Contribution to conference › Paper › peer-review


Monitoring of drinking water distribution systems should aim to capture worst-case water quality scenarios in order to provide maximum protection for public health. Total coliform monitoring is expensive and time-consuming relative to other chemical or physical measures of system integrity that may be monitored in real time. This paper proposes a methodology for mining these additional water quality parameters to inform coliform monitoring. Three machine learning algorithms are selected for application to a case study distribution system: i) gradient tree boosting, ii) decision trees, and iii) a distance-weighted nearest neighbor algorithm. In addition, the effect of expanding the training data to include unlabeled data records will be explored. The performance of these data mining techniques should provide insight into the use of surrogate water quality parameters to indicate high coliform levels. Copyright ASCE 2006.
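As a rough illustration of the third technique named in the abstract, the following is a minimal sketch of a distance-weighted nearest neighbor classifier. It is not the authors' implementation. The function name `weighted_knn_predict`, the choice of features (free chlorine residual and turbidity), and all data values are hypothetical and chosen only to make the example runnable. The abstract does not say which surrogate parameters the case study used.

```python
import math


def weighted_knn_predict(train, query, k=3, eps=1e-9):
    """Classify `query` by an inverse-distance-weighted vote of its k nearest neighbors.

    train: list of (features, label) pairs; label 1 = high coliform, 0 = not.
    query: feature tuple with the same length as the training features.
    """
    # Euclidean distance from the query to every labeled record, nearest first.
    neighbors = sorted((math.dist(x, query), y) for x, y in train)[:k]

    # Each neighbor votes with weight 1 / distance; eps avoids division by zero.
    votes = {}
    for distance, label in neighbors:
        votes[label] = votes.get(label, 0.0) + 1.0 / (distance + eps)
    return max(votes, key=votes.get)


# Hypothetical labeled records: (free chlorine in mg/L, turbidity in NTU) -> label.
# These values are invented for illustration and are not from the paper.
train = [
    ((1.2, 0.2), 0),
    ((1.0, 0.3), 0),
    ((0.9, 0.4), 0),
    ((0.2, 1.5), 1),
    ((0.1, 1.8), 1),
]

low_chlorine_sample = weighted_knn_predict(train, (0.15, 1.6))
typical_sample = weighted_knn_predict(train, (1.1, 0.25))
```

In practice the features would likely need to be scaled to comparable ranges before computing distances, since Euclidean distance is otherwise dominated by whichever parameter has the largest numeric spread.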

Original language: English (US)
Number of pages: 1
State: Published - 2007
Event: 8th Annual Water Distribution Systems Analysis Symposium 2006 - Cincinnati, OH, United States
Duration: Aug 27 2006 → Aug 30 2006


Conference: 8th Annual Water Distribution Systems Analysis Symposium 2006
Country/Territory: United States
City: Cincinnati, OH


Keywords

  • Data mining
  • Decision trees
  • Machine learning
  • Monitoring
  • Total coliform rule
  • Water distribution systems

ASJC Scopus subject areas

  • Geotechnical Engineering and Engineering Geology
  • Civil and Structural Engineering

