TY - BOOK
T1 - Quality Control of 19th Century Weather Data
AU - Westcott, Nancy
PY - 2011
Y1 - 2011
AB - The Climate Database Modernization Program's (CDMP) Forts and Volunteer Observer Database Project has resulted in a dramatic increase in the number of U.S. daily cooperative network observations available prior to 1893. Currently, data from 395 stations have been captured from the original scanned images. The stations are primarily located east of the Mississippi River, but coverage extends to all 48 contiguous U.S. states and Alaska. A rigorous quality control process is used to ensure that the keyed data match the original form. This process involves careful collection of the metadata from the form, double-keying of the data, and a series of automated quality control tests. Values flagged by these tests are typically verified manually and corrections are applied as needed, although in some cases errors are automatically corrected. An analysis of the quality control process for 40 stations shows that on average, about 31 percent of the flags verify the information, 52 percent can be corrected, and 17 percent are deemed uncorrectable. The correctable errors typically result from unclear forms, mis-keyed data, and errors in the metadata for the image. Due to changes in observation practices since the nineteenth century, care must be taken in using the data for analysis. Despite these caveats, the nineteenth-century weather dataset is being used in an increasing number of climate studies.
KW - ISWS
UR - http://hdl.handle.net/2142/18677
M3 - Technical report
T3 - ISWS Contract Report
BT - Quality Control of 19th Century Weather Data
PB - Illinois State Water Survey
ER -