Abstract
Vast and ever-increasing quantities of data are produced by sensors in the Internet of Things (IoT). The quality of this data can
vary widely owing to sensor problems, incorrect calibration and similar issues. Data quality can be greatly enhanced by cleaning the data
before it reaches its end user. This paper reports on the construction of a distributed cleaning system (DCS) to clean data streams in
real time for an environmental case study. A combination of declarative and statistical model-based cleaning methods is applied
and initial results are reported.