A feature-first approach to clustering for highlighting regions of interest in scientific data

Research output: Contribution to journal › Conference article

Abstract

We present a clustering algorithm that classifies the points of a dataset by a combination of their scalar variable values and their spatial locations. How heavily the spatial locations influence the algorithm is a tunable parameter. With no spatial influence, the algorithm bins the data by computing a histogram and classifies each point by its bin ID. With full spatial influence, points are grouped by spatial neighborhood regardless of value. The approach is, unsurprisingly, very sensitive to this weighting; sampling the possible values yields a wide variety of classifications. However, we have found that with a well-chosen weighting it is possible to extract meaningful features from the resulting clustering. Furthermore, the principles behind our development of this technique apply both to tuning the algorithm and to selecting data regions. In this paper we provide the details of the design and implementation and demonstrate the auto-tuned approach by extracting interesting regions from real scientific data. Our target application is the automatic detection of anomalies in land cover data from NASA's Moderate Resolution Imaging Spectroradiometer (MODIS) sensors.
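
As a rough illustration of the weighting idea described in the abstract, the sketch below clusters points on a blend of normalized scalar values and normalized coordinates. It is not the authors' algorithm: it substitutes a plain k-means over weighted features, and the names `cluster_points`, `spatial_weight`, and `k` are hypothetical. At `spatial_weight = 0` it groups by value alone, which resembles but does not reproduce the histogram binning the abstract mentions; at `spatial_weight = 1` it groups by location alone.

```python
import numpy as np


def _normalize(a):
    """Rescale each column of a 2-D array to [0, 1]; constant columns become 0."""
    lo = a.min(axis=0)
    span = a.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid dividing by zero on constant columns
    return (a - lo) / span


def cluster_points(values, coords, spatial_weight, k, n_iter=50):
    """Label points with k-means over weighted value and location features.

    Hypothetical stand-in for the paper's method (assumption, not the source).
    spatial_weight = 0 -> only the scalar values matter.
    spatial_weight = 1 -> only the coordinates matter.
    """
    if not 0.0 <= spatial_weight <= 1.0:
        raise ValueError("spatial_weight must lie in [0, 1]")
    values = np.asarray(values, dtype=float)
    coords = np.asarray(coords, dtype=float)
    if values.ndim == 1:
        values = values[:, None]
    if coords.ndim == 1:
        coords = coords[:, None]

    # Blend the two normalized feature groups with complementary weights.
    feats = np.hstack([
        (1.0 - spatial_weight) * _normalize(values),
        spatial_weight * _normalize(coords),
    ])

    # Deterministic seeding: k points evenly spaced in feature-sum order.
    order = np.argsort(feats.sum(axis=1), kind="stable")
    picks = np.linspace(0, len(feats) - 1, k).astype(int)
    centers = feats[order[picks]].copy()

    def assign(c):
        # Squared distance from every point to every center.
        d = ((feats[:, None, :] - c[None, :, :]) ** 2).sum(axis=2)
        return d.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centers)
        new_centers = centers.copy()
        for j in range(k):
            members = feats[labels == j]
            if len(members):  # keep the old center if a cluster empties
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(centers)
```

Calling `cluster_points(values, coords, w, k)` for several values of `w` between 0 and 1 gives a quick view of how strongly the labeling depends on the weighting, which is the sensitivity the abstract points out.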

Original language: English (US)
Pages (from-to): 2207-2216
Number of pages: 10
Journal: Procedia Computer Science
Volume: 51
Issue number: 1
State: Published - Jan 1 2015
Event: International Conference on Computational Science, ICCS 2002 - Amsterdam, Netherlands
Duration: Apr 21 2002 to Apr 24 2002

Keywords

  • Anomaly detection
  • MODIS
  • Parallel computing
  • Visualization

ASJC Scopus subject areas

  • Computer Science (all)
