Recognizing activities in multiple views with fusion of frame judgments

Selen Pehlivan, David A. Forsyth

Research output: Contribution to journal › Article › peer-review


This paper focuses on activity recognition when multiple views are available. In the literature, this is typically performed using one of two approaches. In the first, the system builds a 3D reconstruction and matches activities against it. This methodology has practical disadvantages: a sufficient number of overlapping views is needed for reconstruction, and the cameras must be calibrated. A simpler alternative is to match the frames individually, which offers significant advantages in system architecture (e.g., new features are easy to incorporate and camera dropouts can be tolerated). In this paper, the second approach is employed and a novel fusion method is proposed. Our fusion method collects activity labels over frames and cameras, and then fuses these judgments into a single sequence label. We show that a straightforward weighted voting scheme incurs no performance penalty. In particular, when there are enough overlapping views to generate a volumetric reconstruction, our recognition performance is comparable with that produced by volumetric reconstructions. When the overlapping views are inadequate, performance degrades fairly gracefully, even in cases where test and training views do not overlap.

Original language: English (US)
Pages (from-to): 237-249
Number of pages: 13
Journal: Image and Vision Computing
Issue number: 4
State: Published - Apr 2014


Keywords

  • Human activity recognition
  • Multiple camera
  • Multiple views
  • Video analysis

ASJC Scopus subject areas

  • Signal Processing
  • Computer Vision and Pattern Recognition

