Dense saliency-based spatiotemporal feature points for action recognition

Rapantzikos, K; Avrithis, Y; Kollias, S

dc.contributor.author	Rapantzikos, K	en
dc.contributor.author	Avrithis, Y	en
dc.contributor.author	Kollias, S	en
dc.date.accessioned	2014-03-01T01:30:06Z
dc.date.available	2014-03-01T01:30:06Z
dc.date.issued	2009	en
dc.identifier.uri	https://dspace.lib.ntua.gr/xmlui/handle/123456789/19472
dc.subject	Action Recognition	en
dc.subject	Information Visualization	en
dc.subject	Nearest Neighbor Classifier	en
dc.subject	Video Analysis	en
dc.subject	Space Time	en
dc.subject.other	Action recognition	en
dc.subject.other	Classification framework	en
dc.subject.other	Data sets	en
dc.subject.other	Feature point	en
dc.subject.other	Feature point detection	en
dc.subject.other	Feature similarities	en
dc.subject.other	Global minimization	en
dc.subject.other	Human actions	en
dc.subject.other	Informativeness	en
dc.subject.other	Intensity-based	en
dc.subject.other	Motion activity	en
dc.subject.other	Multiscales	en
dc.subject.other	Nearest neighbor classifiers	en
dc.subject.other	Space-Time Detectors	en
dc.subject.other	Spatial proximity	en
dc.subject.other	Spatio temporal features	en
dc.subject.other	Video analysis	en
dc.subject.other	Visual aspects	en
dc.subject.other	Visual comparison	en
dc.subject.other	Volumetric constraints	en
dc.subject.other	Volumetric representation	en
dc.subject.other	Computer vision	en
dc.subject.other	Detectors	en
dc.subject.other	Feature extraction	en
dc.subject.other	Image recognition	en
dc.subject.other	Technical presentations	en
dc.title	Dense saliency-based spatiotemporal feature points for action recognition	en
heal.type	journalArticle	en
heal.identifier.primary	10.1109/CVPRW.2009.5206525	en
heal.identifier.secondary	http://dx.doi.org/10.1109/CVPRW.2009.5206525	en
heal.identifier.secondary	5206525	en
heal.publicationDate	2009	en
heal.abstract	Several spatiotemporal feature point detectors have recently been used in video analysis for action recognition. Feature points are detected using a number of measures, namely saliency, cornerness, periodicity, motion activity, etc. Each of these measures is usually intensity-based and provides a different trade-off between density and informativeness. In this paper, we use saliency for feature point detection in videos and incorporate color and motion in addition to intensity. Our method uses a multi-scale volumetric representation of the video and involves spatiotemporal operations at the voxel level. Saliency is computed by a global minimization process subject to purely volumetric constraints, each of them related to an informative visual aspect, namely spatial proximity, scale and feature similarity (intensity, color, motion). Points are selected as the extrema of the saliency response and prove to balance density and informativeness well. We provide an intuitive view of the detected points and visual comparisons against state-of-the-art space-time detectors. Our detector outperforms them on the KTH dataset using Nearest-Neighbor classifiers and ranks among the top using different classification frameworks. Statistics and comparisons are also reported on the more difficult Hollywood Human Actions (HOHA) dataset, where performance improves over currently published results. ©2009 IEEE.	en
heal.journalName	2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, CVPR Workshops 2009	en
dc.identifier.doi	10.1109/CVPRW.2009.5206525	en
dc.identifier.spage	1454	en
dc.identifier.epage	1461	en