dc.contributor.author | Rapantzikos, K | en
dc.contributor.author | Tsapatsoulis, N | en
dc.contributor.author | Avrithis, Y | en
dc.contributor.author | Kollias, S | en
dc.date.accessioned | 2014-03-01T01:31:57Z |
dc.date.available | 2014-03-01T01:31:57Z |
dc.date.issued | 2009 | en
dc.identifier.issn | 0923-5965 | en
dc.identifier.uri | https://dspace.lib.ntua.gr/xmlui/handle/123456789/19991 |
dc.subject | Spatiotemporal visual saliency | en
dc.subject | Video classification | en
dc.subject.classification | Engineering, Electrical & Electronic | en
dc.subject.other | Classification performance | en
dc.subject.other | Computer vision applications | en
dc.subject.other | Conspicuity | en
dc.subject.other | Heterogeneous features | en
dc.subject.other | Human visual attention | en
dc.subject.other | Optimization process | en
dc.subject.other | Saliency detection | en
dc.subject.other | Saliency measure | en
dc.subject.other | Salient regions | en
dc.subject.other | Spatiotemporal saliency | en
dc.subject.other | Spatiotemporal visual saliency | en
dc.subject.other | Spatiotemporal volume | en
dc.subject.other | Video classification | en
dc.subject.other | Video sequences | en
dc.subject.other | Visual information | en
dc.subject.other | Computer vision | en
dc.subject.other | Video recording | en
dc.subject.other | Visual communication | en
dc.subject.other | Computer applications | en
dc.title | Spatiotemporal saliency for video classification | en
heal.type | journalArticle | en
heal.identifier.primary | 10.1016/j.image.2009.03.002 | en
heal.identifier.secondary | http://dx.doi.org/10.1016/j.image.2009.03.002 | en
heal.language | English | en
heal.publicationDate | 2009 | en
heal.abstract | Computer vision applications often need to process only a representative part of the visual input rather than the whole image or sequence. Considerable research has been carried out into salient region detection methods, based either on models emulating human visual attention (VA) mechanisms or on computational approximations. Most of the proposed methods are bottom-up, and their major goal is to filter out redundant visual information. In this paper, we propose and elaborate on a saliency detection model that treats a video sequence as a spatiotemporal volume and generates a local saliency measure for each visual unit (voxel). This computation involves an optimization process incorporating inter- and intra-feature competition at the voxel level. Perceptual decomposition of the input, spatiotemporal center-surround interactions, and the integration of heterogeneous feature conspicuity values are described, and an experimental framework for video classification is set up. This framework consists of a series of experiments that show the effect of saliency on classification performance and let us draw conclusions on how well the detected salient regions represent the visual input. A comparison that shows the potential of the proposed method is also attempted. © 2009 Elsevier B.V. All rights reserved. | en
heal.publisher | ELSEVIER SCIENCE BV | en
heal.journalName | Signal Processing: Image Communication | en
dc.identifier.doi | 10.1016/j.image.2009.03.002 | en
dc.identifier.isi | ISI:000270067400003 | en
dc.identifier.volume | 24 | en
dc.identifier.issue | 7 | en
dc.identifier.spage | 557 | en
dc.identifier.epage | 571 | en