Keyframe extraction using local visual semantics in the form of a region thesaurus

Spyrou, E; Avrithis, Y

dc.contributor.author	Spyrou, E	en
dc.contributor.author	Avrithis, Y	en
dc.date.accessioned	2014-03-01T02:44:46Z
dc.date.available	2014-03-01T02:44:46Z
dc.date.issued	2007	en
dc.identifier.uri	https://dspace.lib.ntua.gr/xmlui/handle/123456789/31947
dc.subject	Hierarchical Clustering	en
dc.subject	Semantic Information	en
dc.subject	Texture Features	en
dc.subject.other	Information theory	en
dc.subject.other	Motion Picture Experts Group standards	en
dc.subject.other	Thesauri	en
dc.subject.other	Hierarchical clustering approach	en
dc.subject.other	International (CO)	en
dc.subject.other	Key-frame extraction	en
dc.subject.other	Key-frames	en
dc.subject.other	Local regions	en
dc.subject.other	Media adaptation	en
dc.subject.other	Personalization	en
dc.subject.other	Semantic features	en
dc.subject.other	Semantic information	en
dc.subject.other	Texture features	en
dc.subject.other	Video shots	en
dc.subject.other	Visual semantics	en
dc.subject.other	Semantics	en
dc.title	Keyframe extraction using local visual semantics in the form of a region thesaurus	en
heal.type	conferenceItem	en
heal.identifier.primary	10.1109/SMAP.2007.4414394	en
heal.identifier.secondary	http://dx.doi.org/10.1109/SMAP.2007.4414394	en
heal.identifier.secondary	4414394	en
heal.publicationDate	2007	en
heal.abstract	This paper presents an approach for efficient keyframe extraction, using local semantics in the form of a region thesaurus. More specifically, certain MPEG-7 color and texture features are locally extracted from keyframe regions. Then, using a hierarchical clustering approach, a local region thesaurus is constructed to facilitate the description of each frame in terms of higher semantic features. The thesaurus consists of the most common region types that are encountered within the video shot, along with their synonyms. These region types carry semantic information. Each keyframe is represented by a vector consisting of the degrees of confidence of the existence of all region types within the shot. Using this keyframe representation, the most representative keyframe is then selected for each shot. Where a single keyframe is not adequate, more keyframes are extracted using the same algorithm and exploiting the presence of the region types of the visual thesaurus. © 2007 IEEE.	en
heal.journalName	SMAP07 - Second International Workshop on Semantic Media Adaptation and Personalization	en
dc.identifier.doi	10.1109/SMAP.2007.4414394	en
dc.identifier.spage	98	en
dc.identifier.epage	103	en
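
The pipeline summarized in the abstract (cluster region descriptors into region types, describe each frame by a confidence vector over those types, pick the most representative frame) can be illustrated with a minimal sketch. This is not the authors' implementation. Everything below is an assumption made for illustration: the toy 2-D descriptors stand in for MPEG-7 color and texture features, the centroid-linkage merging stands in for the unspecified hierarchical clustering, the confidence formula 1/(1+d) is hypothetical, and "most representative" is taken to mean closest to the shot's mean vector. Synonyms and the extraction of additional keyframes are not modeled.

```python
import math

def dist(a, b):
    """Euclidean distance between two descriptors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def build_thesaurus(regions, n_types):
    """ASSUMPTION: naive agglomerative clustering that repeatedly merges the
    two clusters with the closest centroids until n_types clusters remain.
    Returns one centroid ("region type") per cluster."""
    clusters = [[r] for r in regions]
    while len(clusters) > n_types:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return [centroid(c) for c in clusters]

def model_vector(frame_regions, thesaurus):
    """ASSUMPTION: confidence that a region type occurs in the frame is
    1 / (1 + distance) to the frame's nearest region, a value in (0, 1]."""
    return [max(1.0 / (1.0 + dist(r, t)) for r in frame_regions)
            for t in thesaurus]

def select_keyframe(frames, thesaurus):
    """ASSUMPTION: the most representative frame is the one whose model
    vector lies closest to the mean model vector of the shot."""
    vectors = [model_vector(f, thesaurus) for f in frames]
    mean = centroid(vectors)
    return min(range(len(frames)), key=lambda k: dist(vectors[k], mean))

# Toy shot: each frame is a list of hypothetical 2-D region descriptors.
frames = [
    [[0.0, 0.0], [1.0, 1.0]],  # frame 0 contains both kinds of region
    [[0.0, 0.1]],              # frame 1 contains only the first kind
    [[1.0, 0.9]],              # frame 2 contains only the second kind
]
all_regions = [r for f in frames for r in f]
thesaurus = build_thesaurus(all_regions, n_types=2)
best = select_keyframe(frames, thesaurus)
print("selected keyframe index:", best)
```

In this toy shot the frame that contains both region types is selected, because its confidence vector is nearest to the shot average. A real system would replace the toy descriptors with extracted features and would likely use an established clustering library rather than the quadratic merge loop shown here.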