dc.contributor.author |
Spyrou, E |
en |
dc.contributor.author |
Tolias, G |
en |
dc.contributor.author |
Mylonas, P |
en |
dc.contributor.author |
Avrithis, Y |
en |
dc.date.accessioned |
2014-03-01T01:30:01Z |
|
dc.date.available |
2014-03-01T01:30:01Z |
|
dc.date.issued |
2009 |
en |
dc.identifier.issn |
1380-7501 |
en |
dc.identifier.uri |
https://dspace.lib.ntua.gr/xmlui/handle/123456789/19451 |
|
dc.subject |
Concept detection |
en |
dc.subject |
Keyframe extraction |
en |
dc.subject |
Region types |
en |
dc.subject |
Visual thesaurus |
en |
dc.subject.classification |
Computer Science, Information Systems |
en |
dc.subject.classification |
Computer Science, Software Engineering |
en |
dc.subject.classification |
Computer Science, Theory & Methods |
en |
dc.subject.classification |
Engineering, Electrical & Electronic |
en |
dc.subject.other |
Information theory |
en |
dc.subject.other |
Vectors |
en |
dc.subject.other |
Concept detection |
en |
dc.subject.other |
Detection performance |
en |
dc.subject.other |
Key frames |
en |
dc.subject.other |
Keyframe extraction |
en |
dc.subject.other |
Keyframe selection |
en |
dc.subject.other |
Latent Semantic Analysis |
en |
dc.subject.other |
Material information |
en |
dc.subject.other |
Model vectors |
en |
dc.subject.other |
Region types |
en |
dc.subject.other |
Selection process |
en |
dc.subject.other |
Texture descriptors |
en |
dc.subject.other |
Very large data sets |
en |
dc.subject.other |
Video analysis |
en |
dc.subject.other |
Video shots |
en |
dc.subject.other |
Visual thesaurus |
en |
dc.subject.other |
Thesauri |
en |
dc.title |
Concept detection and keyframe extraction using a visual thesaurus |
en |
heal.type |
journalArticle |
en |
heal.identifier.primary |
10.1007/s11042-008-0237-9 |
en |
heal.identifier.secondary |
http://dx.doi.org/10.1007/s11042-008-0237-9 |
en |
heal.language |
English |
en |
heal.publicationDate |
2009 |
en |
heal.abstract |
This paper presents a video analysis approach based on concept detection and keyframe extraction employing a visual thesaurus representation. Color and texture descriptors are extracted from coarse regions of each frame, and a visual thesaurus is constructed by clustering the regions. The clusters, called region types, are used as the basis for representing local material information through the construction of a model vector for each frame, which reflects the composition of the image in terms of region types. The model vector representation is used for keyframe selection, either within each video shot or across an entire sequence, and the selection process ensures that all region types are represented. A number of high-level concept detectors are then trained using global annotation, and Latent Semantic Analysis is applied. To enhance per-shot detection performance, detection is applied to the selected keyframes of each shot, and a framework is proposed for working on very large data sets. © 2008 Springer Science+Business Media, LLC. |
en |
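The abstract above describes a concrete pipeline: cluster region descriptors into region types (the visual thesaurus), describe each frame by a model vector over those types, then select keyframes so that every region type is represented. The sketch below illustrates that pipeline under stated assumptions only; it is not the paper's implementation. K-means is assumed as the clustering step, the model vector is taken to be a normalized histogram of region-type assignments, and keyframe selection is rendered as a simple greedy cover. All function names and parameters are illustrative.

# Minimal sketch, assuming k-means for the visual thesaurus and a histogram-style
# model vector; names are hypothetical, not from the paper.
import numpy as np
from sklearn.cluster import KMeans

def build_visual_thesaurus(region_descriptors: np.ndarray, n_region_types: int) -> KMeans:
    # Cluster color/texture descriptors of coarse regions into region types.
    return KMeans(n_clusters=n_region_types, n_init=10, random_state=0).fit(region_descriptors)

def model_vector(frame_regions: np.ndarray, thesaurus: KMeans) -> np.ndarray:
    # Normalized histogram of region-type assignments: the frame's composition
    # in terms of region types.
    labels = thesaurus.predict(frame_regions)
    hist = np.bincount(labels, minlength=thesaurus.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)

def select_keyframes(model_vectors: list[np.ndarray]) -> list[int]:
    # Greedily pick frames until every region type occurring in the shot
    # (or sequence) is covered by at least one selected keyframe.
    needed = set(np.flatnonzero(np.sum(model_vectors, axis=0) > 0))
    selected: list[int] = []
    while needed:
        best = max(range(len(model_vectors)),
                   key=lambda i: len(needed & set(np.flatnonzero(model_vectors[i]))))
        selected.append(best)
        needed -= set(np.flatnonzero(model_vectors[best]))
    return selected

Concept detectors would then be trained on the model vectors of the selected keyframes (with Latent Semantic Analysis applied to the representation), but that stage is outside the scope of this sketch.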
heal.publisher |
SPRINGER |
en |
heal.journalName |
Multimedia Tools and Applications |
en |
dc.identifier.doi |
10.1007/s11042-008-0237-9 |
en |
dc.identifier.isi |
ISI:000262506300002 |
en |
dc.identifier.volume |
41 |
en |
dc.identifier.issue |
3 |
en |
dc.identifier.spage |
337 |
en |
dc.identifier.epage |
373 |
en |