HEAL DSpace

Video event detection and summarization using audio, visual and text saliency

dc.contributor.author Evangelopoulos, G en
dc.contributor.author Zlatintsi, A en
dc.contributor.author Skoumas, G en
dc.contributor.author Rapantzikos, K en
dc.contributor.author Potamianos, A en
dc.contributor.author Maragos, P en
dc.contributor.author Avrithis, Y en
dc.date.accessioned 2014-03-01T02:46:34Z
dc.date.available 2014-03-01T02:46:34Z
dc.date.issued 2009 en
dc.identifier.issn 15206149 en
dc.identifier.uri https://dspace.lib.ntua.gr/xmlui/handle/123456789/32724
dc.subject Audio en
dc.subject Movie summarization en
dc.subject Multimodal saliency en
dc.subject Text processing en
dc.subject Video en
dc.subject Video abstraction en
dc.subject.other Audio en
dc.subject.other Movie summarization en
dc.subject.other Multimodal saliency en
dc.subject.other Video en
dc.subject.other Video abstraction en
dc.subject.other Abstracting en
dc.subject.other Acoustics en
dc.subject.other Embedded systems en
dc.subject.other Mathematical operators en
dc.subject.other Motion pictures en
dc.subject.other Signal processing en
dc.subject.other Text processing en
dc.subject.other Video recording en
dc.subject.other Word processing en
dc.subject.other Signal detection en
dc.title Video event detection and summarization using audio, visual and text saliency en
heal.type conferenceItem en
heal.identifier.primary 10.1109/ICASSP.2009.4960393 en
heal.identifier.secondary http://dx.doi.org/10.1109/ICASSP.2009.4960393 en
heal.identifier.secondary 4960393 en
heal.publicationDate 2009 en
heal.abstract Detection of perceptually important video events is formulated here on the basis of saliency models for the audio, visual and textual information conveyed in a video stream. Audio saliency is assessed by cues that quantify multifrequency waveform modulations, extracted through nonlinear operators and energy tracking. Visual saliency is measured through a spatiotemporal attention model driven by intensity, color and motion. Text saliency is extracted from part-of-speech tagging of the subtitle information available with most movie distributions. The individual modality curves are integrated into a single attention curve, where the presence of an event may be signified in one or multiple domains. This multimodal saliency curve is the basis of a bottom-up video summarization algorithm that refines results from unimodal or audiovisual-based skimming. The algorithm performs favorably for video summarization in terms of informativeness and enjoyability. ©2009 IEEE. en
heal.journalName ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings en
dc.identifier.doi 10.1109/ICASSP.2009.4960393 en
dc.identifier.spage 3553 en
dc.identifier.epage 3556 en
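
The abstract describes integrating per-modality saliency curves into a single attention curve that then drives a summarization step. The record does not state the fusion rule or the selection rule, so the following is only a minimal sketch under assumptions: each curve is min-max normalized, the three curves are combined by a weighted linear sum, and the skim keeps a fixed fraction of the most salient frames. The function names `normalize`, `fuse` and `select_skim`, the equal weights, and the `keep_ratio` parameter are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def normalize(curve: Sequence[float]) -> List[float]:
    """Min-max scale a saliency curve to [0, 1]; a flat curve maps to zeros."""
    if not curve:
        return []
    lo, hi = min(curve), max(curve)
    if hi == lo:
        return [0.0] * len(curve)
    return [(v - lo) / (hi - lo) for v in curve]


def fuse(
    audio: Sequence[float],
    visual: Sequence[float],
    text: Sequence[float],
    weights: Tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3),
) -> List[float]:
    """Assumed fusion rule: weighted linear sum of normalized per-frame curves."""
    if not (len(audio) == len(visual) == len(text)):
        raise ValueError("all saliency curves must have the same length")
    curves = [normalize(c) for c in (audio, visual, text)]
    return [
        sum(w * c[i] for w, c in zip(weights, curves))
        for i in range(len(audio))
    ]


def select_skim(
    saliency: Sequence[float], keep_ratio: float = 0.2
) -> List[Tuple[int, int]]:
    """Assumed selection rule: keep the top fraction of frames by saliency.

    Returns contiguous runs of kept frames as (start, end) pairs, end exclusive.
    """
    n_keep = max(1, round(keep_ratio * len(saliency)))
    ranked = sorted(range(len(saliency)), key=lambda i: saliency[i], reverse=True)
    kept = sorted(ranked[:n_keep])
    segments: List[Tuple[int, int]] = []
    for i in kept:
        if segments and segments[-1][1] == i:
            # Frame i directly follows the last segment, so extend it.
            segments[-1] = (segments[-1][0], i + 1)
        else:
            segments.append((i, i + 1))
    return segments
```

Calling `select_skim(fuse(audio, visual, text), keep_ratio=0.3)` on three equally long per-frame curves would return the frame ranges to retain in the skim. A linear sum is only one possible choice; it lets an event that is salient in a single modality still raise the fused curve, which matches the abstract's remark that an event may be signified in one or multiple domains.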


Files in this item

There are no files associated with this item.
