Content vs. context for sentiment analysis: A comparative analysis over microblogs

dc.contributor.author Aisopos, F en
dc.contributor.author Papadakis, G en
dc.contributor.author Tserpes, K en
dc.contributor.author Varvarigou, T en
dc.date.accessioned 2014-03-01T02:53:35Z
dc.date.available 2014-03-01T02:53:35Z
dc.date.issued 2012 en
dc.identifier.uri http://hdl.handle.net/123456789/36434
dc.subject N-gram graphs en
dc.subject Sentiment analysis en
dc.subject Social context en
dc.subject.other Classification methods en
dc.subject.other Comparative analysis en
dc.subject.other Content-based features en
dc.subject.other Context-based en
dc.subject.other Dimensionality reduction en
dc.subject.other Discretizations en
dc.subject.other Extraction costs en
dc.subject.other Inherent characteristics en
dc.subject.other Micro-blog en
dc.subject.other Multiple Classification en
dc.subject.other N-gram graphs en
dc.subject.other Noise-Tolerant en
dc.subject.other Real world data en
dc.subject.other Sentiment analysis en
dc.subject.other Social context en
dc.subject.other Time efficiencies en
dc.subject.other Traditional techniques en
dc.subject.other Hypertext systems en
dc.subject.other Virtual reality en
dc.subject.other Data mining en
dc.title Content vs. context for sentiment analysis: A comparative analysis over microblogs en
heal.type conferenceItem en
heal.identifier.primary 10.1145/2309996.2310028 en
heal.identifier.secondary http://dx.doi.org/10.1145/2309996.2310028 en
heal.publicationDate 2012 en
heal.abstract Microblog content poses serious challenges to the applicability of traditional sentiment analysis and classification methods, due to its inherent characteristics. To tackle them, we introduce a method that relies on two orthogonal, but complementary sources of evidence: content-based features captured by n-gram graphs and context-based ones captured by polarity ratio. Both are language-neutral and noise-tolerant, guaranteeing high effectiveness and robustness in the settings we are considering. To ensure our approach can be integrated into practical applications with large volumes of data, we also aim at enhancing its time efficiency: we propose alternative sets of features with low extraction cost, explore dimensionality reduction and discretization techniques and experiment with multiple classification algorithms. We then evaluate our methods over a large, real-world data set extracted from Twitter, with the outcomes indicating significant improvements over the traditional techniques. Copyright 2012 ACM. en
heal.journalName HT'12 - Proceedings of 23rd ACM Conference on Hypertext and Social Media en
dc.identifier.doi 10.1145/2309996.2310028 en
dc.identifier.spage 187 en
dc.identifier.epage 196 en
