Mine the Easy, Classify the Hard: A Semi-Supervised Approach to Automatic Sentiment Classification

Sajib Dasgupta and Vincent Ng.
ACL-IJCNLP 2009: Proceedings of the Main Conference, pp. 701-709, 2009.


Abstract

Supervised polarity classification systems are typically domain-specific. Building these systems involves the expensive process of annotating a large amount of data for each domain. A potential solution to this corpus annotation bottleneck is to build unsupervised polarity classification systems. However, unsupervised learning of polarity is difficult, owing in part to the prevalence of sentimentally ambiguous reviews, where reviewers discuss both the positive and negative aspects of a product. To address this problem, we propose a semi-supervised approach to sentiment classification where we first mine the unambiguous reviews using spectral techniques and then exploit them to classify the ambiguous reviews via a novel combination of active learning, transductive learning, and ensemble learning.
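As a rough illustration of the first step only (separating unambiguous from ambiguous reviews with a spectral technique), the sketch below scores each document by the magnitude of its entry in the second eigenvector of a normalized graph Laplacian and treats the highest-magnitude documents as "easy". This is not the algorithm from the paper: the helper name `split_easy_hard`, the `n_easy` parameter, the cosine-similarity graph, and the toy term counts are all assumptions made for the example, and the active, transductive, and ensemble learning stages are not shown.

```python
import numpy as np

def split_easy_hard(X, n_easy):
    """Hypothetical helper (not from the paper): split documents into
    'easy' and 'hard' index sets using one spectral embedding dimension.

    X is an (n_docs, n_features) non-negative term-count matrix.
    """
    X = np.asarray(X, dtype=float)

    # Cosine similarity between documents, with self-similarity zeroed out.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.where(norms == 0, 1.0, norms)
    S = Xn @ Xn.T
    np.fill_diagonal(S, 0.0)

    # Symmetric normalized Laplacian: L = I - D^{-1/2} S D^{-1/2}.
    d = S.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.where(d == 0, 1.0, d))
    L = np.eye(len(S)) - d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]

    # eigh returns eigenvalues in ascending order, so column 1 holds the
    # eigenvector of the second-smallest eigenvalue.
    _, vecs = np.linalg.eigh(L)
    v = vecs[:, 1]

    # Documents far from zero along v are treated as unambiguous (assumption).
    order = np.argsort(-np.abs(v))
    easy = np.sort(order[:n_easy])
    hard = np.sort(order[n_easy:])

    # The sign of v gives a two-way grouping of the easy documents; which
    # group is "positive" is arbitrary because eigenvector signs are arbitrary.
    clusters = (v[easy] > 0).astype(int)
    return easy, hard, clusters

# Toy term counts (made up): rows 0-1 use only the first two words,
# rows 2-3 only the last two, and rows 4-5 mix both vocabularies.
X = np.array([
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [0, 0, 3, 2],
    [0, 0, 2, 3],
    [2, 1, 1, 2],
    [1, 2, 2, 1],
])
easy, hard, clusters = split_easy_hard(X, n_easy=4)
print("easy:", easy, "hard:", hard, "clusters:", clusters)
```

On this toy input the two mixed rows land in `hard`, while the four one-sided rows land in `easy` and split into two groups. In a fuller pipeline, the easy documents with their cluster assignments would be the natural seed data for classifying the hard ones.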

BibTeX entry

@InProceedings{Dasgupta+Ng:09a,
  author = {Sajib Dasgupta and Vincent Ng},
  title = {Mine the Easy, Classify the Hard: A Semi-Supervised Approach to Automatic Sentiment Classification},
  booktitle = {ACL-IJCNLP 2009: Proceedings of the Main Conference},
  pages = {701--709},
  year = 2009
}