In my previous post about sentiment polarity, I talked about results from Pang et al. (2002). One of the conclusions in that paper was that the presence of sentiment words led to better classification results than the frequency of words. In my experiment in that post, I used tf-idf, a frequency-based measure. A few days ago, when I woke up way too early, I ran some additional experiments using presence (binary) weights. The result was a slight improvement over tf-idf: 86.1% versus 85.7%. If we ignore document frequency and just use raw term frequency, the results are terrible: about 76%. So presence is much better than term frequency, but only marginally better than tf-idf.
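To make the three weighting schemes concrete, here is a minimal sketch in plain Python. The function name `weight_documents` and the toy documents are made up for illustration, and the idf variant (log of N over document frequency) is an assumption, since the post does not say which formula was used.

```python
import math
from collections import Counter

def weight_documents(docs, scheme="presence"):
    """Turn tokenized documents into sparse {term: weight} dicts.

    scheme is one of:
      "presence": 1.0 if the term occurs in the document
      "tf":       raw count of the term in the document
      "tfidf":    count * log(N / df)  (assumed idf variant)
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        if scheme == "presence":
            vec = {t: 1.0 for t in counts}
        elif scheme == "tf":
            vec = {t: float(c) for t, c in counts.items()}
        elif scheme == "tfidf":
            vec = {t: c * math.log(n_docs / df[t]) for t, c in counts.items()}
        else:
            raise ValueError(f"unknown scheme: {scheme}")
        vectors.append(vec)
    return vectors

# Toy corpus, purely illustrative.
docs = [
    "great great great movie".split(),
    "terrible movie".split(),
]
presence = weight_documents(docs, "presence")
tf = weight_documents(docs, "tf")
tfidf = weight_documents(docs, "tfidf")
```

With presence weights, the repeated "great" counts once; with term frequency it counts three times; with this idf variant, "movie" gets weight zero because it appears in every document.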
Or is it? Even more experiments with tf-idf produced an accuracy of 86.8%. Just so we’re clear, all of this is based on 10-fold cross-validation on the Pang and Lee (2004) data set. This seems to contradict the Pang et al. (2002) conclusion. Of course, I wasn’t able to reproduce their results identically, even though I am using the folds exactly as they described. This may be due to a pre-processing step I am skipping (or an extra one I am doing). They mention length-normalizing the vectors, which I don’t usually bother with. It’s an oft-suggested thing to do with SVMs, but I have yet to have it actually help me.
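For readers unfamiliar with the evaluation setup, cross-validation over predefined folds can be sketched as below. This is only an illustration: `cross_validate`, `train_majority`, and `predict_majority` are hypothetical names, the majority-class "classifier" is a stand-in for the SVM, and the tiny folds are invented data, not the Pang and Lee folds.

```python
from collections import Counter

def cross_validate(folds, train, predict):
    """Mean accuracy over predefined folds.

    folds:   list of folds, each a list of (features, label) pairs
    train:   callable(list of (features, label)) -> model
    predict: callable(model, features) -> label
    """
    accuracies = []
    for i, test_fold in enumerate(folds):
        # Train on every fold except the one held out for testing.
        train_set = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        model = train(train_set)
        correct = sum(predict(model, x) == y for x, y in test_fold)
        accuracies.append(correct / len(test_fold))
    return sum(accuracies) / len(accuracies)

# Stand-in classifier: always predicts the most common training label.
def train_majority(examples):
    return Counter(label for _, label in examples).most_common(1)[0][0]

def predict_majority(model, features):
    return model

# Three invented folds of two documents each.
folds = [
    [({"good": 1.0}, "pos"), ({"fun": 1.0}, "pos")],
    [({"fine": 1.0}, "pos"), ({"dull": 1.0}, "neg")],
    [({"best": 1.0}, "pos"), ({"nice": 1.0}, "pos")],
]
accuracy = cross_validate(folds, train_majority, predict_majority)
```

Keeping the fold assignment fixed, rather than reshuffling per run, is what makes accuracy figures comparable across weighting schemes.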
So I tried normalizing. It hurt the tf-idf results, dropping accuracy from 86.8% to 86.6%. It made no difference for presence, which stayed at 86.1%. No surprises there.
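Length normalization here means scaling each document vector to unit length. A minimal sketch, assuming Euclidean (L2) length and the sparse dict representation; the helper name `l2_normalize` and the example vectors are illustrative only:

```python
import math

def l2_normalize(vec):
    """Scale a sparse {term: weight} vector to unit Euclidean length."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    if norm == 0.0:
        return dict(vec)  # leave empty or all-zero vectors unchanged
    return {t: w / norm for t, w in vec.items()}

# Two presence-weighted reviews of different lengths (invented).
short_review = {"great": 1.0, "movie": 1.0}
long_review = {"great": 1.0, "movie": 1.0, "plot": 1.0, "acting": 1.0}

short_unit = l2_normalize(short_review)
long_unit = l2_normalize(long_review)
```

After normalization, each term in the longer review carries less weight than the same term in the shorter one, so document length no longer inflates the vector's magnitude.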
My results contradict Pang et al. (2002) in that tf-idf (frequency-based) outperforms presence. If I made a mistake, where was it? I wish their source code were available. I guess I could always ask. There is usually some voodoo involved that isn’t obvious (to me) in the paper. This is a-whole-nother topic, one discussed with far more eloquence (pdf warning) by Ted Pedersen in the latest issue of Computational Linguistics.
References
Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan. “Thumbs Up? Sentiment Classification Using Machine Learning Techniques.” In Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing, Volume 10, July 2002.
Bo Pang and Lillian Lee. “A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts.” In Proceedings of the ACL, 2004.



[...] October 2008 in computational linguistics Well to answer the question I posed in my last post: presence is indeed better than frequency! My previous experiments led me to the opposite [...]
You should probably check one of the older papers about document classification (I think it was by D. Lewis in the 90s)?
If I understand correctly, sentiment classification is like document classification in some sense: the input is the same, only the output is not exactly a topic but the polarity of the document.
Anyway, in that paper they give an overview of all kinds of representations (tf-idf, frequency, presence, etc.) and *maybe* explain why or when one might be better than the other…
By the way – I just tried to post a comment, and I did not enter my details. The error made me type the whole comment again (when I clicked back on the browser)! :) something for your attention…
In my experiments, presence seems to be outperforming tf-idf too (87% vs. 85%). But I have not normalized tf-idf.