Archive for December, 2008

Top posts of 2008

Posted: 31 December 2008 in Uncategorized

Looking back over 2008, there have been a lot of changes in my life. Many of those are reflected in my blog, but few are reflected in the posts that have gotten the most traffic. But for the hell of it, here are the top posts anyway.

Post                                   Hits in 2008
Old English Translator                 10,589
Christmas Tree 2007                    4,393
Steampunk Death Star                   1,362
Salad Fingers 8                        1,108
10 Reasons to Use Git for Research     1,032
Merge sort fun                         777
The Noob’s Guide to Parsing            774
Java Properties                        759
Ambigrams                              719
Substitution Ciphers                   680

Of all of those posts, the best one is hands down 10 Reasons to Use Git for Research. After that, the Noob’s Guide to Parsing. Some of the posts with the most hits are just link-sharing, where I saw something cool (Salad Fingers, Steampunk Death Star, Ambigrams) and then other people found my link first. One definite change on this blog was a decrease in the frequency of my posts. Around the end of last year, I was posting close to 2 items per day. Now it has stretched out to about 2 items per week. Maybe I’ll reflect more on that later.

I’ll leave you with these thoughts.

Television shows seldom get computer stuff right, so I shouldn’t be surprised.  But then I heard this humdinger on CSI New York during the 1 minute I was watching it.  After I simultaneously guffawed and snorted in derision, I changed the channel.

According to the somewhat suspect Definitions.net (suspect by default, since I haven’t evaluated it otherwise):

1. (noun) computational linguistics
the use of computers for linguistic research and applications

This particular definition came to my attention thanks to a Google alert and I thought it was about the shortest definition of computational linguistics I’ve ever seen. It might not be a half bad definition for telling friends and family what you do when you don’t want to see them go all glassy-eyed and start drooling on themselves. It’s certainly not a satisfying definition, though.

I happened on clerk dogs, a new movie recommender, the other day. They are still in beta and are missing data in many key areas of film, but they are definitely worth checking out. Like Pandora, clerk dogs uses human editors to classify movies along several dimensions. Indeed, the founder Stuart Skorman (also founder of Reel.com) calls it the movie genome project. Of course, another movie recommender (also still in beta) is using that term. Stuart goes on to say:

We have designed this innovative search engine for the movie buffs who have seen so many movies that they’re having a hard time finding new ones (or old ones) that they will really love. I hope you find hundreds of great movies!

Brazil

This is a problem I’ve been noticing with Netflix lately. If all I had to go on was Netflix’s recommendations, I would be pretty sure I’ve seen every sci-fi movie worth seeing. I gave clerk dogs a shot, starting with my favorite movie. They seem to have done a decent job of classifying Brazil, and a number of the similar movies they have listed are indeed similar to it in many ways. When I first visited the site, they showed the similar movies on a grid and said whether each one was “more dark”, “less disturbing”, “more violent”, and so on. If that functionality still exists, I can’t find it.

However, you can “Mash it” to find movies that fit your mood.  Pick your base movie and mash it.  Then change the sliding scale to decide what sort of differences you are looking for.  Can you say kickass?

I applaud clerk dogs for a job well done. I’ve already found a number of movies that Netflix was hiding from me. I added them to my Netflix queue, though, so I guess Netflix is still benefiting.

Nerd God

Posted: 13 December 2008 in Uncategorized


I am nerdier than 100% of all people. Are you a nerd? Click here to find out!

No comment.

My Christmas Wishlist

Posted: 12 December 2008 in Uncategorized

Just a few books and stuff I wouldn’t mind getting for Christmas.  Just sayin’.

Note:  sort by priority.

ReadWriteWeb has a post on Forrester Research’s study about consumer trust of information sources. It puts corporate and personal blogs at the very bottom (with 16% and 18% trust respectively), with personal email from a friend coming in at number one (with 77% trust). Forrester suggests that corporate blogs shut down shop unless their blog is doing a good job of generating goodwill and/or leads.

This study bothers me on many levels.  As Michael Bernstein points out in the comments:

“Trust” is a 4 or a 5 on a 5 point scale, that is, anything above neutral. This means that lots of people could slightly trust a source and it would show up above something which a smaller number of people trust quite a bit and others are neutral on.
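
To see why that matters, here is a minimal sketch with made-up numbers. Nothing below comes from the Forrester study: the two sets of ratings and the helper name `top_two_box` are invented purely for illustration. It scores two hypothetical sources the way the comment describes (share of 4s and 5s) and compares that with the plain average rating.

```python
from statistics import mean


def top_two_box(ratings):
    """Share of ratings that are a 4 or a 5 on a 5-point scale."""
    return sum(1 for r in ratings if r >= 4) / len(ratings)


# Hypothetical responses, invented for illustration (not Forrester's data).
# Source A: most people trust it slightly (4), the rest distrust it (2).
source_a = [4] * 70 + [2] * 30
# Source B: a smaller group trusts it a lot (5), everyone else is neutral (3).
source_b = [5] * 40 + [3] * 60

share_a = top_two_box(source_a)
share_b = top_two_box(source_b)
mean_a = mean(source_a)
mean_b = mean(source_b)

print(f"A: trusted by {share_a:.0%}, average rating {mean_a:.1f}")
print(f"B: trusted by {share_b:.0%}, average rating {mean_b:.1f}")
```

With these invented numbers, source A looks far more "trusted" by the percentage measure, even though source B gets the higher average rating. The ranking depends on where you draw the line.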

Also, the study compares information sources like email from friends and social networking profiles of friends to corporate and personal blogs. I ranted about this a bit on The Noisy Channel, which I’ll just reproduce here:

Comparing “personal blog” or some random “corporate blog” to “personal email sent from a friend” is pretty much like comparing “advice from gin-soaked hobo” to “what your mama always said.” The fact that Forrester can get away with presenting something like this and suggesting businesses act on it to shut down their blogs bothers me. It seems to me that 16-18% trustworthiness is not bad when you consider that much of the time you do a Google search for some product you hit a splog. That’s probably the only experience 80% of people have with blogs. Of course, that’s wild speculation, but this straw man study has gotten under my skin. :P And I do acknowledge that there is a huge amount of untrustworthy information in blogs, but I’m not sure that it’s much different from other user-generated content.

I agree that corporate blogs that are just reproductions of press releases (as Daniel Tunkelang at the Noisy Channel points out) are garbage. That is the wrong way to run a corporate blog. Google has a very good approach. They promote work they are doing by getting employees to blog about their personal projects (at least on the Google blogs I read; there are surely exceptions). It comes across as real and beneficial. The value is that they keep you up-to-date on what they are doing with actual content. When that changes to become shameless promotion and unveiled attempts to drive sales, the blog is going to suck. GitHub’s blog is another good example of a corporate blog done right.

Moving on, Daniel Tunkelang again offers some useful insight:

I think the interesting question for companies is not whether they should publish corporate blogs, but rather whether they should encourage their employees to publish personal blogs that relate to the work the company does. … I think that companies are often too conservative, and incur an enormous opportunity cost in the name of protecting trade secrets. Letting employees blog (and, more generally, publish) not only provides the companies with free marketing, but also provides employees with an avenue for personal development.

My cynicism prevents me from getting my hopes up here, but that would be nice.

Dogs don’t want to be left out

Posted: 9 December 2008 in Uncategorized

A recent finding by a University of Vienna team shows that dogs have a sense of fairness when it comes to getting treats. If you give one dog a treat in the presence of another and don’t give the second dog one, it knows you did it wrong. Yep, no surprise there.

And is it just me or is the dog in that article really a werewolf?

Hal Daume has a nice post that deals with credit in academia among other things.

What I took away from this comment is essentially the realization that we are all working toward some vague future goal, which has to do with computationalizing language processing (or some other topic, for the non-NLP audience). Progress is good. If I’ve done work that has something interesting and novel to say about this goal, then it’s not bad — and is often good — that this builds on and improves on your work.

So one illusion people have about science is that it is advanced by giant leaps.  An Einstein comes along and revolutionizes science.  If you live your life with this ambition, you will almost certainly end in disappointment.  Most advancements are small hops forward and are often multiply discovered.  Believing that you will somehow be the next Einstein will probably have the opposite effect in your life.

I am becoming more and more convinced that intelligence is a matter of hard work, dedication, and interest.  I’ve seen some pop science reporting that a growing body of research supports praising kids for their effort rather than telling them how special they are.  I can’t find the link at the moment, but that rings true to me.  I was always told I was the smartest person in the world as a child, and I think that made me intellectually lazy.  It took some pretty serious life mistakes to learn that so-called intelligence is more about effort.

Pernicious Spam

Posted: 4 December 2008 in Uncategorized

The spammers have been working hard to infiltrate Facebook.  I just got this (below) today, and it tripped my mental spam alarm.  These sorts of messages were commonplace on Friendster.  I would get messages from girls with near-pornographic profile pictures wanting to chat or asking me inane questions like which was the better hair color.  This is more insidious.

Insidious Facebook spam.

Insidious Facebook spam.

And for the record, I went to USC 2 ½ years ago.