Posts Tagged ‘memes’

No this isn’t a list of the top 25 memes of all time. It’s a post inspired by the “Top 25″ memes that have been going around Facebook, MySpace, the intarwebs, etc. These memes are fairly simple, as many are. List your top 25 favorite albums in your collection. List your top 25 books. List your top 25 experiences with acne. The lists go on. But, how would you go about producing such a list? Do you just have the top 25 albums in your collection ranked and ready in your mind? Or do you have a crapload of albums and figuring out the top 25 would be a non-trivial task?

A friend approached me with this very problem and he brought up using binary comparisons. I have talked about binary comparisons when dealing with machine translation evaluation before, and many in the academic community have begun to appreciate their value for eliciting human preferences. Performing evaluations using binary comparisons is straightforward: give the evaluator pairs of items and ask which is better according to some subjective metric. In the case of machine translation evaluation, the task is to decide which translation is more human-like.

It ought to be obvious that a major problem with binary comparisons in a data set of any appreciable size is the massive number of comparisons you will need in order to rank every item in the set. For the sake of argument, let’s assume that your comparisons never contradict themselves. Under our assumption, the following does not occur: A is rated better than B, B is rated better than C, and C is rated better than A. This lets us bound the amount of comparisons to O(n log n), since we are basically dealing with a sorting algorithm using the human as the comparison operator. This was my friend’s idea and I really liked it.

But let’s say you have 1000 albums. 1000 * log (1000) = 9966 comparisons! That’s just too much work. Maybe we ought to just rate them on a scale of 1-100 (or whatever) and order them using that. It’s not perfect, but it’s ten times less work. But what if you were dedicated and actually wanted a good ranking?

Binary comparisons lend themselves to binary trees and a binary search tree would be ideal, so heapsort seems like a natural sorting algorithm. But we’re still stuck with O(n log n) to build it. Returning to the idea of “top 25″ memes, we don’t need the ordering of the entire set, just the top 25 items, so we can surely get rid of some of the work.

So rather than doing a full heapsort algorithm, I figured why not use an AVL tree. We don’t need to keep the entire tree balanced — we only need to worry about the right branch off the root which can be limited to 25 nodes. The algorithm is fairly simple (pseudocode):

  • for each item:
  • compare item to root node:
  • if item < root then
    • discard
  • else
    • add to right branch
    • balance the right branch
    • if the number in the right branch exceeds 25, replace the root node with the lowest value and delete that node.

This greatly reduces the number of comparisons needed since you very quickly bubble the best items into the top 25 branch. When a new item is being added to the top 25, only 4-5 comparisons need be made to determine its place. By continuously balancing the right branch, we can keep the height of that side as low as possible on average in order to prevent too many comparisons when new items are added to it.

I couldn’t figure out an easy way to compute the big O for this, so I ran some simulations instead. For a set of 1000 items, on average the user will have to perform 1528 +/- 2 comparisons to produce the top 25 items.

The code for the tree can be found in this gist. The nasty $counter global is just there to help me count the number of comparisons that are being performed for the purposes of the simulation. Some of the AVL Tree code was inspired by Bruno Preiss’s book on algorithms and data structures.

Top 48 Sci-Fi Adaptations

Posted: 26 August 2008 in Uncategorized
Tags: , , , ,

I just saw this meme on Pat’s Fantasy Hotlist and since he didn’t tag anyone, and I’ve never done this sort of thing before, I figured what the heck.  Just as he tagged no one, nor will I.  This is for my fleeting amusement and as an escape from the joys of moving.  The chain-letter-like aspect of “tagging” people is somewhat repulsive to me.

From Box Office Mojo’s list of Top 48 Sci-Fi Films Based on a Book (or Story) (1980- present). Some of the titles on this list look suspicious. (Was Cocoon really based on a piece of written fiction? There’s a difference between an adaptation and a novelization.)

Here are the rules.

- Copy the list below.
- Mark in bold the movie titles for which you read the book.
- Italicize the movie titles for which you started the book but didn’t finish it.
- Tag 5 people to perpetuate the meme. (You may of course play along anyway.)

And now, the list…

1. Jurassic Park
2. War of the Worlds
3. The Lost World: Jurassic Park
4. I, Robot
5. Contact
6. Congo
7. Cocoon
8. The Stepford Wives
9. The Time Machine
10. Starship Troopers
11. The Hitchhiker’s Guide to the Galaxy
12. K-PAX
13. 2010
14. The Running Man
15. Sphere
16. The Mothman Prophecies
17. Dreamcatcher
18. Blade Runner(Do Androids Dream of Electric Sheep?)
19. Dune
20. The Island of Dr. Moreau
21. Invasion of the Body Snatchers
22. The Iron Giant(The Iron Man)
23. Battlefield Earth
24. The Incredible Shrinking Woman
25. Fire in the Sky
26. Altered States
27. Timeline
28. The Postman
29. Freejack(Immortality, Inc.)
30. Solaris
31. Memoirs of an Invisible Man
32. The Thing(Who Goes There?)
33. The Thirteenth Floor
34. Lifeforce(Space Vampires)
35. Deadly Friend
36. The Puppet Masters
37. 1984
38. A Scanner Darkly
39. Creator
40. Monkey Shines
41. Solo(Weapon)
42. The Handmaid’s Tale
43. Communion
44. Carnosaur
45. From Beyond
46. Nightflyers
47. Watchers
48. Body Snatchers

I’ve read most of the books on that list that I’d consider worth reading.  Any recommendations for the ones I haven’t?

Parking Fail

Posted: 3 March 2008 in Uncategorized
Tags: , , , , ,

Perhaps you’ve seen the “FAIL” meme that’s going around the tubes. If not, search on google images for “epic fail” and you’ll hit it soon enough (I won’t link to them). Basically it’s a bunch of pictures of people screwing up or things built wrong or whatever. In short: failures. So tonight when we parked outside of a restaurant and our parking meter said this, I had to get a shot.

Parking FAIL

Perfect Major?

Posted: 18 November 2007 in Uncategorized
Tags: , , , , , , ,

The intertubes are full of quizzes. Magazines like Cosmo have thrived on them for years. “Are you a good lover?” Websites like Tickle pretty much consist of nothing else (and I haven’t bothered beyond the odd quiz someone sends me). Tons of Facebook apps like Flixster (movies) and Harry Potter rely on them heavily. One of my google alerts is for linguistics and I saw some random 14-year-old dude‘s blog post about his perfect major according to this quiz. My results are below the jump.

So of course everyone with sense knows these quizzes are pretty much random. However, they also collect a vast amount of data. What they don’t collect (usually) is actual information about the people who take their quizzes. Imagine if at the end of a quiz there was a question or two about the actual truth of the thing the quiz is predicting. What kind of lover are you? Well just ask! If the result is similar to the quiz results, you can gauge how well your quiz is classifying people. It may not produce scientifically valid results but it does produce results that are better than nothing. (more…)

What people hear

Posted: 1 October 2007 in Uncategorized
Tags: , , , , , ,

While on the topic of presentations, I came across this video in the archives of Presentation Zen and then again on Bad Astronomy the same day. Coincidence or some hidden memetic process?

I think it’s an awesome example of how the worst PowerPoint presentations actually come across: as messages with zero entropy (that is, no information).