You are currently browsing the category archive for the 'cmu' category.
This is research I did a while ago and presented Monday to fulfill the requirements of my Masters degree. The presentation only needed to be about 20 minutes, so it was a very short intro. We have moved on since then, so when I say future work, I really mean future work. The post is rather lengthy, so I have moved the main content below the jump.
I just completed the final requirements of my Masters degree today (the details of which I will save for a future post). It has been a difficult road, and I’m glad it’s done. I didn’t attend any sort of graduation ceremonies, because I don’t go for that sort of thing — at all. Until today, it didn’t feel like the weight was off my shoulders. Now I actually feel like celebrating! But I won’t, because I’m a nerd. I’m currently celebrating by working on a programming puzzle. And surfing the blagoblag.
I still have a couple months of servitude to complete the requirements of my fellowship, but the degree is mine.
Today is the official opening day of GWAP: Games with a Purpose. This is one of two research projects I have been working on for the past few months, though my involvement with GWAP so far has only been in the form of attending meetings, minor testing, and offering my sage gaming advice (and by sage, I mean the herb). GWAP is the next phase in Luis von Ahn’s human computation project. If you visit and play some games, not only will you be rewarded with a good time, but you’ll be helping science! Science needs you. To play games. Now.
The Idea
Artificial intelligence has come a long way, but humans are still far better at computers at simple, everyday tasks. We can quickly pick out the key points in a photo, we know what words mean and how they are related, we can identify various elements in a piece of music, etc. All of these things are still very difficult for computers. So why not funnel some of the gazillion hours we waste on solitaire into something useful? Luis has already launched a couple websites that let people play games while solving these problems. Perhaps you’ve noticed the link to Google Image Labeler on Google Image Search? That idea came from his ESP game (which is now on GWAP).
The Motivation
What researchers need to help them develop better algorithms for computers to do these tasks is data. The more data the better. Statistical machine translation has improved quite a bit over the past few years, in large part due to an increased amount of data. This is the reason why languages that are spoken by few people (even those spoken by as few as several million) still don’t have machine translation tools: there is just not enough data. More data means more food for these algorithms which means better results. And if results don’t improve, then we have learned something else.
The Solution
Multiple billions of hours are spent each year on computer games. If even a small fraction of that time were spent performing some task that computers aren’t yet able to do, we could increase the size of the data sets available to researchers enormously. Luis puts this all a lot better than I can, and fortunately, you can watch him on YouTube (below).
So, check it out already.
I attended some of the final presentations of an undergrad class on Game Programming today with a friend. We went in expecting something more like a poster session, where people are arrayed around a room showing their work off to a few people who managed to crowd around them. The poster session is ideal for brief browsing, because you can skip anything you’re not interested in. Instead, it was a series of power point presentations followed by an on-screen demo.
I’ve been messing around with recommender systems for the past year and a half, but not using the kNN (k-Nearest Neighbors) algorithm. However, my current homework assignment for my Information Retrieval class is to implement kNN for a subset of the Netflix Prize data. The data we are working with is about 800k ratings, which is slightly smaller than the MovieLens dataset, which was the previous dataset of choice for research on movie recommender systems. The entire Netflix data set dwarfs the MovieLens set by a factor of 100, so it is quickly replacing MovieLens in papers. The Netflix data is much sparser than the MovieLens, which changes things, as well.
kNN is a fairly simple machine learning algorithm to implement. On a dataset the size of Netflix, it’s still easy to do stupid things that cause it to take forever. Recommender systems typically match users to movies on a scale, which is the user’s rating for that item. In the case of Netflix, the scale is 1 (hate) to 5 (love). For the Netflix Prize, the goal is to guess user’s ratings on a hidden set as correctly as possible (according to root mean squared error (RMSE)). One way of approaching the problem is to create a user-items matrix where the rows correspond to a user, the columns to an item (movie) and the value in each cell is the user’s rating for that item. If the user has not rated the item, it is assigned a zero. Now, we can split this matrix up into vectors, where each row vector represents a user. kNN seeks to find similar users to other users (or similar movies to other movies) according to some metric over these vectors. I won’t bother going into the metrics in detail, but they include cosine similarity, Euclidean distance, and Pearson correlation. The ratings from the chosen k users are combined (either by simply averaging or using some weighted average) to form the prediction for a movie the user has not yet rated.
So on the test dataset for this assignment, I built three models that had the following RMSE scores:
| Model 1 | 0.9831 |
| Model 2 | 1.0371 |
| Model 3 | 0.9768 |
Just guessing the average rating for each movie gives an RMSE of 1.0 in this case, so Models 1 and 3 improve over the baseline, while Model 2 does worse. The best performing teams in the Netflix prize use ensemble methods to combine various models. The simple way to do this is just with a linear combination. So given models {m1, m2, m3} and weights {w1, w2, w3}, the ensemble prediction would be w1m1 + w2m2 + w3m3 (where w1+w2+w3=1.0). This was the first time I had tried ensembles with recommender systems as well, so imagine my surprise when I hit 0.9469 RMSE with my best choice of w’s. Of course, this is nowhere near the number needed to actually claim the prize, but it was a nice demonstration of the power of ensemble methods. I recommend checking out the proceedings of last year’s KDD Cup if you’re interested.
OpenEphyra is a question answering (QA) system developed here at the Language Technologies Institute by Nico Schlaefer. He began his work at the University of Karlsruhe in Germany, but has since continued it at CMU and is currently a PhD student here. Since it is a home-grown language technologies package, I decided to check it out and play around. This is the first QA system I have used that wasn’t integrated in a search engine, so this isn’t exactly an expert review.
Getting started in Windows (or Linux or whatever) is pretty easy if you already have Apache ant and Java installed. Ant isn’t necessary, but I recommend getting it if you don’t have it already. It’s just handy. First, download the OpenEphyra package from sourceforge. The download is about 59 MB and once it’s done unpack it in whatever directory you want. Assuming you have ant installed, all you have to do is type ant to build it, though you may want to issue ant clean first. I had to make one slight change to the build.xml file to get it to run, which was on line 55: <jvmarg line="-server& #13;-Xms512m& #13;-Xmx1024m"/>, which had to be changed to <jvmarg line="-server -Xms512m -Xmx1024m"/>. Easy enough. Then to run it, all you have to do is type ant OpenEphyra.
After taking a short bit to load up, you can enter questions on the command line. Based on what I can tell from the output, it begins by normalizing the question (removing morphology, getting rid of punctuation). Then it determines the type of answer it is looking for, like a person’s name or a place and assigns certain properties to what it expects to find. Next it automatically creates a list of queries that are sent to the search engine(s). The documentation indicates that the AQUAINT, AQUAINT-2 and BLOG06 corpora are included (at least preprocessing is supported), but there are searchers for Google, Wikipedia, Yahoo and several others. Indri is a search engine which supports structured queries and OpenEphyra auto-generates some structured queries from what I saw playing around. After generating the queries, they are sent to the various searchers and results are obtained and scored. Finally, if you’re lucky, you get an answer to your question.
Here are the results of screwing around with it for a few minutes:
- Who created OpenEphyra?
- no answer (sorry, Nico)
- Who invented the cotton gin?
- Eli Whitney
- Who created man?
- God
- What is the capital of Mongolia?
- Ulaanbaatar
- Who invented the flux capacitor?
- Doc Brown (awesome!)
- Who is the author of the Mendicant Bug?
- Zuckerberg — damn you, Facebook! :(
- How much wood can a woodchuck chuck?
- no answer (correct)
- What is the atomic number of Curium?
- 96 (also correct)
- Who killed Lord Voldemort?
- Harry (correct, but partial)
- How many rings for elven kings?
- 3021 (so, so very wrong)
Fun stuff! It’s not anywhere near perfect, but there are definite uses and the thing is ridiculously easy to install and use. Also, it’s in Java, so you can integrate it with your own system with very little effort. Depending on what sort of question you are looking for answers to, you get various levels of results. Factual questions about geography and people tend to do better than questions about numbers and fiction, as you might expect. Also, why-questions are not supported.
Another bonus is the project is open source, so if you’re into QA, you can help develop it.
I go through my spam everyday to make sure that false positives don’t get deleted. For whatever reason, stuff coming from the Help Desk at CMU gets labeled as spam a lot. I’m not saying it sounds like word salad (*cough*), but it trips off gmail’s spam sensors. The good thing about gmail is a low false negative rate, the bad thing is a fairly high false positive. And if you weren’t already aware, word salad is the name given to the jumble of unrelated, often obscure words that appear in a spam email to throw off spam filters.
The various spam messages I get never fail to amuse me in some way, so why not share them with you, my innocent reader would rather never see another spam title again? Ages ago, I was especially amused by two bits of spam that actually had lines from Robert Jordan’s Wheel of Time series as subject lines. I captured an image of the second one, but the first is lost forever and I haven’t noticed one since (click on it if it’s too small to read).
So the inaugural Spam of the Day (SOTD, rhymes with sotted):
“Try the new manpower candy!”
Ed Clarke, a professor of Computer Science at CMU, just won the 2007 ACM Turing Award. The ACM is the Association for Computing Machinery and is the oldest professional group for the computing industry. I first became a member in 2005 and have maintained that membership since. The Turing Award is given in honor of Alan Turing, the father of computer science (most would agree). This award is basically the Nobel prize of computer science (since they don’t give Nobels for CS) and is meant to recognize individuals who have made a lasting and significant contribution to the computing field.
Ed’s work was in conjunction with two other people: E. Allen Emerson and Joseph Sifakis. Their work was on model checking, which is a way of determining whether a hardware or software structure is a model of a logical formula. So if a structure matches a formula in propositional logic, it checks.
Clarke joins three other professors at CMU who are Turing recipients. Raj Reddy was co-awarded it in 1994 for large scale AI systems. Manuel Blum won it in 1995 for his work on computational complexity theory. Dana Scott won it in 1976 for non-deterministic finite state machines, something that has a major role in natural language processing (and computational linguistics).
The snow continues here. Today it was covering the road. When I took Daedalus and Willow out in the morning, they both weren’t having any of it. Daedal balked at the door and Willow was stepping gingerly and obvious wondering what had gone wrong with the world. They finally got used to it, though the poor boy was shivering his butt off after a short while. Willow was more in her element. We’re gonna have to get him paw gloves.
Here is the scene out my office window. They are currently building the new Computer Science Complex here. One of the buildings is the Gates Center. You can actually see shots from the live webcam 24/7, though the show is quite boring after about 4:30pm or so these days. What’s amazing to me is that people are out there working right now. In South Carolina, construction work ended as soon as the sky darkened and rain fell. If snow fell, it would be like the end of the world had come. I don’t think this makes them any faster, though. Another bizarre difference between construction crews here and there is that there are no hispanic people here. This is a very bad thing as it also means there’s crap for Mexican food. You could find good Mexican every time you turned around in SC.

I just attended a wedding yesterday and have been without Internet for way too long. Interestingly, my last post became wildly popular on both digg and stumbleupon. That post has more than doubled the total traffic of all time to my blog, bringing it to over 10k hits. As of the writing of this post, the post has gotten 4,060 hits today alone, about 550 yesterday and 990 Friday. Crazy. Thank you, whoever put up that sign. On a side note to anyone who might be wondering, CMU uses blue cannisters for recycling. There isn’t normally a cannister in that location and people were standing around it the past couple days before it, so anyone who passes through there on a regular basis would know not to use it. Some funny comments on the post and on digg.
Anyhow, at the wedding, the best man gave the traditional toast. The couple being married had dated for about 10 years, so it was a pretty special wedding to a lot of people. The best man made the point that, in marriage, it’s easy to fall in love with somebody, but a happy marriage comes from growing that love. As your love grows, you’ll look back and see your wedding day as the day you loved each other the least. He concluded with the following line (my paraphrase since I can’t remember exactly):
“May the best day of your past be the worst day of the rest of your life.”
It got a round of laughs, but it’s also a great way to put it.
So yesterday was the big race for the DARPA Urban Challenge. The goal is to research technologies that will lead to autonomous battlefield robots that can deliver supplies while navigating traffic. The joint Carnegie Mellon and General Motors team won, completing the race with no major traffic infractions. This strikes me as one of those technologies that in 20 years no one will realize had military origins. We’ll all happily get in our inexpensive robotic taxis running on electricity.

I mentioned TGs in a previous post. To refresh your memory, they are the school of computer science’s big departmental parties (actually thrown by the student organization, Dec/5). The name has been generalized to any party thrown in the department. Today’s was the Smiley TG, celebrating the birthday of the smiley, invented by Scott Fahlman, a professor in my department. Planned as well as all things are, today is also the annual picnic for my department! So I went to the TG, where I was almost completely without friends. So I drank a Dogfish Head 60 Minute IPA and left. But not before grabbing some swag and snapping a pic of the pandemonium.


A professor in my department (primarily affiliated with the LTI and the CSD, but also the MLD and HCII) invented the smiley 25 years ago on a bboard here at CMU. The fateful message that spawned the smiley is reproduced below [source]:
19-Sep-82 11:44 Scott E Fahlman :-)From: Scott E Fahlman <Fahlman at Cmu-20c>I propose that the following character sequence for joke markers::-)Read it sideways. Actually, it is probably more economical to markthings that are NOT jokes, given current trends. For this, use:-(
In honor of the 25th birthday of the smiley, the CS department is holding a TG where Scott will inaugurate an annual Smiley Prize. I’m probably not going to be able to make that, though, on account of other engagements.
The 9th annual Conference on Intelligent Text Processing and Computational Linguistics is calling for papers. The conference will be held in Haifa, Israel from February 17-23, 2008. Two of the keynote speakers were professors of mine last semester. Alon Lavie works on machine translation (including for low-resource languages) and machine translation evaluation. Kemal Oflazer was a visiting professor here at CMU last semester and advised me for a lab project on Old English morphology. He’s one of the leading proponents of finite state technology for morphological analysis and has done a lot of work with Turkish, which has a very rich morphology.
Submission Deadline: October 15, 2007.
The full CFP is given below:
Tonight was the reception “TG” for new students of the Language Technologies Institute. TG is the general term given to parties thrown in the department of computer science at CMU. Typically TGs are thrown on Fridays by the student organization Dec/5. So TG as in TGIF.
So anyhow, I tried a new beer that I hadn’t heard of before called Leinenkugels Sunset Wheat. I was given a tip prior to coming to the TG that Leinenkugels was the mystery beer. After a quick search on ratebeer.com, I found that Leinenkugel has one good beer and a whole lot of crap. Smart money was on the crap being the pick. Unfortunately, this was correct. I even guessed the correct specimen of crap: the Sunset Wheat.
So I decided to taste it so I could add another beer to my rating list. My reaction was immediate. Here is the comment I posted on the ratebeer page for this beer:
“I was immediately struck by a similarity to Flintstones vitamins. It was so intense I exclaimed. This beer was really atrocious. On the up side, it didn’t linger very long on the palate. This is a drain pour.”
I stand by this assessment. After I posted my comment I perused the other comments to see if people agreed with me. What I found was great. A few of the tastier specimens I came across are below:
“There are aromas & flavors in this that just don’t belong in a beer. Tastes like the milk left over after eating a bowl of Fruit Loops was poured into my beer. And worse, the fruit taste is very artificial. Maybe it wouldn’t be bad on a hot day, but I think i’m going to avoid this one for now.
“If I got really drunk on beer Friday, woke up on Saturday and ate some Fruity Pebbles then threw up…this is what it would taste like.”
“On tap. Smell is 100% Fruity Pebbles Cereal. It is uncanny - as though it is liquid Fruity Pebbles. Cloudy golden appearance with no lastin head. Reminds me more of a Hefe than an American Wheat. Really distinctive orange flavor. Almost tastes like a candy beer. Very sweet taste.”
“not so good. Had a flavor of jelly beans totally nasty. Might be the worst beer Ive tasted”
Scanning just a few more pages of comments I found 8 more references to Fruity Pebbles. I think the key here is that distinct artificial fruit flavor they put in jelly beans, flintstones vitamins and that cereal. So yeah, it sucked.







