Who created OpenEphyra?

Posted: 16 February 2008 in Uncategorized

OpenEphyra is a question answering (QA) system developed here at the Language Technologies Institute by Nico Schlaefer. He began his work at the University of Karlsruhe in Germany, but has since continued it at CMU and is currently a PhD student here. Since it is a home-grown language technologies package, I decided to check it out and play around. This is the first QA system I have used that wasn’t integrated in a search engine, so this isn’t exactly an expert review.

Getting started in Windows (or Linux or whatever) is pretty easy if you already have Apache Ant and Java installed. Ant isn’t necessary, but I recommend getting it if you don’t have it already. It’s just handy. First, download the OpenEphyra package from SourceForge. The download is about 59 MB; once it’s done, unpack it in whatever directory you want. Assuming you have Ant installed, all you have to do is type ant to build it, though you may want to run ant clean first. I had to make one slight change to the build.xml file to get it to run. Line 55 read <jvmarg line="-server&#13;-Xms512m&#13;-Xmx1024m"/>, and I had to replace the &#13; entities with plain spaces: <jvmarg line="-server -Xms512m -Xmx1024m"/>. Easy enough. Then to run it, all you have to do is type ant OpenEphyra.

After taking a short bit to load up, you can enter questions on the command line. Based on what I can tell from the output, it begins by normalizing the question (removing morphology, getting rid of punctuation). Then it determines the type of answer it is looking for, like a person’s name or a place, and assigns certain properties to what it expects to find. Next it automatically creates a list of queries to send to the search engine(s). The documentation indicates that the AQUAINT, AQUAINT-2 and BLOG06 corpora are included (or at least that preprocessing for them is supported), and there are also searchers for Google, Wikipedia, Yahoo and several others. Indri is a search engine that supports structured queries, and from what I saw playing around, OpenEphyra auto-generates some structured queries. Once the queries are generated, they are sent to the various searchers, and the results that come back are scored. Finally, if you’re lucky, you get an answer to your question.
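To make those stages a bit more concrete, here is a toy sketch in Java of the first three: normalizing the question, guessing the answer type, and generating queries. This is not OpenEphyra's code or API. The class name, method names, stop-word list and answer-type labels are all made up for illustration, and the normalization only lowercases and strips punctuation (no morphology).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Toy illustration of the stages described above; NOT OpenEphyra's actual code. */
public class QaPipelineSketch {

    // Hypothetical stop-word list, chosen only for this example.
    private static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList(
            "who", "what", "where", "when", "how", "is", "the", "of", "a", "an"));

    /** Stage 1: lowercase the question and strip punctuation (no stemming here). */
    static String normalize(String question) {
        return question.toLowerCase(Locale.ROOT)
                .replaceAll("[^\\p{L}\\p{N}\\s]", "") // drop anything that is not a letter, digit or space
                .replaceAll("\\s+", " ")              // collapse runs of whitespace
                .trim();
    }

    /** Stage 2: guess the expected answer type from the question word (assumed labels). */
    static String answerType(String normalized) {
        if (normalized.startsWith("who ")) return "PERSON";
        if (normalized.startsWith("where ")) return "LOCATION";
        if (normalized.startsWith("when ")) return "DATE";
        if (normalized.startsWith("how many ") || normalized.startsWith("how much ")) return "NUMBER";
        return "UNKNOWN";
    }

    /** Stage 3: build a bag-of-keywords query and an exact-phrase variant. */
    static List<String> queries(String normalized) {
        List<String> keywords = new ArrayList<>();
        for (String token : normalized.split(" ")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                keywords.add(token);
            }
        }
        String joined = String.join(" ", keywords);
        List<String> result = new ArrayList<>();
        result.add(joined);
        result.add("\"" + joined + "\"");
        return result;
    }

    public static void main(String[] args) {
        String q = normalize("Who invented the cotton gin?");
        System.out.println(q);
        System.out.println(answerType(q));
        System.out.println(queries(q));
    }
}
```

A real system would send each query to its searchers and score the returned snippets against the expected answer type. That part is left out here because it depends on the search backends.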

Here are the results of screwing around with it for a few minutes:

  • Who created OpenEphyra?
    • no answer (sorry, Nico)
  • Who invented the cotton gin?
    • Eli Whitney
  • Who created man?
    • God
  • What is the capital of Mongolia?
    • Ulaanbaatar
  • Who invented the flux capacitor?
    • Doc Brown (awesome!)
  • Who is the author of the Mendicant Bug?
    • Zuckerberg — damn you, Facebook! :(
  • How much wood can a woodchuck chuck?
    • no answer (correct)
  • What is the atomic number of Curium?
    • 96 (also correct)
  • Who killed Lord Voldemort?
    • Harry (correct, but partial)
  • How many rings for elven kings?
    • 3021 (so, so very wrong)

Fun stuff! It’s not anywhere near perfect, but there are definite uses and the thing is ridiculously easy to install and use. Also, it’s in Java, so you can integrate it with your own system with very little effort. How good the results are depends on what sort of question you ask. Factual questions about geography and people tend to do better than questions about numbers and fiction, as you might expect. Also, why-questions are not supported.

Another bonus is that the project is open source, so if you’re into QA, you can help develop it.

  1. [...] when OpenEphyra is given the question What is the origin of the word deuce? the answer is “Watkins.”  [...]

  2. Josh says:

    Hmmm… My results are pretty mixed. I was a bit disappointed after seeing what it was able to answer for you:

    Who is the father of Luke Skywalker? Jedi
    What is the answer to life, the universe, and everything? Douglas Adams
    What color is Darth Vader’s helmet? Black (correct!)
    Who killed John Lennon? Mark David Chapman (correct!)
    How many heads does a hydra have? Iolaus (??)
    What is the next highest rank in the US Navy above Lieutenant? Admiral of the Fleet (expecting Lieutenant Commander or Commander – Fleet Admiral is, in fact, the very highest rank among Naval officers)
    Where is Amarillo? Texas (correct!)
    Who wrote “The Well-Tempered Clavier”? Bach (correct!)

  3. Jason Adams says:

    Who is Iolaus? Michael Hurst
    Who is Michael Hurst? Kevin Sorbo
    Who is Kevin Sorbo? Gene Roddenberry
    Who is Gene Roddenberry? Eugene Wesley
    Who is Eugene Wesley? Gene Roddenberry

    We found a cycle!

  4. Murtoza Habib says:

    I cannot run the program from my office. It cannot search the web. I think some port is blocked by the server. Can you tell me which port this software is using? Is there any way to change the port and still use the software?

    Thanks in advance.


  5. Jason Adams says:

    Not sure. I didn’t see any obvious place for changing it, but it’s not my software (and I haven’t looked at the code in depth). Try asking on the sourceforge forum: http://sourceforge.net/forum/forum.php?forum_id=781452

    Good luck!

  6. Kartikeya says:

    It did better than I thought it would, but definitely far from being useful without supervision and/or sophisticated filtering.
