NLP Resources for Ruby

There are quite a few well-known libraries for doing various NLP tasks in Java and Python, such as the Stanford Parser (Java) and the Natural Language Toolkit (Python).  For Ruby, there are a few resources out there, but they are usually derivative or not as mature.  By derivative, I mean they are ports from other languages or extensions using code from another language.  And I’m responsible for two of them! :)

There are also a number of fledgling or orphaned projects out there purporting to be ports of, or interfaces to, other libraries such as the Stanford POS Tagger and Named Entity Recognizer.  Ruby (straight Ruby, not just JRuby) can interface with just about any Java library using the Ruby Java Bridge (RJB).  RJB can be a pain, and it has some limitations: I could only initialize it once per run (a second attempt never succeeds).  But using it, I was able to interface with the Stanford POS Tagger easily.
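Since the once-per-run limitation is easy to trip over, here is a minimal sketch of one way to guard against it. The JavaBridge module and its load_once method are names I made up for illustration, not part of RJB; the only RJB calls assumed are Rjb.load(classpath, jvm_args) and Rjb.import(class_name). The jar and model paths in the comments are placeholders, and the exact tagger API depends on which Stanford release you have.

```ruby
# Hypothetical helper (not part of RJB): makes sure the JVM is started
# at most once per process, since a second initialization did not work for me.
module JavaBridge
  @loaded = false

  # classpath: jar paths joined with File::PATH_SEPARATOR
  # jvm_args:  array of JVM options, e.g. ['-Xmx512m']
  # loader:    injectable callable, mainly so the guard can be tried without a JVM
  def self.load_once(classpath, jvm_args = [], loader: nil)
    return false if @loaded # already started; do not attempt a second load

    loader ||= lambda do |cp, args|
      require 'rjb'         # needs the rjb gem and a JDK (JAVA_HOME set)
      Rjb.load(cp, args)
    end
    loader.call(classpath, jvm_args)
    @loaded = true
  end

  def self.loaded?
    @loaded
  end
end

# Example use (placeholder paths; adjust to your own install):
#   JavaBridge.load_once('stanford-postagger.jar', ['-Xmx512m'])
#   MaxentTagger = Rjb.import('edu.stanford.nlp.tagger.maxent.MaxentTagger')
#   tagger = MaxentTagger.new('path/to/model.tagger')
```

Any later call to JavaBridge.load_once simply returns false instead of trying to start the JVM again, so the rest of the program can call it wherever it is convenient.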

So while there aren’t terribly many libraries for NLP tasks in Ruby, the ability to interface with Java directly widens the scope quite a bit.  You can also incorporate a C library using extensions.

Naturally, if I missed anything, no matter how small, please let me know.

4 responses to this post.


  2. Thank you so much for compiling this list. I just discovered your blog and will spend a lot of time reading it in the next few months :)

    – Thibaut


  3. Posted by Александр on 23 September 2009 at 22:13:22

    The site is very high quality. You deserve an award for it, or just an honorary medal. =)

