Posts Tagged ‘computational linguistics’
Semantifi and the Deep Web
Posted: 6 February 2010 in Uncategorized. Tags: computational linguistics, natural language processing, search engines, search interfaces, semantic search engine, semantic web, wolfram alpha
At the Atlanta Semantic Web Meetup tonight, Vishy Dasari gave us a quick description and demo of a new search engine called Semantifi. They are purportedly a search engine for the deep web, meaning the part of the web that traditional search engines do not index because the content is dynamic. They are just in the very [...]
NLP Resources for Ruby
Posted: 13 September 2009 in Uncategorized. Tags: computational linguistics, java, natural language processing, nlp, parsers, python, ruby, stemmers, wordnet
There are quite a few well-known libraries for doing various NLP tasks in Java and Python, such as the Stanford Parser (Java) and the Natural Language Toolkit (Python). For Ruby, there are a few resources out there, but they are usually derivative or not as mature. By derivative, I mean they are ports from other [...]
Lazyfeed: the missing link in the evolution of RSS?
Posted: 1 August 2009 in Uncategorized. Tags: computational linguistics, rss, information trapping, lazyfeed, exploratory search, topic detection, google alerts, invites, recommender system
When Lazyfeed announced a limited round of beta invites on TechCrunch, I admit, I lusted after them. Only 250? I wanted to be one! But alas, I was put on the waiting list. It’s a decent marketing strategy for building up some hype. When I finally did get my invite, I tried them out for [...]
Updates to lda-ruby gem
Posted: 30 July 2009 in Uncategorized. Tags: c, computational linguistics, latent dirichlet allocation, lda, machine learning, nlp, ruby, rubygems, topic modeling
A while back I ported David Blei’s lda-c code for performing Latent Dirichlet Allocation to Ruby. Basically I just wrapped the C methods in a Ruby class, turned it into a gem, and called it a day. The result was a bit ugly and unwieldy, like most research code. A few months later, Todd Fisher [...]
Porting the UEA-Lite Stemmer to Ruby
Posted: 16 July 2009 in Uncategorized. Tags: computational linguistics, finite state transducers, github, information retrieval, nlp, open source software, ruby, software, stemmers, stemming
A Twitter friend (@communicating) tipped me off to the UEA-Lite Stemmer by Marie-Claire Jenkins and Dan J. Smith. Stemmers are NLP tools that strip inflectional and derivational affixes from words. In English, that usually means getting rid of the plural -s, progressive -ing, and preterite -ed. Depending on the type of stemmer, that [...]
First Impressions of Wolfram|Alpha
Posted: 16 May 2009 in Uncategorized. Tags: computational linguistics, google, google squared, knowledge engines, natural language processing, search engines, stephen wolfram, wikipedia, wolfram alpha
Perhaps you’ve heard of the latest brainchild of the Wunderkind Stephen Wolfram: Wolfram|Alpha. Matthew Hurst nicknamed it Alphram today and I agree that’s a much better name. Wolfram|Alpha (W|A henceforth) is not a search engine, it’s a knowledge engine. It will compete with Google on a slice of traffic that Google really isn’t all that [...]
Much ado about nothing
Posted: 26 March 2009 in Uncategorized. Tags: blagoblag, computational linguistics, disappointment, google, hype, information retrieval, search, semantic search
There has been much ballyhoo in the blogosphere touting Google’s so-called foray into semantic search. The blog post announcing the new feature doesn’t even mention the word semantics, but it does say it looks at associations and concepts related to your query. I see no mention of tuples or anything of the sort and the [...]
Books for Christmas
Posted: 4 January 2009 in Uncategorized. Tags: books, christmas, collective intelligence, computational linguistics, computer science, data visualization, evolutionary computing, genetic algorithms, string algorithms, web 2.0
I got most of the books I wanted for Christmas this year. It was a great haul that will keep me busy for a while. Among them were: Programming Collective Intelligence: Building Smart Web 2.0 Applications; Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology; Visualizing Data: Exploring and Explaining Data [...]