Posts Tagged ‘information retrieval’
Porting the UEA-Lite Stemmer to Ruby
Posted: 16 July 2009 in Uncategorized
Tags: computational linguistics, finite state transducers, github, information retrieval, nlp, open source software, ruby, software, stemmers, stemming
A Twitter friend (@communicating) tipped me off to the UEA-Lite Stemmer by Marie-Claire Jenkins and Dan J. Smith. Stemmers are NLP tools that strip inflectional and derivational affixes from words. In English, that usually means removing the plural -s, progressive -ing, and preterite -ed. Depending on the type of stemmer, that [...]
Two new toys: G^2 and Bing
Posted: 3 June 2009 in Uncategorized
Tags: bing, google, google squared, image search, information retrieval, knowledge engine, search engines, wolfram alpha
This week has given me two new toys to play with, and you could probably say both were bought at the dollar store. The first was Microsoft’s release of Rebranded Live, aka Bing. Bing’s search results have been poor (for me), but not much poorer than Google’s. Just enough poorer for me to see no [...]
Much ado about nothing
Posted: 26 March 2009 in Uncategorized
Tags: blagoblag, computational linguistics, disappointment, google, hype, information retrieval, search, semantic search
There has been much ballyhoo in the blogosphere touting Google’s so-called foray into semantic search. The blog post announcing the new feature doesn’t even mention the word semantics, but it does say it looks at associations and concepts related to your query. I see no mention of tuples or anything of the sort and the [...]
Relevance-based language modeling
Posted: 14 October 2008 in Uncategorized
Tags: computational linguistics, google, information retrieval, language modeling, queries, relevance, search engines
I just finished reading about relevance-based language models for information retrieval (Lavrenko and Croft, 2001). It’s an old paper, but some new stuff I was checking into relied on something else which relied on it — you know how the story goes. In information retrieval, there are many retrieval models that have been used over [...]
Stacked Agents Model
Posted: 3 July 2008 in Uncategorized
Tags: cmu, collaborative filtering, computational linguistics, information retrieval, machine learning, presentations, recommender systems, research
This is research I did a while ago and presented Monday to fulfill the requirements of my Master’s degree. The presentation only needed to be about 20 minutes, so it was a very short intro. We have moved on since then, so when I say future work, I really mean future work. The post is [...]
Hakia Semantic Search API
Posted: 21 June 2008 in Uncategorized
Tags: computational linguistics, hakia, information retrieval, semantic search, semantic web, startups, techcrunch, text summarization
If you follow news on the semantic web or new search engines, you may have heard of hakia. TechCrunch has done a small write-up about their new semantic search API. TechCrunch is brutally hard on startups that aren’t fully operational, so there is a lot of criticism in that article that I take with [...]
Ensembles of kNN Recommenders
Posted: 1 April 2008 in Uncategorized
Tags: cmu, ensemble methods, ensembles, information retrieval, kdd, kdd cup, knn, machine learning, netflix prize, recommender systems, rmse
I’ve been messing around with recommender systems for the past year and a half, but not using the kNN (k-Nearest Neighbors) algorithm. However, my current homework assignment for my Information Retrieval class is to implement kNN for a subset of the Netflix Prize data. The data we are working with is about 800k ratings, which [...]
Who created OpenEphyra?
Posted: 16 February 2008 in Uncategorized
Tags: cmu, computational linguistics, information retrieval, lti, natural language processing, open source, openephyra, qa, question answering, software
OpenEphyra is a question answering (QA) system developed here at the Language Technologies Institute by Nico Schlaefer. He began his work at the University of Karlsruhe in Germany, but has since continued it at CMU and is currently a PhD student here. Since it is a home-grown language technologies package, I decided to check it [...]
What’s the information need, Kenneth?
Posted: 11 February 2008 in Uncategorized
Tags: google, information needs, information retrieval, library science, queries, search engines, structured queries, web search
When you go to a search engine, you have an information need. There is something you are searching for that you can only articulate imprecisely and you do so in a few words. People are good at determining if something satisfies their information need, but not so great at stating it clearly. Librarians are trained [...]


