<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>The Mendicant Bug &#187; python</title>
	<atom:link href="http://mendicantbug.com/tag/python/feed/" rel="self" type="application/rss+xml" />
	<link>http://mendicantbug.com</link>
	<description>Wanderings into computational linguistics, science, social media and life...</description>
	<lastBuildDate>Mon, 06 Feb 2012 20:54:00 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='mendicantbug.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://1.gravatar.com/blavatar/3c73bbb145eaa976335be29004da9868?s=96&#038;d=http%3A%2F%2Fs2.wp.com%2Fi%2Fbuttonw-com.png</url>
		<title>The Mendicant Bug &#187; python</title>
		<link>http://mendicantbug.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://mendicantbug.com/osd.xml" title="The Mendicant Bug" />
	<atom:link rel='hub' href='http://mendicantbug.com/?pushpress=hub'/>
		<item>
		<title>NLP Resources for Ruby</title>
		<link>http://mendicantbug.com/2009/09/13/nlp-resources-for-ruby/</link>
		<comments>http://mendicantbug.com/2009/09/13/nlp-resources-for-ruby/#comments</comments>
		<pubDate>Sun, 13 Sep 2009 06:28:02 +0000</pubDate>
		<dc:creator>Jason Adams</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[computational linguistics]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[natural language processing]]></category>
		<category><![CDATA[nlp]]></category>
		<category><![CDATA[parsers]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[ruby]]></category>
		<category><![CDATA[stemmers]]></category>
		<category><![CDATA[wordnet]]></category>

		<guid isPermaLink="false">http://mendicantbug.com/?p=1268</guid>
		<description><![CDATA[There are quite a few well-known libraries for doing various NLP tasks in Java and Python, such as the Stanford Parser (Java) and the Natural Language Toolkit (Python).  For Ruby, there are a few resources out there, but they are usually derivative or not as mature.  By derivative, I mean they are ports from other [...]]]></description>
			<content:encoded><![CDATA[<p style="text-align:justify;">There are quite a few well-known libraries for doing various NLP tasks in Java and Python, such as the Stanford Parser (Java) and the Natural Language Toolkit (Python).  For Ruby, there are a few resources out there, but they are usually derivative or not as mature.  By derivative, I mean they are ports from other languages or extensions using code from another language.  And I&#8217;m responsible for two of them! :)</p>
<ul>
<li><a title="Treat" href="https://github.com/louismullie/treat" target="_blank">Treat</a> &#8211; Text REtrieval and Annotation Toolkit, definitely the most comprehensive toolkit I&#8217;ve encountered so far for Ruby
<ul>
<li>Text extractors for various document formats</li>
<li>Chunkers, segmenters, tokenizers</li>
<li>LDA</li>
<li>much more &#8211; the list is big</li>
</ul>
</li>
<li><a href="http://www.deveiate.org/projects/Linguistics/" target="_blank">Ruby Linguistics</a> &#8211; this is one of the more ambitious projects, but is not as mature as NLTK
<ul>
<li>interface for WordNet</li>
<li>Link grammar parser</li>
<li>some inflection stuff</li>
</ul>
</li>
<li><a title="Stanford Core NLP" href="https://github.com/louismullie/stanford-core-nlp" target="_blank">Stanford Core NLP</a> &#8211; if you&#8217;ve gotten a headache trying to use the Java bridge, this is your answer</li>
<li><a href="http://rubyforge.org/projects/stanfordparser/" target="_blank">Stanford Parser</a> interface &#8211; uses a Java bridge to access the Stanford Parser library</li>
<li><a href="http://www.markwatson.com/" target="_blank">Mark Watson</a> has a <a href="http://www.markwatson.com/opensource/rubytagger_0.1.1.zip" target="_blank">part of speech tagger</a> [zip], a <a href="http://www.markwatson.com/opensource/rubyreuters_0.1.zip" target="_blank">text categorizer</a> [zip], and <a href="http://www.markwatson.com/opensource/ruby_read_docs.zip" target="_blank">some text extraction utilities</a> [zip], but I haven&#8217;t tried to use them yet</li>
<li><a href="http://github.com/ealdent/lda-ruby" target="_blank">LDA Ruby Gem</a> &#8211; Ruby port of David Blei&#8217;s lda-c library by yours truly
<ul>
<li>Uses Blei&#8217;s C code for the actual LDA, but I include some wrappers to make using it a bit easier</li>
</ul>
</li>
<li><a href="http://github.com/ealdent/uea-stemmer" target="_blank">UEA Stemmer</a> &#8211; Ruby port (again by yours truly) of a conservative stemmer based on Jenkins and Smith&#8217;s <a href="http://www.uea.ac.uk/cmp/research/graphicsvisionspeech/speech/WordStemming" target="_blank">UEA Stemmer</a></li>
<li><a href="http://rubyforge.org/projects/stemmer/" target="_blank">Stemmer gem</a> &#8211; <a href="http://tartarus.org/~martin/PorterStemmer/" target="_blank">Porter stemmer</a></li>
<li><a href="http://www.locknet.ro/projects/ann-ruby-stemmer" target="_blank">Lingua Stemmer</a> &#8211; another stemming library, Porter stemmer</li>
<li><a href="http://www.deveiate.org/projects/Ruby-WordNet/" target="_blank">Ruby WordNet</a> &#8211; basically what&#8217;s included in Ruby Linguistics</li>
<li><a href="http://sourceforge.net/projects/raspell/" target="_blank">Raspell</a> &#8211; Ruby interface to Aspell spell checker</li>
</ul>
<p style="text-align:justify;">There are also a number of fledgling or orphaned projects out there purporting to be ports or interfaces for various other libraries like Stanford POS Tagger and Named Entity Recognizer.  Ruby (straight Ruby, not just JRuby) can interface with just about any Java library using the <a href="http://rjb.rubyforge.org/" target="_blank">Ruby Java Bridge</a> (RJB).  RJB can be a pain, and I could only initialize it once per run (a second attempt never succeeds), so there are some limitations.  But using it, I was able to easily interface with the Stanford POS tagger.</p>
<p style="text-align:justify;">So while there aren&#8217;t terribly many libraries for NLP tasks in Ruby, the ability to interface with Java directly widens the scope quite a bit.  You can also incorporate a C library using extensions.</p>
<p style="text-align:justify;">Naturally, if I missed anything, no matter how small, please let me know.</p>
<p style="text-align:justify;"><em>Update:</em> Here is a great list of <a href="http://web.media.mit.edu/~dustin/papers/ai_ruby_plugins/" target="_blank">AI-related ruby libraries</a> from Dustin Smith.</p>
]]></content:encoded>
			<wfw:commentRss>http://mendicantbug.com/2009/09/13/nlp-resources-for-ruby/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/ebec6abd2b9f1eb4de865aed01242171?s=96&#38;d=monsterid&#38;r=PG" medium="image">
			<media:title type="html">ealdent</media:title>
		</media:content>
	</item>
		<item>
		<title>%w{Scheme Python Rage Love Apathy}</title>
		<link>http://mendicantbug.com/2009/03/27/scheme-python-rage-love-apathy/</link>
		<comments>http://mendicantbug.com/2009/03/27/scheme-python-rage-love-apathy/#comments</comments>
		<pubDate>Sat, 28 Mar 2009 03:30:11 +0000</pubDate>
		<dc:creator>Jason Adams</dc:creator>
				<category><![CDATA[cognitive psychology]]></category>
		<category><![CDATA[mit]]></category>
		<category><![CDATA[programming languages]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[scheme]]></category>

		<guid isPermaLink="false">http://mendicantbug.com/?p=1115</guid>
		<description><![CDATA[John Cook just brought up the changeover from Scheme to Python in MIT&#8217;s beginning CS classes. I was exposed to Scheme very early in my programming career during my ill-fated quarter at the University of Chicago.  For some reason I can&#8217;t remember (it was 14 years ago), I registered late and couldn&#8217;t get into entry [...]]]></description>
			<content:encoded><![CDATA[<p style="text-align:justify;">John Cook just <a href="http://www.johndcook.com/blog/2009/03/26/mit-replaces-scheme-with-python/" target="_blank">brought up</a> the changeover from <a href="http://blog.snowtide.com/2009/03/24/why-mit-now-uses-python-instead-of-scheme-for-its-undergraduate-cs-program" target="_blank">Scheme to Python</a> in MIT&#8217;s beginning CS classes.  I was exposed to Scheme very early in my programming career during my ill-fated quarter at the University of Chicago.  For some reason I can&#8217;t remember (it was 14 years ago), I registered late and couldn&#8217;t get into entry-level CS classes.  So I enrolled in an AI class (against the advice of my undergrad advisor) without really knowing how to program.  This was old school AI, not machine learning, so it wasn&#8217;t the maths that got me.  The first programming assignment threw me completely for a loop &#8212; I had never seen Scheme before and didn&#8217;t know a thing about it.  My world up to that point had consisted of Pascal and BASIC, with a smattering of assembly.  The logic behind the AI stuff made sense, but the logistics of getting Scheme to do what I wanted escaped me and I dropped the class.  Turns out that advisor was worth listening to!</p>
<p style="text-align:justify;">Whenever something like this happens, you will see three groups of commenters emerge.  First are the I-don&#8217;t-care&#8217;s.  Actually, you don&#8217;t see them since they don&#8217;t give a crap.  The next are the fanboys.  They love the new language and are glad that MIT has discarded a dinosaur in favor of the <a href="http://xkcd.com/353/" target="_blank">language of Heaven</a>.  And finally you have the sticks in the mud who lament the death of computer science because a whole generation will grow up retarded thanks to not learning programming just the way they did.  Obviously, these are exaggerated &#8212; I say it to shock the mind.</p>
<p style="text-align:justify;">Cognitive psychology would have me believe that by drawing stark lines and exaggerating the situation, I will actually cause people to align themselves more closely with the stereotypes I laid out.  The logical alternative would be to view it as a joke, take a step back, and examine your own reaction.  Why do people get so worked up about this?  Why do I get so worked up about people getting so worked up?  :P</p>
<p style="text-align:justify;">Maybe I&#8217;m getting crotchety in my old age.</p>
]]></content:encoded>
			<wfw:commentRss>http://mendicantbug.com/2009/03/27/scheme-python-rage-love-apathy/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/ebec6abd2b9f1eb4de865aed01242171?s=96&#38;d=monsterid&#38;r=PG" medium="image">
			<media:title type="html">ealdent</media:title>
		</media:content>
	</item>
		<item>
		<title>Google App Engine Hackathon</title>
		<link>http://mendicantbug.com/2008/11/15/google-app-engine-hackathon/</link>
		<comments>http://mendicantbug.com/2008/11/15/google-app-engine-hackathon/#comments</comments>
		<pubDate>Sun, 16 Nov 2008 02:11:40 +0000</pubDate>
		<dc:creator>Jason Adams</dc:creator>
				<category><![CDATA[atlanta]]></category>
		<category><![CDATA[computing puzzles]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[google app engine]]></category>
		<category><![CDATA[hackathon]]></category>
		<category><![CDATA[langwar]]></category>
		<category><![CDATA[project euler]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[stack overflow]]></category>
		<category><![CDATA[web apps]]></category>

		<guid isPermaLink="false">http://ealdent.wordpress.com/?p=862</guid>
		<description><![CDATA[I just spent the day with a couple of friends at the Google App Engine Hackathon in Atlanta.  We got to see Google Atlanta &#8211; or the public part of it anyway.  We weren&#8217;t permitted in the cafeteria or in the actual office area, which would have required signing non-disclosure agreements.  The office was about [...]]]></description>
			<content:encoded><![CDATA[<p style="text-align:justify;"><img class="alignright" title="Google App Engine" src="https://www.google.com/accounts/ah/appengine.gif" alt="" width="145" height="111" align="right" />I just spent the day with a <a href="http://melinthropy.org/" target="_blank">couple</a> of <a href="http://wrathfuldove.org/" target="_blank">friends</a> at the <a href="http://code.google.com/appengine/" target="_blank">Google App Engine</a> <a href="http://sites.google.com/site/gaehackathonatlanta/" target="_blank">Hackathon</a> in Atlanta.  We got to see Google Atlanta &#8211; or the public part of it anyway.  We weren&#8217;t permitted in the cafeteria or in the actual office area, which would have required signing non-disclosure agreements.  The office was about what I expected &#8212; the Google colors were in abundance, there were giant bouncing balls, and free drinks! (non-alcoholic)</p>
<p style="text-align:justify;">We spent the day in a fairly hot conference room hacking away on a variety of projects.  We set up teams beforehand to work on projects that people proposed, and I chose to work on a variation of a computing puzzles site, dubbed <a href="http://langwar.com" target="_blank">LangWar</a>.  The idea is fairly simple in the early stages:  people submit programming puzzles and other people post their solutions in code form.  You can vote on which questions you like and which answers you like (or dislike).  You can also leave comments on questions and answers.  The result of the ratings is that the best questions will be ranked higher, in a method similar to Reddit, and the best answers will trickle to the top based on the votes of users.</p>
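<p style="text-align:justify;">As a rough illustration of the voting idea, here is a minimal sketch that orders entries by net votes.  This is not LangWar code and not Reddit&#8217;s actual ranking formula; the dictionary keys and function names are made up for the example, and net score (upvotes minus downvotes) is simply the most basic scheme I could assume.</p>

```python
def net_score(entry):
    """Net votes for one question or answer (upvotes minus downvotes)."""
    return entry["up"] - entry["down"]


def ranked(entries):
    """Return entries with the highest net score first.

    sorted() is stable, so entries with equal scores keep their
    original relative order.
    """
    return sorted(entries, key=net_score, reverse=True)


# Hypothetical data, only to show the call.
answers = [
    {"title": "python one-liner", "up": 1, "down": 0},
    {"title": "scheme recursion", "up": 5, "down": 1},
    {"title": "ruby inject", "up": 2, "down": 2},
]
top_first = [a["title"] for a in ranked(answers)]
```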
<p style="text-align:justify;">This is very similar to <a href="http://stackoverflow.com/" target="_blank">Stack Overflow</a>, but different in that it is intended to be more of a puzzle solving site that pits implementations in different programming languages against each other.  It&#8217;s sort of a battle royale of programming languages &#8211; thus the name, LangWar.  It&#8217;s more of an enhanced version of <a href="http://projecteuler.net/" target="_blank">Project Euler</a>, where people can vote on the questions and answers.</p>
<p style="text-align:justify;">In any case, it was a great chance to get my hands dirty in Google App Engine, meet some Atlanta python coders, and have fun.  It&#8217;ll be interesting to see where LangWar goes from here, if it does go anywhere.</p>
]]></content:encoded>
			<wfw:commentRss>http://mendicantbug.com/2008/11/15/google-app-engine-hackathon/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/ebec6abd2b9f1eb4de865aed01242171?s=96&#38;d=monsterid&#38;r=PG" medium="image">
			<media:title type="html">ealdent</media:title>
		</media:content>

		<media:content url="https://www.google.com/accounts/ah/appengine.gif" medium="image">
			<media:title type="html">Google App Engine</media:title>
		</media:content>
	</item>
		<item>
		<title>Sentiment Polarity</title>
		<link>http://mendicantbug.com/2008/09/16/sentiment-polarity/</link>
		<comments>http://mendicantbug.com/2008/09/16/sentiment-polarity/#comments</comments>
		<pubDate>Wed, 17 Sep 2008 03:09:14 +0000</pubDate>
		<dc:creator>Jason Adams</dc:creator>
				<category><![CDATA[classification]]></category>
		<category><![CDATA[computational linguistics]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[opinion mining]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[ruby]]></category>
		<category><![CDATA[sentiment analysis]]></category>
		<category><![CDATA[support vector machines]]></category>
		<category><![CDATA[svms]]></category>

		<guid isPermaLink="false">http://ealdent.wordpress.com/?p=747</guid>
		<description><![CDATA[I&#8217;ve begun learning ruby for my new job, a language that doesn&#8217;t seem to have really gotten any traction in the NLP community (at least not that I&#8217;ve heard).  I had been using python for my NLP stuff (homework and projects) and Java for my recommender system stuff.  In retrospect, I could have used python [...]]]></description>
			<content:encoded><![CDATA[<p style="text-align:justify;">I&#8217;ve begun learning <a href="http://www.ruby-lang.org/en/" target="_blank">ruby</a> for my new job, a language that doesn&#8217;t seem to have really gotten any traction in the NLP community (at least not that I&#8217;ve heard).  I had been using python for my NLP stuff (homework and projects) and Java for my recommender system stuff.  In retrospect, I could have used python for the recommender stuff, but I wasn&#8217;t aware of some speed-ups so resorted to Java.  Of course, the recommender stuff isn&#8217;t strictly NLP.  Ruby is just as well suited as python and seems a lot better than Java for many tasks (though Java certainly has its place).  At the very least, a scripting language like ruby or python is great for prototyping.  It&#8217;s easy to test new ideas quickly.</p>
<p style="text-align:justify;">I was reading through Pang et al. (2002), which deals with classifying movie reviews as positive or negative.  They look at three machine learning approaches:  <a href="http://en.wikipedia.org/wiki/Naive_bayes" target="_blank">Naive Bayes</a>, <a href="http://en.wikipedia.org/wiki/Maximum_entropy_classifier" target="_blank">Maximum Entropy classifier</a> and <a href="http://en.wikipedia.org/wiki/Support_vector_machine" target="_blank">Support Vector Machines</a>.  This seemed like a good opportunity to try out my nascent ruby skills, since it&#8217;s the kind of crap I can roll together in python in short order (and do all the time).  So I downloaded <a href="http://www.cs.cornell.edu/people/pabo/movie-review-data/" target="_blank">the data</a> for the paper (actually I downloaded the later data from the 2004 paper).  There are 1000 positive and 1000 negative movie reviews.  The task is to train a classifier to determine whether a review expresses a positive opinion (the author liked the movie) or a negative opinion (the author did not like the movie).  I chose to just use SVMs since they do best for this task according to the paper, they do really well for text categorization, and they are <a href="http://svmlight.joachims.org/" target="_blank">easy to download and use</a>.</p>
<p style="text-align:justify;">The results were quite nice.  Ruby turned out to be just as handy as python at manipulating text and dealing with cross-validation:  the two main &#8220;challenges&#8221; in implementing this paper.  I used <a href="http://en.wikipedia.org/wiki/Tf-idf" target="_blank">tf-idf</a> for weighting the features and thresholded document frequency to discard words that didn&#8217;t appear in at least three reviews.  I achieved about 85.7% accuracy using the same cross-validation setup described in their followup work (Pang and Lee, 2004).  In other words, the classifier could correctly guess the opinion orientation of reviews as positive or negative nearly 86% of the time.</p>
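<p style="text-align:justify;">The feature weighting described above can be sketched roughly as follows, in python rather than the ruby I actually used.  This is a minimal sketch under assumptions: raw term counts for tf, a natural-log idf of log(N / df), and reviews already split into tokens.  The function name and the toy data are made up for illustration, and the exact tf-idf variant in my original code may have differed.</p>

```python
import math
from collections import Counter


def tfidf_vectors(docs, min_df=3):
    """Weight tokenized documents with tf-idf.

    docs is a list of token lists. Terms that appear in fewer than
    min_df documents are discarded. Returns the kept vocabulary and
    one {term: weight} dict per document.
    """
    # Document frequency: the number of documents containing each term.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    # Apply the document-frequency threshold.
    vocab = sorted(term for term, n in df.items() if n >= min_df)

    # Assumed idf variant: log(N / df). A term found in every
    # document therefore gets weight zero.
    n_docs = len(docs)
    idf = {term: math.log(n_docs / df[term]) for term in vocab}

    vectors = []
    for tokens in docs:
        tf = Counter(t for t in tokens if t in idf)
        vectors.append({t: count * idf[t] for t, count in tf.items()})
    return vocab, vectors


# Toy reviews, only to show the call.
reviews = [
    ["good", "movie"],
    ["good", "plot"],
    ["good", "bad"],
    ["bad", "movie"],
    ["bad", "movie"],
]
vocab, vectors = tfidf_vectors(reviews, min_df=3)
```

<p style="text-align:justify;">In the toy data, "plot" appears in only one review, so it falls below the threshold and is dropped from every vector.</p>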
<p style="text-align:justify;">Pang et al. (2002) discussed some of their errors and hypothesized that discourse analysis might improve results, since reviewers often use sarcasm.  There&#8217;s also the case where authors use a &#8220;thwarted expectations&#8221; narrative.  This offered me one of the few chuckles I&#8217;ve ever had while reading a research paper:</p>
<blockquote><p>&#8220;I hate the Spice Girls. &#8230; [3 things the author hates about them] &#8230;  Why I saw this movie is a really, really, really long story, but I did and one would think I&#8217;d despise every minute of it.  But&#8230; Okay, I&#8217;m really ashamed of it, but I enjoyed it.  I mean, I admit it&#8217;s a really awful movie &#8230;the ninth floor of hell&#8230; The plot is such a mess that it&#8217;s terrible.  But I loved it.&#8221;</p></blockquote>
<h3>References</h3>
<p>Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan.  &#8220;Thumbs Up?  Sentiment Classification Using Machine Learning Techniques.&#8221;  In <em>Proceedings of the ACL 02 conference on Empirical Methods in Natural Language Processing &#8211; Volume 10</em>, July 2002. [<a href="http://www.cs.cornell.edu/home/llee/papers/sentiment.pdf" target="_blank">pdf</a>]</p>
<p>Bo Pang and Lillian Lee.  &#8220;A Sentimental Education:  Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts.&#8221;  In <em>Proceedings of the ACL</em>, 2004. [<a href="http://www.cs.cornell.edu/home/llee/papers/cutsent.pdf" target="_blank">pdf</a>]</p>
]]></content:encoded>
			<wfw:commentRss>http://mendicantbug.com/2008/09/16/sentiment-polarity/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/ebec6abd2b9f1eb4de865aed01242171?s=96&#38;d=monsterid&#38;r=PG" medium="image">
			<media:title type="html">ealdent</media:title>
		</media:content>
	</item>
		<item>
		<title>Plurk your tweets</title>
		<link>http://mendicantbug.com/2008/06/06/plurk-your-tweets/</link>
		<comments>http://mendicantbug.com/2008/06/06/plurk-your-tweets/#comments</comments>
		<pubDate>Fri, 06 Jun 2008 06:06:03 +0000</pubDate>
		<dc:creator>Jason Adams</dc:creator>
				<category><![CDATA[code]]></category>
		<category><![CDATA[creative commons]]></category>
		<category><![CDATA[google code]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[plurk]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[software licensing]]></category>
		<category><![CDATA[status updates]]></category>
		<category><![CDATA[twitter]]></category>

		<guid isPermaLink="false">http://ealdent.wordpress.com/?p=636</guid>
		<description><![CDATA[A couple of days ago, I wrote a script that would tweet anything you plurked. Thanks to some code from Neville Newey (based on PHP code by Charl van Niekerk), the plurk.py script I wrote has been updated to both plurk your tweets and tweet your plurks. This should work on both windows and linux [...]]]></description>
			<content:encoded><![CDATA[<p style="text-align:justify;">A couple of days ago, I wrote a script that would <a href="http://twitter.com/ealdent" target="_blank">tweet</a> anything you <a href="http://plurk.com/redeemByURL?from_uid=12298&amp;check=-1874285239&amp;s=1" target="_blank">plurked</a>.  Thanks to some code from <a href="http://blog.charlvn.za.net/2008/06/plurk-python-post-script.html" target="_blank">Neville Newey</a> (based on PHP code by <a href="http://blog.charlvn.za.net/2008/06/plurk-python-post-script.html" target="_blank">Charl van Niekerk</a>), the <a href="http://www.cs.cmu.edu/~jmadams/source/plurk.py" target="_blank">plurk.py</a> script I wrote has been updated to both plurk your tweets and tweet your plurks.  This should work on both windows and linux machines.  If you have access to a linux machine, I suggest setting up a cron job to take care of this.  As I mentioned in <a href="http://mendicantbug.com/2008/06/02/tweet-your-plurks/" target="_self">the previous post</a>, if you set up a cron job, be sure to change the path to plurkdb.dat to an absolute path.  I have done the most testing on this with python 2.4 in linux.</p>
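<p style="text-align:justify;">The reason for the absolute path is that cron does not run the script from the directory it lives in, so a relative plurkdb.dat ends up wherever cron happens to start.  One way around editing the path by hand is to build it from the location of the script.  This is a minimal sketch and not code taken from plurk.py; the helper name is made up for the example.</p>

```python
import os


def data_file_path(script_file, name="plurkdb.dat"):
    """Return an absolute path to `name` next to the given script.

    Inside a script you would pass __file__ as script_file, so the
    data file is found regardless of the current working directory.
    """
    script_dir = os.path.dirname(os.path.abspath(script_file))
    return os.path.join(script_dir, name)


# Example call with a literal name instead of __file__.
db_path = data_file_path("plurk.py")
```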
<p style="text-align:justify;">This code is open source under the <span style="text-decoration:line-through;">Creative Commons 3.0 Attribution license that this blog uses</span> <a href="http://creativecommons.org/licenses/BSD/" target="_blank">Creative Commons BSD license</a>. Neville&#8217;s code appears to be under CC:Attribution 2.5 for South Africa, by what I could glean from his site.  I have considered making this an open source project under Google code but have yet to take it all the way.  Google sets a lifetime limit of 10 projects, so I will continue to hoard those against future need.  If you make modifications to the code, please let me know and I will probably post them here and in the code for future releases, so we all win.</p>
<p style="text-align:justify;">Note that the command line parameters have changed:</p>
<p style="text-align:justify;"><code>plurk.py &lt;twitter username&gt; &lt;twitter password&gt; &lt;plurk username&gt; &lt;plurk password&gt;</code></p>
<p style="text-align:justify;"><em>And of course, as with all software, use at your own risk.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://mendicantbug.com/2008/06/06/plurk-your-tweets/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/ebec6abd2b9f1eb4de865aed01242171?s=96&#38;d=monsterid&#38;r=PG" medium="image">
			<media:title type="html">ealdent</media:title>
		</media:content>
	</item>
		<item>
		<title>Tweet your plurks</title>
		<link>http://mendicantbug.com/2008/06/02/tweet-your-plurks/</link>
		<comments>http://mendicantbug.com/2008/06/02/tweet-your-plurks/#comments</comments>
		<pubDate>Mon, 02 Jun 2008 06:31:26 +0000</pubDate>
		<dc:creator>Jason Adams</dc:creator>
				<category><![CDATA[code]]></category>
		<category><![CDATA[download]]></category>
		<category><![CDATA[mirrors]]></category>
		<category><![CDATA[plurk]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[scripts]]></category>
		<category><![CDATA[software]]></category>
		<category><![CDATA[status updates]]></category>
		<category><![CDATA[twitter]]></category>

		<guid isPermaLink="false">http://ealdent.wordpress.com/?p=632</guid>
		<description><![CDATA[If you want to use Plurk, but aren&#8217;t ready to leave Twitter, I wrote a little python script you can use to automatically mirror your plurks on Twitter. This will not work for response plurks, but your main plurks will be extracted and posted to your Twitter account with the prefix &#8220;plurking:&#8221; followed by your [...]]]></description>
			<content:encoded><![CDATA[<p style="text-align:justify;">If you want to use <a href="http://plurk.com/redeemByURL?from_uid=12298&amp;check=-1874285239&amp;s=1" target="_blank">Plurk</a>, but aren&#8217;t ready to leave <a href="http://twitter.com/ealdent" target="_blank">Twitter</a>, I wrote <a href="http://www.cs.cmu.edu/~jmadams/source/plurk.py">a little Python script</a> you can use to automatically mirror your plurks on Twitter.  This will not work for response plurks, but your main plurks will be extracted and posted to your Twitter account with the prefix &#8220;plurking:&#8221; followed by your plurk.</p>
<p>The resulting tweet looks like this:</p>
<p style="text-align:center;"><img class="size-full wp-image-633" src="http://ealdent.files.wordpress.com/2008/06/plurk2tweet.jpg?w=614" alt="sample of what the script outputs in twitter"   /></p>
<p style="text-align:justify;"><a href="http://www.cs.cmu.edu/~jmadams/source/plurk.py">Download the script</a> and set it up as a cron job (or execute it manually).  It should work with Python 2.4 and later.  It stores a plurkdb.dat file (which you should probably give an absolute path, depending on the behavior of cron on your system).  The script checks this file every time it runs to make sure that duplicate plurks aren&#8217;t tweeted.  You should pass the following parameters on the command line (or modify the script so they are hardcoded, if you want):  &lt;twitter username&gt; &lt;twitter password&gt; &lt;plurk username&gt; &lt;plurk password&gt;.  <em>Update:  see later post on updated plurk script.  And as with all software, use at your own risk.</em></p>
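<p style="text-align:justify;">The duplicate check can be sketched roughly as follows.  This is only an illustration, not the code in plurk.py:  the one-ID-per-line file format and the names <code>load_seen</code>, <code>mirror_new</code> and <code>post</code> are assumptions made for the example, and <code>post</code> stands in for whatever actually sends the tweet.</p>

```python
import os
import tempfile


def load_seen(db_path):
    """Return the set of plurk IDs already mirrored (empty if no file yet)."""
    if not os.path.exists(db_path):
        return set()
    with open(db_path) as handle:
        return set(line.strip() for line in handle if line.strip())


def mirror_new(plurks, db_path, post):
    """Post each unseen (plurk_id, text) pair once and record its ID."""
    seen = load_seen(db_path)
    posted = []
    with open(db_path, "a") as handle:
        for plurk_id, text in plurks:
            if plurk_id in seen:
                continue  # already tweeted on an earlier run
            post("plurking: " + text)
            handle.write(plurk_id + "\n")
            seen.add(plurk_id)
            posted.append(plurk_id)
    return posted


# Hypothetical demo: the IDs and texts are made up, and "posting" just
# appends to a list. The database path is absolute, so it does not depend
# on the working directory cron happens to start the script in.
demo_dir = tempfile.mkdtemp()
db_path = os.path.join(demo_dir, "plurkdb.dat")
sent = []
first = mirror_new([("101", "hello"), ("102", "lunch")], db_path, sent.append)
second = mirror_new([("102", "lunch"), ("103", "back")], db_path, sent.append)
```

<p style="text-align:justify;">On the second run only the plurk with ID 103 is posted, because 101 and 102 are already recorded in the file.</p>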
<p style="text-align:justify;">Please let me know if you have any problems with it or see room for improvement.  I hacked this out in a hurry, so &#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://mendicantbug.com/2008/06/02/tweet-your-plurks/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/ebec6abd2b9f1eb4de865aed01242171?s=96&#38;d=monsterid&#38;r=PG" medium="image">
			<media:title type="html">ealdent</media:title>
		</media:content>

		<media:content url="http://ealdent.files.wordpress.com/2008/06/plurk2tweet.jpg" medium="image">
			<media:title type="html">sample of what the script outputs in twitter</media:title>
		</media:content>
	</item>
		<item>
		<title>OpenCalais</title>
		<link>http://mendicantbug.com/2008/05/31/opencalais/</link>
		<comments>http://mendicantbug.com/2008/05/31/opencalais/#comments</comments>
		<pubDate>Sat, 31 May 2008 21:44:57 +0000</pubDate>
		<dc:creator>Jason Adams</dc:creator>
				<category><![CDATA[calais]]></category>
		<category><![CDATA[computational linguistics]]></category>
		<category><![CDATA[named entity recognition]]></category>
		<category><![CDATA[newswire]]></category>
		<category><![CDATA[nlp]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[opencalais]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[rdf]]></category>
		<category><![CDATA[reuters]]></category>
		<category><![CDATA[xml]]></category>

		<guid isPermaLink="false">http://ealdent.wordpress.com/?p=628</guid>
		<description><![CDATA[So I decided to finally fart around with OpenCalais a little. There&#8217;s a nice video on the site that gives you an impression of what it is capable of, but it&#8217;s also like all videos about software: propaganda. Calais is basically Named Entity Recognition (NER) software that can be accessed via a web API. Whereas [...]]]></description>
			<content:encoded><![CDATA[<p style="text-align:justify;">So I decided to finally fart around with <a href="http://opencalais.com/" target="_blank">OpenCalais</a> a little.  There&#8217;s a nice video on the site that gives you an impression of what it is capable of, but it&#8217;s also like all videos about software:  propaganda.  Calais is basically Named Entity Recognition (NER) software that can be accessed via a web API.  Whereas a regular NER system might recognize named entities like people, organizations, and places, Calais also recognizes relationships like corporate acquisitions.  To be a little clearer, if you aren&#8217;t familiar with NER, it is basically the task of identifying the proper nouns in a body of text.  Named entities aren&#8217;t always proper nouns, but that is one starting point.  Examples would be:  John Hancock (Person), New York (Place), and Apple (Organization).  Because Calais also recognizes relationships, we get an extra layer of information:  Acquisition(Microsoft, Yahoo!).</p>
<p style="text-align:justify;">Calais is put out by Reuters, which has a long history of helping out the NLP and IR research communities with data sets.  Being Reuters, the data sets are all newswire stuff, and Calais is produced in that spirit.  Currently the relationships and named entities available reflect that bias, but the list is expanding and it is probably flexible enough for most domains.  Their claim is that with each new release, there will be additional entities and relationships available.  Also, the software is completely <span style="text-decoration:line-through;">open source</span> free for commercial and private use.  For this, I give Reuters props.</p>
<p style="text-align:justify;">OpenCalais accepts requests over SOAP or HTTP POST, and you can take a look at their tutorials for exactly how to use it.  After some very shallow digging on the googles, I found an open source project called <a href="http://code.google.com/p/python-calais/" target="_blank">python-calais</a>, which is basically just a script that wraps some text and sends it to the Calais service, then processes the output.  The output is in RDF (<a href="http://en.wikipedia.org/wiki/Resource_Description_Framework" target="_blank">resource description framework</a>), serialized as an XML document that is not very friendly to the human eye but is nice and powerful otherwise.  The python-calais script uses an RDF library for Python, so you&#8217;ll need to download that if you don&#8217;t already have it.</p>
<p>Running it on <a href="http://mendicantbug.com/2007/11/08/old-english-translator/" target="_self">my most popular post</a>, you get the following output:</p>
<p><code>93B6642D-0D7C-37Ab-A92F-66Ebfef13C8D :: Recommender Systems (Industryterm)<br />
0Dccb106-442A-3848-Bd0B-A388E73F4C8C :: Chris Sternal-Johnson (Person)<br />
Aab0D16A-Ad5A-348A-A8Dc-58Cf59A1Bc15 :: Kristina Tikhova (Person)<br />
42F476A0-2Fae-3F36-808D-803E4F620Ab0 :: Java (Technology)<br />
6C4Cd5D9-5866-35B5-81Ab-B8A5C1751A44 :: Pre-Processing Phase (Industryterm)<br />
4003D863-C7A6-3E6F-8E3C-0913Bf2F8242 :: National Aeronautics And Space Administration (Organization)<br />
77D1Ceb3-9900-3Dd7-8351-F29408B21412 :: Carnegie Mellon University (Organization)<br />
Ee58Ef4B-1C98-3F8B-Aff8-3Fd6E3D76A9E :: Wonderful Site (Industryterm)<br />
8F12E551-A8F1-3705-866C-D44D1A6A54F4 :: Richard M. Hogg (Person)<br />
Adee23De-B1B0-37Ad-9E20-1Fa8094F6D39 :: Steel (Industryterm)<br />
0Ace00C6-2B9F-32C2-8949-82A0F6C6B444 :: Xml (Technology)<br />
2Ed2F085-1C63-324E-B518-60332388E273 :: Norman French (Person)<br />
136157D8-D62E-3C55-Ae67-3Ec182C2C703 :: Phil Barthram (Person)<br />
B6A8Dbfa-Fd35-32Bb-9E05-A2811C480000 :: Mike Tan (Person)<br />
Ed8B5Fe4-616A-36Ea-8C47-3Eea7C71Aee0 :: Ben Eastaugh (Person)<br />
D3Bcba58-00Fc-33C5-9346-Dbf6A2441867 :: Machine Learning (Technology)<br />
F17C3779-3810-3Ff9-A42D-75C3137F0F7F :: Modern English (Person)<br />
38116E8D-F8B4-3D03-B0Ad-C9A24B888E61 :: Jason M. Adams (Person)<br />
4386B07C-F6B8-3991-Af74-Ab11A951F0Ee :: David Petar Novakovic (Person)<br />
Aa14303F-F9F0-31B8-Adff-3B9C68E0A9F1 :: Language Technologies Institute (Organization)<br />
Ca1E4Eb7-7820-3862-8443-26E37B33E13F :: Machine Translation (Technology)</code></p>
<p style="text-align:justify;">Because it picks up everything on the page, the list includes a lot that isn&#8217;t related to the post about Old English translation.  It also picks up some weird so-called industry terms like &#8220;steel.&#8221;  If you (manually) keep only the entities that come from the text of the post itself, the output is a little more sensible:</p>
<p><code>6C4Cd5D9-5866-35B5-81Ab-B8A5C1751A44 :: Pre-Processing Phase (Industryterm)<br />
Ca1E4Eb7-7820-3862-8443-26E37B33E13F :: Machine Translation (Technology)<br />
0Ace00C6-2B9F-32C2-8949-82A0F6C6B444 :: Xml (Technology)<br />
2Ed2F085-1C63-324E-B518-60332388E273 :: Norman French (Person)<br />
136157D8-D62E-3C55-Ae67-3Ec182C2C703 :: Phil Barthram (Person)<br />
F17C3779-3810-3Ff9-A42D-75C3137F0F7F :: Modern English (Person)</code></p>
<p style="text-align:justify;">(The codes are unique identifiers.)  Unfortunately, some important terms are still missed, like <em>Old English</em>.  So it appears Calais has some growing to do, but it&#8217;s off to a good start.  Part of the problem might be that the blog post is out of domain.  I imagine with time, it will continue to improve.  We&#8217;ll see.</p>
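<p style="text-align:justify;">Output in this &#8220;identifier :: name (type)&#8221; form is easy to post-process.  As a rough sketch, the lines can be grouped by entity type as shown here.  The regular expression and the helper name <code>group_by_type</code> are assumptions made for this illustration; they are not part of python-calais.</p>

```python
import re
from collections import defaultdict

# A few lines copied from the output shown in the post.
SAMPLE = """\
6C4Cd5D9-5866-35B5-81Ab-B8A5C1751A44 :: Pre-Processing Phase (Industryterm)
Ca1E4Eb7-7820-3862-8443-26E37B33E13F :: Machine Translation (Technology)
0Ace00C6-2B9F-32C2-8949-82A0F6C6B444 :: Xml (Technology)
2Ed2F085-1C63-324E-B518-60332388E273 :: Norman French (Person)
"""

# identifier, then " :: ", then the entity name, then the type in parentheses.
LINE = re.compile(r"^(\S+) :: (.+) \((\w+)\)$")


def group_by_type(text):
    """Map each entity type to the list of entity names carrying that type."""
    groups = defaultdict(list)
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            identifier, name, kind = match.groups()
            groups[kind].append(name)
    return dict(groups)


entities = group_by_type(SAMPLE)
for kind in sorted(entities):
    print(kind + ": " + ", ".join(entities[kind]))
```

<p style="text-align:justify;">Grouping by type makes it quick to drop a whole category, such as the industry terms, when it turns out to be noisy.</p>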
]]></content:encoded>
			<wfw:commentRss>http://mendicantbug.com/2008/05/31/opencalais/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/ebec6abd2b9f1eb4de865aed01242171?s=96&#38;d=monsterid&#38;r=PG" medium="image">
			<media:title type="html">ealdent</media:title>
		</media:content>
	</item>
		<item>
		<title>Pythonic Matlab</title>
		<link>http://mendicantbug.com/2008/05/15/pythonic-matlab/</link>
		<comments>http://mendicantbug.com/2008/05/15/pythonic-matlab/#comments</comments>
		<pubDate>Thu, 15 May 2008 15:49:26 +0000</pubDate>
		<dc:creator>Jason Adams</dc:creator>
				<category><![CDATA[computer science]]></category>
		<category><![CDATA[functions]]></category>
		<category><![CDATA[matlab]]></category>
		<category><![CDATA[memory]]></category>
		<category><![CDATA[memory management]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[programming languages]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://ealdent.wordpress.com/?p=614</guid>
		<description><![CDATA[I attended a Matlab training seminar yesterday with the dual topics of &#8220;Advanced Matlab Programming&#8221; and &#8220;Distributed and Parallel Computing.&#8221; Of the two, the Advanced section was more interesting, though my original motivation for going was the parallel computing part. In the morning, I felt like it was going to be a waste because my [...]]]></description>
			<content:encoded><![CDATA[<p style="text-align:justify;">I attended a Matlab training seminar yesterday with the dual topics of &#8220;Advanced Matlab Programming&#8221; and &#8220;Distributed and Parallel Computing.&#8221;  Of the two, the Advanced section was more interesting, though my original motivation for going was the parallel computing part.  In the morning, I felt like it was going to be a waste because my Matlab programming skills are weak, and if my advisor had not strongly suggested I attend, I might&#8217;ve skipped it.  I&#8217;m glad he did, because it was surprisingly enjoyable and I felt like it was right on my level.  This might be because programming in Matlab isn&#8217;t especially hard or different from other programming languages and I know enough to get by already.  Or it might be because Matlab is becoming a little more like Python.<span id="more-614"></span></p>
<p style="text-align:justify;">First let me clarify by saying that I&#8217;m not suggesting Matlab is copying Python (though they may be) or that Python has the market cornered on the similarities I&#8217;m about to talk about (it doesn&#8217;t).  Also, I have no idea when this stuff was introduced to Matlab, so it might have all been there for years.</p>
<p style="text-align:justify;">The first thing of interest to me was the discussion about memory management in Matlab.  Suppose you create a matrix and store it in the variable X.  Next, you assign X to Y:  <code>Y = X</code>.  Here, Y acts as a pointer to the data pointed to by X.  If you clear X, Y still points to the data, so it continues to reside in memory until Y has been cleared also.  Now if you modify Y, perhaps like so:  <code>Y(1,1) = 2</code>, my thought was that both X and Y would be affected and the memory used would stay the same.  However, that is not the case, thanks to the <a href="http://en.wikipedia.org/wiki/Blas" target="_blank">BLAS</a> and <a href="http://en.wikipedia.org/wiki/LAPACK" target="_blank">LAPACK</a> libraries that require matrices to be contiguous in memory.  So when you modify Y as above, Matlab creates a modified copy in memory that Y now points to, while X keeps pointing to the same, unmodified data.  This is decidedly not Pythonic, since in Python (with a mutable object such as a list) that modification would have updated both X and Y, and they would still be pointing to the same thing in memory.</p>
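<p style="text-align:justify;">For comparison, here is a small Python sketch of the aliasing behavior described above, using nested lists as a stand-in for a matrix.  The explicit <code>copy.deepcopy</code> call is only there to mimic the effect of Matlab&#8217;s copy on modification; it is not how Matlab implements it.</p>

```python
import copy

# Plain assignment: Y is a second name for the same list object.
X = [[1, 1], [1, 1]]
Y = X
Y[0][0] = 2
print(X[0][0])  # the change made through Y is visible through X

# Explicit copy: B gets its own data, so A is left untouched,
# which is the outcome Matlab gives you automatically on modification.
A = [[1, 1], [1, 1]]
B = copy.deepcopy(A)
B[0][0] = 2
print(A[0][0], B[0][0])
```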
<p style="text-align:justify;">The next interesting topic was functions.  Here is where things struck me as Pythonic.  You can declare a simple function in an m-file like so:</p>
<p><code>function Y = square(x)<br />
Y = x.^2<br />
end</code></p>
<p style="text-align:justify;">Here the <code>end</code> is optional.  As in Python, where functions can be passed around as values, you can use function handles.  This lets you do cool things like decide on the fly which operation to apply to a matrix without having to worry about record-keeping and <code>if</code> checks (I&#8217;m not going into this any further).  To demonstrate the function handle, here is one possibility:</p>
<p><code>fh = @sin<br />
fh(pi)<br />
fh = @cos<br />
fh(pi)</code></p>
<p style="text-align:justify;">We call the function handle fh on the same input, but get two different results:  0 (up to floating-point rounding, so it displays as 1.2246e-16) and -1.  Matlab also lets you create factory functions like in Python:</p>
<p><code> function Y = makeF(a,b)<br />
Y = @makeFHelper;<br />
function Z = makeFHelper(x)<br />
Z = a * sin(x) + b;<br />
end<br />
end<br />
</code></p>
<p style="text-align:justify;">This returns a function handle to the nested function makeFHelper, with the values of a and b that you passed captured in the function produced.  So you could create two different functions:</p>
<p><code>Y1 = makeF(2,3)<br />
Y2 = makeF(3,-1)</code></p>
<p>and when you execute them:</p>
<p><code>Y1(pi/3)<br />
Y2(pi/3)</code></p>
<p style="text-align:justify;">you get two different results: 4.7321 and 1.5981.  If I had known about this before, I might&#8217;ve put more effort into learning Matlab programming, since it would have come in handy for a couple of assignments this semester.</p>
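<p style="text-align:justify;">For readers who know Python better than Matlab, the same two ideas look roughly like this.  The names <code>make_f</code> and <code>helper</code> are just Python-style stand-ins for makeF and makeFHelper.</p>

```python
import math

# A function name can be rebound and called, much like a function handle.
fh = math.sin
first = fh(math.pi)   # close to 0, up to floating-point rounding
fh = math.cos
second = fh(math.pi)  # -1.0


def make_f(a, b):
    """Return a function computing a * sin(x) + b, with a and b captured."""
    def helper(x):
        return a * math.sin(x) + b
    return helper


y1 = make_f(2, 3)
y2 = make_f(3, -1)
print(round(y1(math.pi / 3), 4))  # 4.7321
print(round(y2(math.pi / 3), 4))  # 1.5981
```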
]]></content:encoded>
			<wfw:commentRss>http://mendicantbug.com/2008/05/15/pythonic-matlab/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/ebec6abd2b9f1eb4de865aed01242171?s=96&#38;d=monsterid&#38;r=PG" medium="image">
			<media:title type="html">ealdent</media:title>
		</media:content>
	</item>
		<item>
		<title>Python ecstasy</title>
		<link>http://mendicantbug.com/2007/12/05/python-ecstacy/</link>
		<comments>http://mendicantbug.com/2007/12/05/python-ecstacy/#comments</comments>
		<pubDate>Wed, 05 Dec 2007 05:50:44 +0000</pubDate>
		<dc:creator>Jason Adams</dc:creator>
				<category><![CDATA[humor]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[programming languages]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[python love]]></category>
		<category><![CDATA[xkcd]]></category>

		<guid isPermaLink="false">http://mendicantbug.com/2007/12/05/python-ecstacy/</guid>
		<description><![CDATA[From the most excellent xkcd: I felt the exact same way when I first picked up python.  It was like finding the holy grail of programming languages.  To be able to just throw things into a list and access them without having to worry about casting.  To throw around functions like they were variables.  To [...]]]></description>
			<content:encoded><![CDATA[<p align="justify">From the most excellent <a href="http://www.xkcd.com" target="_blank">xkcd</a>:</p>
<p align="center"><a href="http://xkcd.com/353/" target="_blank"><img src="http://imgs.xkcd.com/comics/python.png" width="490" /></a></p>
<p align="justify">I felt the exact same way when I first picked up <a href="http://www.python.org/" target="_blank">Python</a>.  It was like finding the holy grail of programming languages.  To be able to just throw things into a list and access them without having to worry about casting.  To throw around functions like they were variables.  To weave functions out of thin air and watch them vanish when their usefulness had expired.  It was magic.</p>
<p align="justify">Of course, the honeymoon faded.  I still use Python as a first resort.  As a programming language for exploring new ideas, it can&#8217;t be beaten.  Development is ridiculously fast.  There has been effort to get the runtime performance up to snuff as well, but with much reluctance I&#8217;m forced to admit it doesn&#8217;t compare to C or even Java, may God have mercy on my soul.  Granted, it all depends on the application, blah blah blah.</p>
<p align="justify">Despite all that, I still love it.  It&#8217;s definitely first in my heart as far as programming languages go.</p>
]]></content:encoded>
			<wfw:commentRss>http://mendicantbug.com/2007/12/05/python-ecstacy/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/ebec6abd2b9f1eb4de865aed01242171?s=96&#38;d=monsterid&#38;r=PG" medium="image">
			<media:title type="html">ealdent</media:title>
		</media:content>

		<media:content url="http://imgs.xkcd.com/comics/python.png" medium="image" />
	</item>
		<item>
		<title>Simple Cellular Automata</title>
		<link>http://mendicantbug.com/2007/10/28/simple-cellular-automata/</link>
		<comments>http://mendicantbug.com/2007/10/28/simple-cellular-automata/#comments</comments>
		<pubDate>Sun, 28 Oct 2007 17:53:13 +0000</pubDate>
		<dc:creator>Jason Adams</dc:creator>
				<category><![CDATA[books]]></category>
		<category><![CDATA[cellular automata]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[computer science]]></category>
		<category><![CDATA[experiments]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[wolfram]]></category>

		<guid isPermaLink="false">http://mendicantbug.com/2007/10/28/simple-cellular-automata/</guid>
		<description><![CDATA[So I&#8217;ve been reading A New Kind of Science by Stephen Wolfram, the creator of Mathematica. It was hyped up big time back when he first wrote it, since he had gone silent for a number of years, hinting that he was about to do something big. So my middle little sister got me the [...]]]></description>
			<content:encoded><![CDATA[
<p align="justify">So I&#8217;ve been reading <a href="http://www.amazon.com/gp/redirect.html?ie=UTF8&amp;location=http%3A%2F%2Fwww.amazon.com%2FNew-Kind-Science-Stephen-Wolfram%2Fdp%2F1579550088%3Fie%3DUTF8%26s%3Dbooks%26qid%3D1193591006%26sr%3D8-1&amp;tag=themenbug-20&amp;linkCode=ur2&amp;camp=1789&amp;creative=9325">A New Kind of Science</a><img src="http://www.assoc-amazon.com/e/ir?t=themenbug-20&amp;l=ur2&amp;o=1" style="border:medium none !important;margin:0 !important;" border="0" height="1" width="1" /> by Stephen Wolfram, the creator of Mathematica.  It was hyped up big time back when he first wrote it, since he had gone silent for a number of years, hinting that he was about to do something <strong>big<em>.</em></strong>  So my middle little sister got me the book for Christmas (cuz she rocks) and I cracked it open a few times.  It&#8217;s about 846 pages of text (yipes!) and then another 351 pages of notes.  Quite daunting.  So I put it down and have meant to pick it back up a thousand times.  Today I was needing a diversion because a particular C++ issue was giving me fits.</p>
<p align="justify">&nbsp;</p>
<p align="justify">In Chapter 2, Wolfram introduces a fairly simple 2-dimensional cellular automaton (one spatial dimension, one temporal dimension).  The temporal dimension can be plotted as another spatial dimension, producing a nice little spreadsheet-style graph.  Each cell of the graph can be considered a bit.  Depending on whether the bit is set, the cell is either shaded or not.  The single line in the spatial dimension starts with some initial setting.  Let&#8217;s say there is one single bit set in the middle of the line, so it might look like this:</p>
<p align="center"> 000000000010000000000</p>
<p align="left"><span id="more-259"></span></p>
<p align="justify"> To move to the next time step, we must look at the neighborhood of each cell.  A neighborhood of length 3 for the middle cell would be [010].  Another way of thinking about this neighborhood is as the bits of a number, with the leftmost cell as the most significant bit.  Here we have three bits so we can represent the numbers 0-7.  For each possible setting, we can create a rule.  The rule simply says whether the cell&#8217;s bit is set or not in the next time step.  So let&#8217;s use the following rules:</p>
<ul>
<li>0 =&gt; 0</li>
<li>1 =&gt; 1</li>
<li>2 =&gt; 0</li>
<li>3 =&gt; 1</li>
<li>4 =&gt; 1</li>
<li>5 =&gt; 0</li>
<li>6 =&gt; 1</li>
<li>7 =&gt; 0</li>
</ul>
<p align="justify">As we step through our board, we look at the neighborhood of three for each cell.  Let&#8217;s assume wrapping occurs at the edges, so for the first cell, the neighborhood is [000].  The neighborhoods are all zeros until we get to the 10th cell (bolded, neighborhood underlined):</p>
<p align="center">00000000<u>0<strong>0</strong>1</u>0000000000</p>
<p align="justify"> The neighborhood now is [001], which corresponds to rule 1.  So we output 1 in our new line.  The next neighborhood is [010], which corresponds to rule 2.  So we output 0.  Next we have [100], which is rule 4, so we output 1.  After we&#8217;re done processing this line, we have the new line at time step 2:</p>
<p align="center">000000000101000000000</p>
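<p align="justify">The step just described can be sketched in a few lines of Python 3.  This is only an illustration of the rules listed above, not the script posted further down; the names <code>RULES</code>, <code>step</code>, <code>to_cells</code> and <code>to_text</code> are made up for this example.</p>

```python
# Rule table from the list above: neighborhood value (0-7) -> new cell value.
RULES = {0: 0, 1: 1, 2: 0, 3: 1, 4: 1, 5: 0, 6: 1, 7: 0}


def step(cells):
    """Return the next line of cells, wrapping at the edges."""
    n = len(cells)
    new_cells = []
    for i in range(n):
        left = cells[(i - 1) % n]
        center = cells[i]
        right = cells[(i + 1) % n]
        # Read the three cells as a 3-bit number, leftmost cell most significant.
        value = 4 * left + 2 * center + right
        new_cells.append(RULES[value])
    return new_cells


def to_cells(text):
    """Turn a string of 0s and 1s into a list of ints."""
    return [int(ch) for ch in text]


def to_text(cells):
    """Turn a list of ints back into a string of 0s and 1s."""
    return "".join(str(c) for c in cells)


line = to_cells("000000000010000000000")
print(to_text(step(line)))  # prints 000000000101000000000
```

<p align="justify">Calling <code>step</code> again on its own output produces each later time step.</p>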
<p> Starting from the initial line and iterating ten times, we begin to notice a pattern emerging:</p>
<p align="center"><code>000000000010000000000<br />
000000000101000000000<br />
000000001000100000000<br />
000000010101010000000<br />
000000100000001000000<br />
000001010000010100000<br />
000010001000100010000<br />
000101010101010101000<br />
001000000000000000100<br />
010100000000000001010<br />
100010000000000010001</code></p>
<p>Actually, it&#8217;s probably harder to notice in this crappy text form, so here is the graphical representation:
</p>
<p style="text-align:center;"><img src="http://ealdent.files.wordpress.com/2007/10/sierpca.png?w=614" alt="Sierpinski triangle made with cellular automata" /></p>
<p align="justify">So with just those simple rules we could construct a fractal known as the Sierpinski triangle.  Pretty frickin sweet.  In Wolfram&#8217;s nomenclature, this is cellular automaton 90 (since the rule outputs form the binary number 01011010 = 90).  I hacked together some quick python code to display such cellular automata given the size of the neighborhood and the Wolfram number for the automaton you want.</p>
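<p align="justify">To make the numbering concrete, here is a small Python 3 sketch that goes the other way and unpacks an automaton number into its rule table.  The function name <code>rule_table</code> is invented for this illustration; the only assumption is the convention described above, where bit <em>k</em> of the number is the output for neighborhood value <em>k</em>.</p>

```python
def rule_table(number, neighborhood_size=3):
    """Decode an automaton number into {neighborhood value: output bit}."""
    table = {}
    for value in range(2 ** neighborhood_size):
        # Bit `value` of the automaton number is the output for that neighborhood.
        table[value] = (number // (2 ** value)) % 2
    return table


table = rule_table(90)
for value in sorted(table):
    print(value, "=>", table[value])
```

<p align="justify">For 90 this prints the same eight rules listed earlier in the post.</p>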
<p align="justify">&nbsp;</p>
<p align="justify">So to run the code below for the Sierpinski triangle for 40 cycles (recommended), you&#8217;d use the format:</p>
<p align="justify">&nbsp;</p>
<p align="center">python ca.py 3 90 40</p>
<p align="justify">&nbsp;</p>
<p align="justify">Download it <a href="http://www.cs.cmu.edu/~jmadams/source/ca.py" title="Cellular automata script in python following Wolfram's description in Chapter 2 of A New Kind of Science" target="_blank">here</a> since this sourcecode plugin isn&#8217;t highly reliable and may have bugs in the output it shows you.</p>
<p><pre class="brush: python;">
# cellular automaton

__version__ = &quot;0.1&quot;
__author__ = &quot;Jason Michael Adams (jmadams@cs.cmu.edu)&quot;

import math
import sys

class CellularAutomaton(object):
   &quot;&quot;&quot;Usage: CellularAutomaton(boardSize, ruleSize, automatonNumber)

   &quot;&quot;&quot;

   def __init__(self, size, rule_size, effects, initial_board=None,alpha=None):
      assert size % 2 == 1, &quot;Size of board must be odd.&quot;
      self.size = size
      self.time = 0
      self.rule_size = rule_size
      rules = list()
      for x in xrange(2 ** rule_size):
         rules.append(make_binary_rep(x, rule_size))
      rules.reverse()
      ex = make_binary_rep(effects, 2 ** rule_size)
      self.rules = RuleSet(rules, ex)
      self.board = dict()
      if alpha is None:
         self.alpha = [0,1]
      else:
         self.alpha = alpha

      self.board[0] = self._init_board(initial_board)
      if initial_board is None:
         self.board[0][self.size / 2] = self.alpha[1]

   def increment(self):
      self.time += 1
      self.board[self.time] = self._init_board()

      for x in xrange(self.size):
         context = list()
         rn = self.rule_size / 2
         for y in xrange(self.rule_size):
            context.append(self.board[self.time - 1][(x + y - rn) % self.size])

         self.board[self.time][x] = self.rules * context

      return self.board[self.time]

   def run(self, cycles):
      for x in xrange(cycles):
         tmp = self.increment()

      print self

   def _init_board(self, initial_board=None):
      if initial_board is None:
         board = list()
         for x in xrange(self.size):
            board.append(self.alpha[0])
         board[self.size / 2] = self.alpha[1]
      else:
         assert len(initial_board) == self.size, &quot;Size of initial board setting must be equal to size of board.&quot;
         board = initial_board

      return board

   def __repr__(self):
      keys = self.board.keys()
      keys.sort()
      tmps = list()
      for key in keys:
         tmps.append(self.__repb__(key))
      return '\n'.join(tmps)

   def __repb__(self, b):
      board = self.board[b]
      tmp = &quot;&quot;
      for x in xrange(len(self.board[b])):
         tmp += str(self.board[b][x])
      return tmp

class Rule(object):
   &quot;&quot;&quot;Usage:  Rule(rule, effect)

   The rule parameter is an odd-length list indicating the
   prior pattern that will trigger the single value in
   effect.  The effect parameter should be anything that
   has a string representation.
   &quot;&quot;&quot;

   def __init__(self, rule, effect):
      assert len(rule) % 2 == 1, &quot;Size of rule must be odd.&quot;
      self.rule = rule
      self.effect = effect
      self.size = len(rule)

   def __mul__(self, context):
      assert len(context) == self.size, &quot;Size of context must be equal to size of the rule (%d).&quot; % (self.size)

      accept = True

      for x in xrange(len(context)):
         if self.rule[x] != context[x]:
            accept = False
            break

      if accept is True:
         return self.effect
      else:
         return None

   def __repr__(self):
      return &quot;%s =&gt; %s&quot; % (str(self.rule), self.effect)

class RuleSet(object):
   &quot;&quot;&quot;A set of Rule objects that can be applied to
   a given context.
   &quot;&quot;&quot;

   def __init__(self, rules, effects):
      self.rules = list()
      for rx in xrange(len(rules)):
         r = Rule(rules[rx], effects[rx])
         self.rules.append(r)

   def __mul__(self, context):
      middle = context[len(context) / 2]
      for rule in self.rules:
         outp = rule * context
         if outp is not None:
            return outp
      return middle

   def __repr__(self):
      return str(self.rules)

def make_binary_rep(num, size=0):
   outp = list()
   if size &lt;= 0:
      # default to the number of bits needed to represent num (at least one)
      size = 1
      while 2 ** size &lt;= num:
         size += 1
   for x in xrange(size - 1, -1, -1):
      if (num &amp; (2 ** x)) &gt; 0:
         outp.append(1)
      else:
         outp.append(0)
   return outp

if __name__ == '__main__':
   if len(sys.argv) &lt; 4:
      print &quot;Usage: %s &lt;neighborhood size&gt; &lt;automaton number&gt; &lt;# cycles&gt;&quot; % (sys.argv[0])
      sys.exit(0)
   nhsize  = int(sys.argv[1])
   autonum = int(sys.argv[2])
   cycles  = int(sys.argv[3])
   ca = CellularAutomaton(81, nhsize, autonum)
   ca.run(cycles)</pre></p>
<p align="justify">This is some pretty simple stuff, but I figured I&#8217;d post it in case anyone wanted a starting point or just something to play with.  There are also a number of things you can do to increase speed that I didn&#8217;t bother with.  Please report bugs, etc.</p>
]]></content:encoded>
			<wfw:commentRss>http://mendicantbug.com/2007/10/28/simple-cellular-automata/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/ebec6abd2b9f1eb4de865aed01242171?s=96&#38;d=monsterid&#38;r=PG" medium="image">
			<media:title type="html">ealdent</media:title>
		</media:content>

		<media:content url="http://www.assoc-amazon.com/e/ir?t=themenbug-20&#38;l=ur2&#38;o=1" medium="image" />

		<media:content url="http://ealdent.files.wordpress.com/2007/10/sierpca.png" medium="image">
			<media:title type="html">Sierpinski triangle made with cellular automata</media:title>
		</media:content>
	</item>
		<item>
		<title>Substitution Ciphers</title>
		<link>http://mendicantbug.com/2007/10/04/substitution-ciphers/</link>
		<comments>http://mendicantbug.com/2007/10/04/substitution-ciphers/#comments</comments>
		<pubDate>Thu, 04 Oct 2007 18:36:56 +0000</pubDate>
		<dc:creator>Jason Adams</dc:creator>
				<category><![CDATA[ciphers]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[cryptograms]]></category>
		<category><![CDATA[cryptography]]></category>
		<category><![CDATA[encryption]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[substitution cipher]]></category>

		<guid isPermaLink="false">http://mendicantbug.com/2007/10/04/substitution-ciphers/</guid>
		<description><![CDATA[For one of my homework assignments, I have to solve words encrypted via a substitution cipher. These ciphers were insecure before computers came around, but they are still fun. If you&#8217;re unfamiliar with them, you&#8217;d probably recognize them as the cryptograms (&#8220;Cryptoquote&#8221;) in your local newspaper. In the simplest form, each letter is mapped to [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=mendicantbug.com&amp;blog=1474857&amp;post=198&amp;subd=ealdent&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>For one of my homework assignments, I have to solve words encrypted via a substitution cipher.  These ciphers were insecure before computers came around, but they are still fun.  If you&#8217;re unfamiliar with them, you&#8217;d probably recognize them as the cryptograms (&#8220;Cryptoquote&#8221;) in your local newspaper.  In the simplest form, each letter is mapped to a different letter of the alphabet.  A lot of people do these for fun and I know at least <a href="http://melinda.theweatherses.org/" title="Melinthropy" target="_blank">one person</a> reading this does.  The result is a run of text that might look like:</p>
<p align="center"> <tt>ov umy rfgs f nmg</tt><br />
<tt>MY DOG EATS A LOT</tt></p>
<p>There are many ways of going about solving substitution ciphers, but a common way is by counting frequencies of characters.  As most people know, <em>e</em> tends to occur more than other letters in most written English.  The rest of the letters typically follow a pattern as well, but that pattern degenerates once you leave the most common letters.  The domain of the text you are examining is fundamentally important here.  By domain I mean whether this text is from a newspaper, an IM, transcribed speech, etc.  You can also look at bigrams, two-character sequences, to find the most commonly appearing sequences.  In English, <em>th</em> appears much more often than <em>ty</em>, but <em>ty</em> still occurs.  When trying to solve substitution ciphers this way, you are essentially matching the frequency distribution of the cipher text to the distribution of English and building a mapping from there.  To put that a different way, you are matching up the most common letters or sequences in the garbled text with the most common real English letters or sequences.</p>
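<p>As a rough illustration of the counting step, here is a minimal Python 3 sketch that tallies single letters and within-word bigrams in a piece of ciphertext.  The function names are made up for this example, and a sample as short as the one shown is far too small to say anything about English.</p>

```python
from collections import Counter


def letter_frequencies(text):
    """Count how often each letter occurs, ignoring case and non-letters."""
    return Counter(ch for ch in text.lower() if ch.isalpha())


def bigram_frequencies(text):
    """Count two-letter sequences that occur inside words."""
    counts = Counter()
    for word in text.lower().split():
        letters = [ch for ch in word if ch.isalpha()]
        for first, second in zip(letters, letters[1:]):
            counts[first + second] += 1
    return counts


cipher = "ov umy rfgs f nmg"
print(letter_frequencies(cipher).most_common(3))
print(bigram_frequencies(cipher).most_common(3))
```

<p>With a longer ciphertext, the most common entries here are the ones you would line up against the most common letters and bigrams of English.</p>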
<p>Once the frequency counts have revealed the most common letters, many people proceed by deduction to work out the rest.  Of course, this requires knowledge of English words directly, which has an impact on computational approaches to solving substitution ciphers automatically.  I&#8217;m curious what approaches people have taken (if any) other than using a dictionary of English words and trying to find matches from there.</p>
<p><span id="more-198"></span><br />
So, I&#8217;ll talk more about this later, after the homework assignment is due, but there is an interesting connection here with finite state transducers.  Here is a little bit of python code to do some simple stuff with substitution ciphers in case you want to play around.  I have found python a very handy tool in decrypting these ciphers.  Anyone have any other tool they like and want to share?</p>
<p><pre class="brush: python;">
import random
import string


def decrypt(words, charmap):
    # Given the ciphertext in lowercase (words) and a
    # mapping of ciphertext to upper case plaintext
    # (charmap), return the transformed string.
    # Characters that are not in charmap (spaces,
    # punctuation) pass through unchanged.
    return "".join(charmap.get(ch, ch) for ch in words)


def encrypt(words, charmap):
    # begin by making a reverse map (plaintext to ciphertext)
    rev_charmap = dict()
    for key in charmap:
        rev_charmap[charmap[key]] = key

    return decrypt(words, rev_charmap)


def make_random_charmap():
    # make the list of uppercase letters and shuffle it
    letters = list(string.ascii_uppercase)
    random.shuffle(letters)
    # pair each lowercase letter with one shuffled uppercase letter
    charmap = dict()
    for x in range(26):
        charmap[chr(ord('a') + x)] = letters[x]

    return charmap
</pre></p>
<p>And I&#8217;m only semi-fond of this plugin for source code.  It does some weird crap and is really temperamental.</p>
]]></content:encoded>
			<wfw:commentRss>http://mendicantbug.com/2007/10/04/substitution-ciphers/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/ebec6abd2b9f1eb4de865aed01242171?s=96&#38;d=monsterid&#38;r=PG" medium="image">
			<media:title type="html">ealdent</media:title>
		</media:content>
	</item>
		<item>
		<title>Ultrafast thumb-only keyboard</title>
		<link>http://mendicantbug.com/2007/09/28/ultrafast-thumb-only-keyboard/</link>
		<comments>http://mendicantbug.com/2007/09/28/ultrafast-thumb-only-keyboard/#comments</comments>
		<pubDate>Fri, 28 Sep 2007 14:14:42 +0000</pubDate>
		<dc:creator>Jason Adams</dc:creator>
				<category><![CDATA[code]]></category>
		<category><![CDATA[computational linguistics]]></category>
		<category><![CDATA[corpora]]></category>
		<category><![CDATA[english]]></category>
		<category><![CDATA[kannuu]]></category>
		<category><![CDATA[keyboard]]></category>
		<category><![CDATA[language model]]></category>
		<category><![CDATA[n-grams]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[startup]]></category>
		<category><![CDATA[startups]]></category>
		<category><![CDATA[typing]]></category>

		<guid isPermaLink="false">http://mendicantbug.com/2007/09/28/ultrafast-thumb-only-keyboard/</guid>
		<description><![CDATA[In a recent press release, kannuu is claiming to have revolutionized text entry. They claim that you can now perform text entry with just your thumb at the same speed of a regular keyboard. Too good to be true? Here is their method, complete with Hype™. &#8220;Advancing text entry exponentially, kannuu’s powerful and precise Partial [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=mendicantbug.com&amp;blog=1474857&amp;post=173&amp;subd=ealdent&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>In a recent press release, <a href="http://kannuu.com" title="kannuu" target="_blank">kannuu </a>is claiming to have revolutionized text entry.  They claim that you can now perform text entry with just your thumb at the same speed of a regular keyboard.  Too good to be true?   Here is their method, complete with Hype™.</p>
<blockquote><p>&#8220;Advancing text entry exponentially, kannuu’s powerful and precise Partial Word Completion® technology enables users with a fail-safe text entry solution. The kannuu application appears on device, as a four-point diamond shape, comprised of the most popular letters in the database it is indexing, with the center kannuu logo leading to the next set of choices.&#8221;</p></blockquote>
<p>They registered a trademark on the phrase &#8220;partial word completion&#8221;??  Blerg.  Not only do they have an über lame web 2.0 name in lowercase, they gotta stop people from marketing a similar technology under their oh-so-not-original name.  Why does this make me so angry?  Anyhow, I&#8217;m running off sideways on a rant that is pretty insignificant.</p>
<p>The real point here is the potential for coolness.  So here is the technology:  you enter a letter, it presents you with a &#8220;diamond&#8221; shape and the most common letters or group of letters that follow the letter(s) you just entered.  In this way, most of your everyday phrases will be right up at the top of the list of things you&#8217;re presented so you could potentially be entering words with fewer keystrokes and all with very little thumb movement.  This could really revolutionize key input and maybe bring pocket computers to reality [<a href="http://blog.sciam.com/index.php?title=the_keyboard_is_dead_long_live_whatever&amp;more=1&amp;c=1&amp;tb=1&amp;pb=1" title="Scientific American blog" target="_blank">source</a>].</p>
<p>So here is what I think the technology is based on.  A very common technique in language technologies is the use of n-grams.   So they use a character-based n-gram model to predict the most common letter or letters that you would type next based on some corpus.  This isn&#8217;t anything new.  Cell phones already have a T9 input method that guesses the most common word based on the single letters you choose.  This isn&#8217;t all that different.  If they have done the interface well, that could be a serious improvement.</p>
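<p>To make that guess concrete, here is a minimal Python 3 sketch of next-letter prediction from character bigram counts.  It is not kannuu&#8217;s actual method, which I don&#8217;t know; the function names and the tiny corpus are made up, and the limit of four suggestions is just a nod to the four-point diamond.</p>

```python
from collections import Counter, defaultdict


def build_next_letter_model(corpus):
    """Count, for each letter, which letters follow it in the corpus."""
    model = defaultdict(Counter)
    text = corpus.lower()
    for current, following in zip(text, text[1:]):
        if current.isalpha() and following.isalpha():
            model[current][following] += 1
    return model


def suggest(model, letter, how_many=4):
    """Return up to how_many of the most frequent followers of a letter."""
    return [ch for ch, _ in model[letter].most_common(how_many)]


corpus = "the thumb then hit the other thing"
model = build_next_letter_model(corpus)
print(suggest(model, "t"))  # prints ['h']
print(suggest(model, "h"))  # prints ['e', 'i', 'u']
```

<p>A real system would presumably condition on more than one preceding letter and train on a much larger corpus.</p>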
<p>If you&#8217;re interested in character-based n-gram models, I go into them in more depth after the jump.</p>
<p><span id="more-173"></span><br />
Think about this.  In English, the most common letter is <em>e</em>.  If you don&#8217;t believe me, count the letters in the paragraph above that starts with &#8220;the real point&#8221;.</p>
<table align="center" border="1">
<tr>
<th>Letter</th>
<th>Count</th>
<th>Letter</th>
<th>Count</th>
<th>Letter</th>
<th>Count</th>
</tr>
<tr>
<td>e</td>
<td>60</td>
<td>h</td>
<td>21</td>
<td>c</td>
<td>8</td>
</tr>
<tr>
<td>t</td>
<td>53</td>
<td>a</td>
<td>19</td>
<td>f</td>
<td>7</td>
</tr>
<tr>
<td>o</td>
<td>41</td>
<td>y</td>
<td>17</td>
<td>g</td>
<td>6</td>
</tr>
<tr>
<td>r</td>
<td>30</td>
<td>u</td>
<td>16</td>
<td>b</td>
<td>5</td>
</tr>
<tr>
<td>s</td>
<td>27</td>
<td>p</td>
<td>13</td>
<td>v</td>
<td>4</td>
</tr>
<tr>
<td>l</td>
<td>25</td>
<td>d</td>
<td>11</td>
<td>k</td>
<td>4</td>
</tr>
<tr>
<td>i</td>
<td>24</td>
<td>m</td>
<td>10</td>
<td>z</td>
<td>1</td>
</tr>
<tr>
<td>n</td>
<td>22</td>
<td>w</td>
<td>8</td>
<td>j</td>
<td>1</td>
</tr>
</table>
<p>You can take these counts and get probabilities for the occurrences of each letter.  These probabilities are useful because they give you a general idea of how often each letter occurs in English.  Of course, the corpus I&#8217;m pulling them from is way too small to be representative of all of English, but it&#8217;s a start.  Since we are looking at the probability of only one character, we call this a unigram model (1-gram).  We can generate English text with this unigram model by sampling from the distribution.  Some python code for that is below:</p>
<p><pre class="brush: python;">
# code for generating from a toy unigram character-based language model for English
import random

letter_counts = [(1, 'j'), (1, 'z'), (4, 'k'), (4, 'v'),
(5, 'b'), (6, 'g'), (7, 'f'), (8, 'c'),
(8, 'w'), (10, 'm'), (11, 'd'), (13, 'p'),
(16, 'u'), (17, 'y'), (19, 'a'), (21, 'h'),
(22, 'n'), (24, 'i'), (25, 'l'), (27, 's'),
(30, 'r'), (41, 'o'), (53, 't'), (60, 'e'),
(93, ' ')]

d = [letter_counts[0]]      # initialize the sampling list to the first item in the list of counts
tmp = ""

for i in xrange(1, len(letter_counts), 1):
    d.append((d[i-1][0] + letter_counts[i][0], letter_counts[i][1]))
for i in xrange(526):
    r = random.randint(0,525)
    for letter_pair in d:
        if r &lt; letter_pair[0]:
            tmp += letter_pair[1]
            break

print tmp
</pre></p>
<p>So we take the list containing the counts of the letters (and the space character) and then create a new list of tuples.  The first element of each tuple is a running total: the sum of the counts of all items before it in the list plus the count for its own character, which is the second element of the tuple.  Then, for each output character, we choose a random number between 0 and 525 (the counts add up to 526) and walk through the list until we reach the first tuple whose running total is greater than the random number; that tuple&#8217;s letter is added to the output string.  This way each character is picked with probability proportional to its count.   The idea is that if we have chosen our probabilities correctly, we should get something that looks like English:</p>
<blockquote><p>  h spiyoweytnt berkctpllittdulwa nhesh oensnuvuwgyysyeamcgt fir v oe obnstotitres ttelsdmt e hr  liruorteevunaet uwyooe trewstpaayssers  onyewiasne r  oae yceu ruiz lorpse hg otderdretuk otlm odulloetnssaio   wtygre ro      gh eietvyrryeteetoat  au  y  r  vpte peascotog reiol cth enchl rpsn rtge saldi tngai iuentbdysbte eagrog nuee wetemsr srultnomisojhteolrtu t neyerhul io oe eeror dalts lr ib somunerno  situlcihi e a j ec eotutnerrnu  dpl   io prntb t yli  htsttiwmyop   oydstwihn h loeebleeh ovebnrtoyldothdgo  yto l bi</p></blockquote>
<p>And of course, it looks nothing like English.  The problem is that it doesn&#8217;t take the context of the letters into account.   If you see a word that starts with <em>t</em>, what do you think the next letter will be?  Probably <em>h</em>, right?  And what&#8217;s the next letter after that?  <em>E.  </em>A lot of other letters co-occur in English, like <em>ch</em> and <em>ou</em>, to name a couple.  So we can build a bigram model and this gets a bit closer to generating English-like words.  The problem then becomes sparsity.  We need a bigger corpus to accurately compute the probabilities for the bigram (2-gram) character based language model.  But here is what a bigram model would generate using this data:</p>
<blockquote><p> bey temm onowid eryobethigzee led hiorury tethp s nil sotr t iokd ngayle tt ndu s shtthefey ord  tus co ere rsbemotemo cetntpoioouayinpuybeskeenlievngrsen yp e  assfoan lhepud keleolerme ketlehuh or mene  pndy ng ds keoc eerryt ttris hea thea tjug s anhee li h l ye sthillked ou aiz iheer mu  aomheurt  se be ct s ompulun urhrasdserrs t cth the rttrireese  s has dtlo  lurd t hnu luesetnptellamofery rstye  hleoug hendisluupntwore sbethumouulf d r leitd ketib e zes io wu amerf rooua ketyomd al phesteaeyryntofe ioayalougyiti</p></blockquote>
<p>So, it looks a little better, but it&#8217;s still clogged with junk.  You&#8217;ll never get a model that produces all English words this way, but when you analyze enough English, you can reasonably predict the most common set of next letters you&#8217;ll use.  There are plenty of other methods for doing this and improving the results.</p>
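<p>For anyone who wants to try the bigram version, here is a minimal Python 3 sketch.  It is not the exact code that produced the sample above; the function names are made up, the corpus is a placeholder sentence, and it leans on <code>random.choices</code> for the weighted sampling.</p>

```python
import random
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """Map each character to a Counter of the characters that follow it."""
    model = defaultdict(Counter)
    for current, following in zip(corpus, corpus[1:]):
        model[current][following] += 1
    return model


def generate(model, start, length, rng=random):
    """Generate text by repeatedly sampling the next character."""
    out = [start]
    for _ in range(length - 1):
        followers = model.get(out[-1])
        if not followers:
            break  # this character never had a successor in the corpus
        letters = list(followers.keys())
        weights = list(followers.values())
        # Pick the next character in proportion to how often it followed.
        out.append(rng.choices(letters, weights=weights)[0])
    return "".join(out)


corpus = "the real point here is the potential for coolness"
model = train_bigram_model(corpus)
print(generate(model, "t", 40, random.Random(0)))
```

<p>Every pair of neighboring characters in the output occurred somewhere in the corpus, which is why the result looks a bit more word-like than the unigram version.</p>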
<p>One of the things the SciAm blog mentioned was having to enter scientific text.  If they train the n-gram model using regular conversational English (or some corpus of email or something like that), scientific jargon is very unlikely to appear.  The result is a much slower entry process.  What they need to allow you to do is &#8220;create your own corpus.&#8221;  You add the documents that are best representative of the kind of stuff you&#8217;ll be writing and it builds a model from that.  Of course, it will need a lot of text to be really good, but it should start to work decently after a few hundred thousand words (or maybe even less).  I know you can detect which language a text is written in decently well with less than a hundred thousand words of training text per language.  If it allowed custom corpora for training, all it would need is the ability to switch language modes and you&#8217;re set.  Also, it could learn which choices you are more likely to make over time and start suggesting things that you choose more often.</p>
]]></content:encoded>
			<wfw:commentRss>http://mendicantbug.com/2007/09/28/ultrafast-thumb-only-keyboard/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/ebec6abd2b9f1eb4de865aed01242171?s=96&#38;d=monsterid&#38;r=PG" medium="image">
			<media:title type="html">ealdent</media:title>
		</media:content>
	</item>
		<item>
		<title>FreeAlert.org</title>
		<link>http://mendicantbug.com/2007/08/28/freealertorg/</link>
		<comments>http://mendicantbug.com/2007/08/28/freealertorg/#comments</comments>
		<pubDate>Tue, 28 Aug 2007 14:40:20 +0000</pubDate>
		<dc:creator>Jason Adams</dc:creator>
				<category><![CDATA[freealert]]></category>
		<category><![CDATA[friends]]></category>
		<category><![CDATA[non-profit]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[web design]]></category>

		<guid isPermaLink="false">http://mendicantbug.com/2007/08/28/freealertorg/</guid>
		<description><![CDATA[Last year I worked on a project with my friend Israel Kloss called FreeAlert. The site is not-for-profit and was originally intended to help refugees entering the Washington, DC area find things they need for free. It now covers major metropolitan areas all across the United States and is intended to benefit everyone. The idea [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=mendicantbug.com&amp;blog=1474857&amp;post=67&amp;subd=ealdent&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Last year I worked on a project with my friend <a href="http://www.wheresmystapler.net/" title="Israel Kloss - Where's my stapler?" target="_blank">Israel Kloss</a> called <a href="http://freealert.org" title="FreeAlert" target="_blank">FreeAlert</a>.  The site is not-for-profit and was originally intended to help refugees entering the Washington, DC area find things they need for free.  It now covers major metropolitan areas all across the United States and is intended to benefit everyone.</p>
<p>The idea is simple.  Enter some keywords and get matching free items from Craigslist for your city sent to your cell phone.  You can enter up to 5 sets of keywords, and each set has its own exclusion terms.  That way you can receive notices that contain the term <em>computer</em> but not the term <em>desk</em>.  Israel took the site out of private beta last week, and it is now in public beta.</p>
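<p>To make the keyword and exclusion idea concrete, here is a minimal sketch in Python.  It is not FreeAlert's actual code: the function name <em>matches</em>, the whole-word comparison, and the rule that every keyword in a set must appear are all assumptions made for illustration.</p>

```python
import re


def matches(title, keywords, exclusions=()):
    """Hypothetical matcher, not the site's real implementation.

    Assumes a listing matches when every keyword appears in the title
    as a whole word and none of the exclusion terms do.
    """
    # Lowercase the title and split it into words, ignoring punctuation.
    words = set(re.findall(r"[a-z0-9']+", title.lower()))
    has_keywords = all(k.lower() in words for k in keywords)
    has_excluded = any(x.lower() in words for x in exclusions)
    return has_keywords and not has_excluded


if __name__ == "__main__":
    print(matches("Free computer monitor", ["computer"], ["desk"]))
    print(matches("Free computer desk", ["computer"], ["desk"]))
```

<p>With the keyword <em>computer</em> and the exclusion term <em>desk</em>, the first listing would trigger a notice and the second would not.</p>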
<p>It was an interesting project for me because it gave me the chance to work in Python on some HTTP and SMTP code.  It also gave me the chance to work on processing XML and RSS feeds.  Definitely some cool stuff there, and it has resulted in a spin-off that will probably be up and running fairly soon.  Israel is one of those people with a lot of great ideas, and he has the personality to inspire you with them.  He is also one of those rare people who care enough about the suffering of others to try to do something about it, which you just have to admire.</p>
<p>So, please, check it out and let us know how we can make it better.</p>
]]></content:encoded>
			<wfw:commentRss>http://mendicantbug.com/2007/08/28/freealertorg/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/ebec6abd2b9f1eb4de865aed01242171?s=96&#38;d=monsterid&#38;r=PG" medium="image">
			<media:title type="html">ealdent</media:title>
		</media:content>
	</item>
	</channel>
</rss>
