Old English Translator

Posted: 8 November 2007 in Uncategorized

Phil Barthram recently announced a new Old English translator on the ENGLISC mailing list. For those unfamiliar with Old English, this is not the really cheap malt liquor. This is the grandmother of Modern English (by way of its mother, Middle English, with help from a few others, chiefly Norman French). Whereas an Olde English (the malt liquor) translator might look like this:

“You look pretty.”
“I’m trashed on cheap swill.”

an Old English (Anglo-Saxon) translator looks more like:

Nu sculon herigean heofonrices weard
Now we should praise the guardian of the kingdom of heaven

This is the first line of Cædmon’s Hymn. Check out the Wikipedia page for Cædmon to read the whole nine lines.

The tool lets the user look up inflected Old English words in a dictionary, so it is a word-for-word translator with no regard for context. In the realm of Old English machine translation, this is the first step. In the hierarchy of MT this is called a direct translation system, since it looks only at words and not at syntax or semantics when offering translations. A sister method is statistical machine translation, which looks at co-occurrence probabilities between the source and target languages (here, Old English and Modern English) to suggest word and phrase matches.
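To make "direct translation" concrete, here is a minimal Python sketch of a word-for-word lookup. It is not Phil’s code; the GLOSSARY table and the direct_translate function are hypothetical names, and the glosses are a hand-made toy sample rather than data from his tool.

```python
# Toy glossary mapping an inflected surface form to Modern English glosses.
# Assumption: these entries are illustrative only, not taken from Phil's data.
GLOSSARY = {
    "nu": ["now"],
    "sculon": ["must", "should"],
    "herigean": ["praise"],
    "heofonrices": ["of the kingdom of heaven"],
    "weard": ["guardian", "keeper"],
}


def direct_translate(sentence):
    """Gloss each word in isolation; no syntax or semantics is consulted."""
    result = []
    for word in sentence.lower().split():
        # A missing entry is reported as None, i.e. "failed to translate".
        result.append((word, GLOSSARY.get(word)))
    return result


for word, glosses in direct_translate("Nu sculon herigean heofonrices weard"):
    print(word, "->", glosses if glosses else "(failed to translate)")
```

Because every word is glossed independently, the output is a list of candidate meanings per word, and picking the right one in context is left to the reader.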

For a while now I’ve been considering working on such a system for Old English as a pet project. Lack of time is the major hurdle there. I’ve also been continuing (slowly) to work on a morphological analyzer for Old English verbs (and extending it to nouns, adjectives, etc.).

Phil handles morphology in the pre-processing phase. He has taken several Modern English to Old English (and vice versa) dictionaries and extracted the inflected forms from the format in which those dictionaries encode them. He then populates the database with each inflected form as a separate entry, tagged with the proper morphological information. At query time, he checks for variations in acute accents and also returns similar matches.
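The approach described above might be sketched in Python roughly as follows. This is a guess at the general shape, not Phil’s implementation: the function names are hypothetical, only the regular strong masculine noun endings are covered, and Python’s difflib stands in for whatever "similar match" logic he actually uses.

```python
import difflib
import unicodedata

# Regular strong masculine noun endings (e.g. weard, weardes, weardas).
# Assumption: a real system needs one such table per declension class.
STRONG_MASC_ENDINGS = [
    ("", "Nominative Singular"),
    ("", "Accusative Singular"),
    ("es", "Genitive Singular"),
    ("e", "Dative Singular"),
    ("as", "Nominative Plural"),
    ("as", "Accusative Plural"),
    ("a", "Genitive Plural"),
    ("um", "Dative Plural"),
]


def strip_accents(word):
    """Lowercase and drop combining marks, so 'wéard' matches 'weard'."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_index(entries):
    """Pre-processing: store every inflected form as its own tagged entry."""
    index = {}
    for stem, gloss in entries:
        for ending, tag in STRONG_MASC_ENDINGS:
            key = strip_accents(stem + ending)
            index.setdefault(key, []).append((stem, tag, gloss))
    return index


def lookup(index, word, max_similar=3):
    """Query time: accent-insensitive exact hits plus near-miss spellings."""
    key = strip_accents(word)
    exact = index.get(key, [])
    close = difflib.get_close_matches(key, list(index), n=max_similar, cutoff=0.8)
    similar = [form for form in close if form != key]
    return exact, similar


index = build_index([("weard", "guardian")])
print(lookup(index, "weardes"))
```

Doing the expansion up front keeps the query itself a plain dictionary lookup, at the cost of a larger table and of having to encode each declension pattern by hand.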

Unfortunately (for me), he does not intend to release the code as open source. I certainly understand his desire to keep this as his own project, and he has put a lot of work into it. I hope at some point he’ll release the data as an XML dictionary. One of the problems with open source projects for the hobbyist is the sudden overhead of managing the project, combined with the fact that niche creations like this don’t attract very many collaborators. So you get more work with no benefit.

Using Phil’s translator to translate the first line of Cædmon’s hymn:

nu Interjection
Derived from: lo! behold! come! ~ lá now

sculon (failed to translate)

herigean (failed to translate)

heofon Masculine Noun – irregular ending
Derived from: heofon m (-es/heofenas) f (-e/-a) sky firmament heaven the power of heaven
Case(s) with this inflected ending:
> Nominative Singular
> Accusative Singular

rices (failed to translate; this is the genitive form of rice, which did translate)
rice Strong Neuter Noun
Derived from: rice n (-es/-) BT la. add :– On middeweardum hire rice hió getimbrede
Case(s) with this inflected ending:
> Nominative Singular
> Nominative Plural
> Accusative Singular
> Accusative Plural

weard Strong Masculine Noun
Derived from: weard m (-es/-as) keeper watchman guard guardian protector 2 lord king 2 possessor
Case(s) with this inflected ending:
> Nominative Singular
> Accusative Singular

So Phil still has some work to do (and I know he’s going to be working on cleaning up the UI, since up to now he’s been focused on the underlying stuff).
