Greetings,
One typical way to encode dictionary entries is, as you suggest, a
tree data structure (though not a binary one). In this approach there is
no redundancy: words that are prefixes of other words share the
same storage.
The encoding for the word 'play' would consist of a series of 4 connected
nodes.
p -> l -> a -> y
The 4th node, containing the letter 'y', would have additional nodes attached
to it for words like 'playing', 'plays' and 'played', but the original 4
letter sequence 'play' would only be stored one time.
p -> l -> a -> y
               -> e -> d
                    -> r
               -> i -> n -> g
               -> s
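A structure like this is commonly called a trie. Here is a minimal sketch in Python of the sharing shown in the diagram, assuming each node is simply a dict from a letter to its child node. The names insert and count_nodes are my own for illustration, not taken from any particular spell checker.

```python
# Minimal trie sketch: every node is a dict mapping a letter to a child node.
def insert(root, word):
    """Walk down from the root, creating nodes only where none exist yet."""
    node = root
    for letter in word:
        node = node.setdefault(letter, {})


def count_nodes(node):
    """Count every letter node below the given node."""
    return sum(1 + count_nodes(child) for child in node.values())


words = ["play", "played", "player", "playing", "plays"]
root = {}
for word in words:
    insert(root, word)

letters_written_out = sum(len(word) for word in words)  # letters if stored separately
nodes_in_tree = count_nodes(root)                       # letters actually stored
print(letters_written_out, nodes_in_tree)
```

The five words contain 28 letters between them, but the tree holds only 11 nodes, because 'p', 'l', 'a', 'y' (and the 'e' shared by 'played' and 'player') are stored once.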
Additional information beyond the letter value may be stored in the nodes,
including for instance a code that indicates that the 'y' node is a valid
terminating one, in effect establishing that 'play' is a valid
entry in the dictionary (and presumably the language)...
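To make the terminating code concrete, here is a rough Python sketch in which each node carries a flag saying whether a word may end there. The class Node, the flag is_word and the functions add_word and is_valid are hypothetical names of mine; a real spell checker would likely pack this into a more compact binary form.

```python
class Node:
    """One letter position in the tree."""

    def __init__(self):
        self.children = {}    # letter -> Node
        self.is_word = False  # True if a valid word may end at this node


def add_word(root, word):
    node = root
    for letter in word:
        if letter not in node.children:
            node.children[letter] = Node()
        node = node.children[letter]
    node.is_word = True  # mark the last letter as a valid ending


def is_valid(root, word):
    node = root
    for letter in word:
        node = node.children.get(letter)
        if node is None:
            return False  # the path does not exist in the tree
    return node.is_word   # the path exists, but may it end here?


root = Node()
for word in ["play", "played", "playing", "plays"]:
    add_word(root, word)

print(is_valid(root, "play"))   # the 'y' node is marked as an ending
print(is_valid(root, "playe"))  # the path exists, but 'e' is not an ending
```

The flag is what separates 'play' from 'pla' or 'playe': all three are paths in the tree, but only the first ends on a node marked as terminating.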
This is a simple example of course. The 'p' node would have many branches
under it: one for each vowel and one for any other letter that occurs in the
language following 'p'.
I'll post separately on Morphological Parsing.
Regards,
Tom Wulf
-----Original Message-----
From: keth@... [mailto:keth@...]
Sent: Tuesday, August 07, 2001 6:47 AM
To: norse_course@yahoogroups.com
Subject: Re: [norse_course] Database Project (was Re: Making this List more Useful
Hello Steven!
Your method of learning Old Norse is obviously to create
software that mimics the morphology. Well, that certainly seems to be
as valid a way of learning as any other. It also indicates a certain
paradigm shift: "Understanding something means that you can write
software that mimics it."
>That should give you some idea where I'm going. One place that looks like it
>will be some work to figure out is the adverbs.
But the adverbs aren't declined, are they?
So they would be the easiest words to deal with.
>I believe they can be broken
>down in a way similar to the nouns. I'm simply going through Gordon's
>*Accidence* chapter and trying to map things out until I have a place for all
>the words. I'm sure things will come to mind as I go along. I would very
>much like to be able to link to and from a dictionary, but that is way down
>the road. I would also like to add some descriptive text similar to that
>found in Gordon. I can't take too much directly from his book lest I commit
>plagiarism.
I have compared many Old Icelandic grammar books.
Gordon's is among the shorter ones. It is very
well done, and gives a very good overview, but
compared to the others it is a bit on the short side.
Merely from the observation of its brevity, I think
one can draw certain conclusions about its adequacy,
namely that you will discover sooner or later that
it is not complete. (You will find there are a lot
of grammar problems you meet in practical work that
it fails to answer.)
The most complete one I have seen is the one by Adolf Noreen
(a Swedish linguist), but it is of course much more difficult
than Gordon's book. (btw the "grammar" in Gordon is only
a small section of his book, which is primarily an ON reader,
but with a rather well written reference section for the grammar)
I think it is definitely a book that is very well suited for
course work, where the instructor gives weekly (or daily?)
assignments, and gives hints about how to use the reference
section for solving the assigned problems. (for example
weekly hand in problems would be a good way to run such a course)
But it is not meant as a book for self study. (unless you
are already well versed in grammar studies from other contexts)
Here is a simpler project:
Create an Old Norse spelling checker.
Funny thing:
I tried to look at the data file that is used by one of my
English spell checkers. BUT: I did *not* find the words
I had entered in the file. Apparently the spell checkers use
a kind of algorithmic/numeric/binary tree-model for storing
the data about what words are "valid" words in the language.
Does anyone know what the algorithm is?
Best regards
Keth
Sumir hafa kvæði...
...aðrir spakmæli.
- Keth
Homepage:
http://www.hi.is/~haukurth/norse/