I've been reading the postings about this project with great enthusiasm.
Lack of dictionary resources (in English) has been really bugging me.
Here's one case where using Linux worked in my favour. When I tried
to read these TIFFs, my browser told me it couldn't display them, and gave
the name of the tool commonly used for displaying this format. I downloaded
and installed the tool (xv), which was open source, and can now read them.
The text looks a little 'fuzzy', but quite readable.
I then tried to access them from Windows 95, using Internet Explorer. No error
message, and no text ... just a graphic that looked like a strip of movie
film. I _think_ it was supposed to represent a 'broken' film, meaning that
IE was unable to understand the format and mistakenly attributed this to
file corruption. However, this is only a guess; if I hadn't been able to
view the pages from Linux, I would have thought the site was really messed
up, and that the links simply pointed to a silly graphic.
By the way, with a somewhat different index page (giving start and end words,
not just page numbers), these scanned pages would be quite useful to me _as is_,
without the extra effort (and potential errors) of transcribing into HTML.
(In fact, they are useful even indexed by page number, since I can make
my own index.)
And given the potential for transcription errors, I think it would be a good
idea for the eventual HTML version of these pages to include links to the
corresponding scanned versions, so users can check for themselves whether
something that seems "odd" is an OCR/transcription problem or is present in
the printed dictionary (where it could, of course, likewise have been
a misprint).
If there's going to be a mass project to transcribe and/or correct OCR
output for these pages, I'd be happy to volunteer to assist, though I may
not have as much time for the project as I'd prefer.
--
Arlie
(Arlie Stephens
arlie@...)