Re: OCR - 100% not even close
Date: Mon, 17 Jan 2000 17:11:57 -0800
From: Wright Huntley <huntley1 at home_com>
Subject: Re: January BNL cool article - why not on the web page?
Klotz Theodore wrote:
> Glad you enjoyed the article. Actually there has been some discussion on
> the board about reprinting some things in the BNL and on the web site. It
> has only been discussion at this point. Perhaps some of it could be published
> in JAKA also. There are many wonderful articles in JAKA over the years.
> Understand that a project such as this would entail someone having all the
> issues and also compiling a complete list of all that has been
> published. Each article under consideration would then have to be
> transcribed in some way.
> A rather large task. Any volunteers?
> Flames on! Boy am I in trouble for bringing this out. HeHe
- The task is fairly simple, but quite tedious at the first step. These days,
- printed matter can be run through OCR from a decent flatbed scanner with
- virtually 100% accuracy. The mechanics of scanning would be a boring pain,
- though, for doing all of JAKA.
I work with document processing every day. I have yet to see an OCR program that would even come close to 75-80% accuracy. The only way current OCR programs even approach this is with a freshly typed or printed page in a standard fixed-width font. I am sure the old articles being discussed here for publication are in anything BUT good reproducible quality. I'd bet you'd be hard pressed to even get a good photocopy, let alone a copy good enough for an accurate OCR job.
Unless of course those documents are of pristine quality, which is doubtful.
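For anyone who wants to put a number on it rather than argue percentages, here is a rough sketch of one way to score an OCR pass. It assumes you hand-type a short sample of the page as ground truth and compare it with the OCR output by character edit distance. The function names and the sample strings are made up for illustration; they do not come from any particular OCR package.

```python
def edit_distance(truth: str, ocr: str) -> int:
    """Levenshtein distance: minimum single-character insertions,
    deletions, or substitutions needed to turn `truth` into `ocr`."""
    prev = list(range(len(ocr) + 1))
    for i, ct in enumerate(truth, start=1):
        curr = [i]
        for j, co in enumerate(ocr, start=1):
            cost = 0 if ct == co else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def char_accuracy(truth: str, ocr: str) -> float:
    """Fraction of ground-truth characters that came through correctly,
    floored at 0.0 when the OCR text is worse than nothing."""
    if not truth:
        return 1.0 if not ocr else 0.0
    return max(0.0, 1.0 - edit_distance(truth, ocr) / len(truth))


if __name__ == "__main__":
    # Hypothetical sample: the OCR misread a lowercase "l" as a capital "I".
    typed_by_hand = "killifish"
    from_scanner = "kiIlifish"
    print(f"{char_accuracy(typed_by_hand, from_scanner):.1%}")
```

Run it on a paragraph or two from a typical old issue before committing to the whole run. Keep in mind that a per-character score flatters the result: one wrong character in every hundred still means several errors on every page that someone has to proofread.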
Just my $.02 US $.00015 CAN <BWG>