You might think there’s not much more Optical Character Recognition (OCR) software can do. As long as it recognises documents accurately and reasonably fast, where’s the scope for improvement? In fact, Nuance claims several notable improvements for OmniPage Professional 16, probably the best-known OCR application on the market.
First, says Nuance, the new version is between 16 and 27 per cent more accurate than before, while at the same time being up to 46 per cent faster. On top of this, it should be able to compensate for lens distortions in pictures of pages taken with a camera, automatically black out words in sensitive documents and handle electronic and paper forms. It can produce documents in Office 2007’s XPS format and includes copies of both PaperPort 11 (Nuance’s document management application) and PDF Converter 4 which, as you might guess, converts documents to PDF format.
The program is also claimed to make a better job of producing accurate representations of pages, without putting everything in separate text and graphics frames. This has long been a gripe, as it’s one thing to have the page look right, but another to easily edit the text within that layout. Most OCR programs struggle with the ‘easy edit in layout’ part.
Once you’ve installed and activated OmniPage Professional 16, you have to set up a scanner to work with it. The Scanner Setup Wizard should run automatically, though in our case it didn’t. The Wizard downloaded the latest scanner database from Nuance, which didn’t include our HP OfficeJet 7210, a current and popular All-in-One. We had to run the program’s diagnostics to get it recognised, which involved scanning text, greyscale and colour documents – about five minutes’ work.
The main processing screen offers four task tabs at the top with three panes below: one for thumbnails, one for a graphic image of the page and one for the OCRed text. At the bottom is a full-width pane of document statistics, most of which OmniPage works out for itself.
The tabs are for workflow, load or scan type, page layout and export. Despite what Nuance seems to think, they are not that intuitive to use. As if in admission of this, a series of How-to-Guides runs through many of the tasks which should be obvious, but aren’t. Unexpectedly, the 1-2-3 workflow, designed to handle the most common OCR tasks automatically, is set by default to load images from file – is that really where most customers want to get their input documents? You have to alter this behaviour before the program will look to a scanner instead.
Having scanned a document and recognised its characters, the program then proofread it and claimed 100 per cent accuracy, even though there were two instances of the same typo in the text. The fact that reading ‘tor’ as ‘for’ still gives a legitimate word doesn’t make it accurate.
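To illustrate how a proofreader can report a perfect score on flawed text, here is a minimal Python sketch. It makes no claim about how OmniPage works internally: the function names, the sample sentences and the tiny dictionary are all our own assumptions. It simply contrasts a dictionary check, which only asks whether each recognised word exists, with a word-by-word comparison against the original.

```python
def dictionary_score(recognised, dictionary):
    """Fraction of recognised words that appear in the dictionary."""
    return sum(word in dictionary for word in recognised) / len(recognised)


def ground_truth_score(recognised, original):
    """Fraction of word positions where the recognised text matches the original."""
    matches = sum(r == o for r, o in zip(recognised, original))
    return matches / len(original)


# Hypothetical sample: the original says 'tor', the OCR output says 'for'.
original = "we climbed the tor at dawn and the tor was cold".split()
recognised = "we climbed the for at dawn and the for was cold".split()

# A toy dictionary (an assumption) that knows both words.
dictionary = set(original) | {"for"}

print(f"Dictionary check: {dictionary_score(recognised, dictionary):.0%}")
print(f"Against original: {ground_truth_score(recognised, original):.0%}")
```

With these samples the dictionary check finds nothing wrong, because ‘for’ is a valid word, while the comparison against the original catches both misread words.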
OmniPage did complete the recognition in just over two seconds, which is quick, and even a more complex page with graphics and boxed text took under 10 seconds. This page needed more preparation before we could get an editable document with a reasonable likeness of the original. We had to outline the areas of the page which we wanted treated as text, rather than leaving OmniPage on automatic.
Even here, there are noticeable discrepancies. Some are understandable, like the misreading of coloured text from the original, while others, such as differences in font and text style, are less acceptable. Some of the text has been put in boxes on the Word 2003 page we created, while the rest is made into the body text. Furthermore, there is a variety of indents and line spacings, even though all the text has the same left-hand margin in the original.
It’s easy enough to save OCRed documents in any of the supported file types, including Word 2007’s docx, Adobe’s PDF, WordPerfect X3, and WAV for audio reproduction. The text-to-speech conversion is particularly good and, although US-accented, sounds comparatively natural and expressive.
If you don’t need PaperPort or PDF Converter and can do without some of the more corporate features of OmniPage Professional 16 like OCR of forms, blacking out (aka redacting) words and the Batch Processing Manager, then the standard OmniPage 16 costs around £60 – a big saving over the Professional version.
The improvements flagged up for OmniPage Professional 16 would all be useful, but from our tests, there’s still some way to go before the software delivers on them. For batch processing of long, standard text documents there’s little doubt the software can save a lot of time, but for more complex pages where there’s significant graphics content, it can still struggle to get close to what you scanned.