
IMPACT Final Conference – Case study: Scanning Parameters

24 October, 2011

In his session, Apostolos Antonacopoulos (University of Salford) presented and analysed the effects of scanning parameters on OCR quality, as well as the storage and maintenance costs they imply for content holders. Several experiments were carried out to establish how scanning choices affect OCR quality: colour vs greyscale vs bitonal, the effect of resolution, and a comparison with images from the National Library of New Zealand (NLNZ).
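To give a feel for the storage side of this trade-off, here is a rough sketch of how uncompressed image size scales with colour depth and resolution. It is not taken from the presentation: the page size and resolution in the usage line are hypothetical, the bit depths are the usual ones for each mode (24-bit colour, 8-bit greyscale, 1-bit bitonal), and real files will be smaller once compression is applied.

```python
# Usual bit depths per pixel for each scanning mode (an assumption, not from the talk).
BITS_PER_PIXEL = {"colour": 24, "greyscale": 8, "bitonal": 1}


def uncompressed_bytes(width_in, height_in, dpi, mode):
    """Estimate the raw size in bytes of a scan, ignoring compression and headers."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    total_bits = width_px * height_px * BITS_PER_PIXEL[mode]
    return (total_bits + 7) // 8  # round up to whole bytes


if __name__ == "__main__":
    # Hypothetical 10 x 10 inch page region scanned at 300 dpi.
    for mode in BITS_PER_PIXEL:
        size_mb = uncompressed_bytes(10, 10, 300, mode) / 1_000_000
        print(f"{mode}: {size_mb:.3f} MB")
```

Under these assumptions, colour takes three times the raw space of greyscale and 24 times that of bitonal, and doubling the resolution quadruples the size in every mode.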

The images selected for the project were taken from the British Library newspaper collection and varied in quality. To ensure optimal results, only text regions were selected, so that additional artefacts (e.g. warping) were ignored. The IMPACT tool Aletheia was used to extract and key the reference text, and ABBYY FineReader Engine 9 was used for the OCR process.

Overall, word accuracy improvements were more apparent when using colour, bitonal, and 4-bit and 8-bit scanners, while dithered scanners produced the lowest results, with 1.64% word accuracy.
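The post does not say exactly how word accuracy was computed. As an illustration only, one common definition is one minus the word-level edit distance divided by the number of words in the keyed reference text. The sketch below assumes that definition; the function names and sample strings are hypothetical, and it is not the IMPACT evaluation tooling.

```python
def word_edit_distance(ref, hyp):
    """Levenshtein distance between two lists of words."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from OCR output
                            curr[j - 1] + 1,      # extra word in OCR output
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(reference_text, ocr_text):
    """Return word accuracy in [0, 1] under the assumed definition."""
    ref = reference_text.split()
    hyp = ocr_text.split()
    if not ref:
        raise ValueError("reference text must contain at least one word")
    errors = word_edit_distance(ref, hyp)
    return max(0.0, 1.0 - errors / len(ref))


if __name__ == "__main__":
    reference = "the quick brown fox"
    ocr = "the qu1ck brown fox"
    print(f"{word_accuracy(reference, ocr):.2%}")
```

With this definition, a single misrecognised word out of four gives 75% word accuracy, so a figure as low as 1.64% would mean almost no words were recognised correctly.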

In conclusion, Mr Antonacopoulos stressed the importance of investing in high quality images as they leave room for improvement and can be reused without the need to re-scan. However, different decisions should be taken for different document types.

View presentation here:

and the video here:
