Lineart “Spicks and Specks” remover for scanned text


Removing the spots from a lineart scan is tedious, but necessary when trying to create an identical copy of previously printed text.

The usual way of minimising the clean-up of rogue dots is to scan the original as a 1200 dpi greyscale image, use combinations of Levels and Curves to remove the highlights and emphasise the shadows, then convert the greyscale to lineart using the 50% Threshold. Nevertheless, there are sometimes stubborn dots that won’t go away with this process. Also, scanning hundreds of pages at 1200 dpi in greyscale (so the images can be converted 1:1 to lineart using the 50% Threshold) requires lots of memory and hard drive space, so for the brief described below the pages were scanned directly as lineart images.
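To put rough numbers on the storage point, and to show what a 50% Threshold conversion amounts to, here is a minimal Python sketch. The function names and the 5 × 8 inch page size are assumptions made purely for illustration (the actual trim size of the book is not stated), the figures are for uncompressed pixel data only, and the handling of the exact midpoint value is a guess rather than a statement about Photoshop’s behaviour.

```python
import math


def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Raw pixel data for one scanned page, before any file compression."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    # Each row of pixels is padded up to a whole number of bytes.
    row_bytes = math.ceil(width_px * bits_per_pixel / 8)
    return row_bytes * height_px


def threshold_50(grey_rows):
    """Turn 8-bit greyscale rows (0 = black, 255 = white) into lineart.

    Returns rows of 1 (black) and 0 (white). Treating 128 as white is an
    assumption of this sketch.
    """
    return [[1 if value < 128 else 0 for value in row] for row in grey_rows]


# Hypothetical 5 x 8 inch page scanned at 1200 dpi.
greyscale_size = uncompressed_bytes(5, 8, 1200, 8)  # 8 bits per pixel
lineart_size = uncompressed_bytes(5, 8, 1200, 1)    # 1 bit per pixel
print(greyscale_size, lineart_size, greyscale_size / lineart_size)
```

For that hypothetical page, the greyscale scan holds 57,600,000 bytes of pixel data against 7,200,000 bytes as lineart, an eightfold difference per page before compression.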

One particular brief was to recreate a novel exactly as it had been printed previously, but as the book was last printed 15 years ago, the native files were no longer available. While the cover could be re-set, the black and white text had to be scanned in rather than run through Optical Character Recognition (OCR) and formatted into a new InDesign file, which would have taken too long and required many proofs. The client gave permission to cut the cover from the book so that the pages could be fed through the scanner’s Automatic Document Feeder (ADF).

A script that would remove rogue dots had already been produced, but sadly it did not work with the latest versions of Photoshop. An answer turned up in a post to the Adobe Forums, where Evgeny Trefilov (17th post in) presented a filter he had made. Initially it too did not work with the latest version of Photoshop, but a 64-bit plug-in was later created to work with CS6 and above. Evgeny’s plug-in does require the images to be greyscale.

[Image: plug-in settings dialog]

Above: the user interface of Evgeny’s plug-in. Set the Threshold, Max Value, Block Size and C value as above, then fine-tune the filter by adjusting the two red sliders until dots no longer disappear from above the letter “i” and letters no longer fill in.

For this brief, one sample file (out of the many that were scanned) was used to test Evgeny’s filter. Once the settings were refined so that only the rogue dots were removed and larger dots were preserved (e.g. the dot on the letter “i” or a full stop), an action was made in Photoshop to:

  • Convert from lineart to greyscale;
  • Run Evgeny’s plug-in;
  • Convert from greyscale back to lineart;
  • Save and close.
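To give a feel for what “remove the rogue dots but keep the dot on the i” involves, here is a minimal Python sketch of one generic approach: find each connected blob of black pixels and erase the ones below a size limit. This is not Evgeny’s filter, whose internals are not described here, and it is no replacement for the Photoshop action. The names `remove_specks` and `min_area` are hypothetical, and the page is represented as plain rows of 0 (white) and 1 (black) to keep the example self-contained.

```python
from collections import deque


def remove_specks(bitmap, min_area):
    """Return a copy of `bitmap` (rows of 0/1, 1 = black ink) in which every
    8-connected black blob smaller than `min_area` pixels is turned white."""
    height = len(bitmap)
    width = len(bitmap[0]) if height else 0
    cleaned = [row[:] for row in bitmap]
    seen = [[False] * width for _ in range(height)]

    for y in range(height):
        for x in range(width):
            if bitmap[y][x] != 1 or seen[y][x]:
                continue
            # Flood-fill one blob, collecting the pixels that belong to it.
            blob = []
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                blob.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and bitmap[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            # Blobs below the size limit are treated as rogue dots.
            if len(blob) < min_area:
                for by, bx in blob:
                    cleaned[by][bx] = 0
    return cleaned


# A lone one-pixel speck (top left) next to a 2 x 2 "full stop".
page = [
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
cleaned_page = remove_specks(page, min_area=3)
```

The size limit plays a similar role to the fine-tuning described above: set it too high and the dots on the “i”s and the full stops vanish along with the specks. A size-only filter like this also cannot stop letters filling in, which is one more reason to treat it as an illustration only.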

This plug-in (download here) and action worked well, and certainly saved dozens of hours of removing rogue dots by hand from lineart scans of text.

It should be stressed that this plug-in was appropriate for scanned lineart text. Be careful when using it to clean up illustrations, diagrams or photographs, as incorrect settings can have major consequences.

With the scanned pages corrected, one of the techniques from this article was used to place the pictures into a new InDesign file.
