Windows Free OCR

Did you know that XP has built-in OCR? And that it is actually good? I scan some books, but I also use my digital camera to capture a lot of books and archival research, which I can carry on my laptop as .jpgs. Using the Portable Apps version of Gimp, I can convert those to .tif files, which the Microsoft Document Imaging tool will open. Open the file, select the text, hit the OCR button, and voila! I don’t know how well it will work with typescript from the archives, but it seems to do very well on most modern books. Here is a screen shot of the OCR text in XP, linked back to a .png so you can see the results.
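
If there are a lot of camera shots, the .jpg to .tif step could probably be scripted rather than done one file at a time in Gimp. What follows is only a minimal sketch in Python, assuming the third-party Pillow library is installed. The function names tiff_targets and convert_all are made up for illustration, and the sketch has not been checked against the Document Imaging tool, which may want particular TIFF settings.

```python
from pathlib import Path


def tiff_targets(folder, out_folder=None):
    """Pair each .jpg/.jpeg file in `folder` with the .tif path it would become."""
    folder = Path(folder)
    out = Path(out_folder) if out_folder else folder
    pairs = []
    for src in sorted(folder.iterdir()):
        # Match the extension case-insensitively (cameras often write .JPG).
        if src.is_file() and src.suffix.lower() in {".jpg", ".jpeg"}:
            pairs.append((src, out / (src.stem + ".tif")))
    return pairs


def convert_all(folder, out_folder=None):
    """Write a TIFF copy of every JPEG in `folder`; return the new paths."""
    # Assumption: Pillow is installed. It is imported here so that
    # tiff_targets can be used (for a dry run) without it.
    from PIL import Image

    converted = []
    for src, dst in tiff_targets(folder, out_folder):
        dst.parent.mkdir(parents=True, exist_ok=True)
        with Image.open(src) as img:
            img.save(dst, format="TIFF")
        converted.append(dst)
    return converted
```

Calling tiff_targets on a folder first shows which files would be written without touching anything; convert_all then does the actual conversion, leaving the original .jpgs in place.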

The actual page is from Raymond Smith’s book, The Fighting Irish in the Congo. I did my PhD on the Irish with the UN in the Congo, from 1960 to 1964. Smith was out there as a journalist, and while his book is not great, I needed to read it for the thesis. I spent years hunting for it, and know of only three extant copies in Ireland. Smith is dead, the book is out of print and is never likely to be back in print, but now that I no longer really need it, I have it. If I wanted to, I could OCR the whole thing, and maybe I will, since it is, in a way, a primary source.

