Ammon Shepherd (University of Virginia) has shared code and instructions for batch watermarking and OCR-ing images. During archival research, Shepherd needed a way to extract text from images of book pages and to add watermarks indicating the source of each image. To do this, he wrote a script, available in both bash and Ruby versions, that uses ImageMagick to add the watermark and Tesseract to OCR the images.
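For readers curious what such a workflow might look like, here is a minimal Python sketch of the general idea. It is not Shepherd's script: the function names, the `processed` output folder, the `*.jpg` pattern, and the watermark placement and size are all assumptions chosen for illustration. It assumes ImageMagick's `convert` and the `tesseract` command-line tool are installed (on ImageMagick 7 the command may be `magick` instead).

```python
import subprocess
from pathlib import Path


def watermark_command(src, dst, text):
    """Build an ImageMagick command that writes `text` in the lower-right corner.

    Placement, size, and colour are illustrative assumptions.
    """
    return [
        "convert", str(src),
        "-gravity", "SouthEast",   # anchor the text to the bottom-right corner
        "-pointsize", "24",
        "-fill", "white",
        "-annotate", "+10+10", text,
        str(dst),
    ]


def ocr_command(src, out_base, lang="eng"):
    """Build a tesseract command; tesseract appends .txt to `out_base`."""
    return ["tesseract", str(src), str(out_base), "-l", lang]


def process_folder(folder, source_note, dry_run=True):
    """Collect watermark and OCR commands for every .jpg in `folder`.

    With dry_run=True the commands are only returned, not executed.
    """
    folder = Path(folder)
    out_dir = folder / "processed"  # hypothetical output location
    commands = []
    for image in sorted(folder.glob("*.jpg")):
        commands.append(watermark_command(image, out_dir / image.name, source_note))
        # OCR the original image so the watermark text does not end up in the output.
        commands.append(ocr_command(image, out_dir / image.stem))
    if not dry_run:
        out_dir.mkdir(exist_ok=True)
        for cmd in commands:
            subprocess.run(cmd, check=True)
    return commands
```

Calling `process_folder("scans", "Source: Example Archive")` returns the list of commands without running them, which makes it easy to check what would happen before passing `dry_run=False`.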
dh+lib Review
This post was produced through a cooperation between Katrien Deroo, Alix Keener, A. Miller, Pamela Mitchem, Meghan Sitar, and Patrick Williams (Editors-at-large for the week), Zach Coble (Editor for the week), Sarah Potvin (Site Editor), and Caro Pinto and Roxanne Shirazi (dh+lib Review Editors).