POST: Libraries and Archivists Are Scanning and Uploading Books That Are Secretly in the Public Domain

This Motherboard post by Karl Bode details the efforts of archivists, activists, and libraries to vastly expand the number of public domain books that are being digitized, with particular emphasis on books published between 1923 and 1964.

“As it currently stands, all books published in the U.S. before 1924 are in the public domain, meaning they’re publicly owned and can be freely used and copied. Books published in 1964 and after are still in copyright, and by law will be for 95 years from their publication date. But a copyright loophole means that up to 75 percent of books published between 1923 to 1964 are secretly in the public domain, meaning they are free to read and copy.”

The New York Public Library is leading the effort to identify appropriate titles, digitize them, and upload them to the Internet Archive. Using Python scripts to automate parts of the process, organizers and volunteers are striving to do this work at scale, including verifying that copyright was not renewed. Volunteers from Project Gutenberg and other organizations “are tasked with locating a copy of the book in question, scanning it, proofing it, then putting out HTML and plain-text editions.”
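The renewal check described above can be pictured with a minimal Python sketch. This is not the project's actual code: the `Book` class, the `copyright_status` function, and the set of renewed titles are hypothetical, and the year thresholds are taken only from the passage quoted above. Real renewal records are far messier than an exact title match, so treat this as an illustration of the logic rather than a working verification tool.

```python
from dataclasses import dataclass

# Thresholds as stated in the quoted passage (illustrative, not legal advice).
PUBLIC_DOMAIN_BEFORE = 1924     # published before this year: public domain
RENEWAL_REQUIRED_BEFORE = 1964  # published in or after this year: still in copyright


@dataclass(frozen=True)
class Book:
    """Hypothetical minimal record for a published book."""
    title: str
    year: int


def normalize(title: str) -> str:
    """Lowercase a title and collapse whitespace so simple matches line up."""
    return " ".join(title.lower().split())


def copyright_status(book: Book, renewed_titles: set[str]) -> str:
    """Classify a book using publication year and a set of renewed titles.

    `renewed_titles` is assumed to hold normalized titles drawn from
    renewal records; how those records are obtained is out of scope here.
    """
    if book.year < PUBLIC_DOMAIN_BEFORE:
        return "public domain"
    if book.year >= RENEWAL_REQUIRED_BEFORE:
        return "in copyright"
    if normalize(book.title) in renewed_titles:
        return "in copyright (renewed)"
    return "candidate: no renewal found"


# Placeholder titles, invented purely for the example.
renewed = {normalize("A Renewed Novel")}
for book in (
    Book("An Old Classic", 1910),
    Book("A Renewed Novel", 1940),
    Book("A Forgotten  Pamphlet", 1940),
    Book("A Recent Bestseller", 1970),
):
    print(book.title, "->", copyright_status(book, renewed))
```

A book flagged as a candidate would still need human verification before being scanned and uploaded, which is where the volunteers come in.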

DH library folks might want to keep an eye on these efforts in order to help faculty and students access a broader range of texts, including computationally ready plain-text files, for textual analysis and other DH work.

dh+lib Review

This post was produced through a cooperation between Conor Dugan, Tierney Gleason, Jill Krefft, Amy Mallory-Kani, Jennifer Matthews, Adam Mazel, and Kristen Totleben (Editors-at-large for the week), Pamella Lach (Editor for the week), and Caitlin Christian-Lamb, Nickoal Eichmann-Kalwara, Linsey Ford, and Ian Goodale (dh+lib Review Editors).