RESOURCE: Data Mining the Internet Archive Collection

A new lesson by Caleb McDaniel on The Programming Historian focuses on downloading and analyzing records from the Internet Archive.

From the description:

In this lesson, you’ll learn how to download files from such collections using a Python module specifically designed for the Internet Archive. You will also learn how to use another Python module designed for parsing MARC XML records, a widely used standard for formatting bibliographic metadata.
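To give a sense of what parsing a MARC XML record involves, here is a minimal sketch that uses only Python's standard library instead of the dedicated modules the lesson teaches. The sample record, its field values, and the helper name get_subfields are made up for illustration; only the MARC21 "slim" namespace and the 245/260 field tags come from the MARC XML standard itself.

```python
import xml.etree.ElementTree as ET

# Namespace used by MARC XML ("MARC21 slim") records.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# A tiny, invented record for illustration only (not real Internet Archive data).
SAMPLE_MARCXML = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 a 4500</leader>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">An example title</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="b">Example Publisher,</subfield>
    <subfield code="c">1850.</subfield>
  </datafield>
</record>
"""


def get_subfields(record, tag, code):
    """Return the text of every subfield `code` inside datafields with `tag`.

    Hypothetical helper: `record` is an ElementTree element for one
    MARC XML <record>.
    """
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    return [el.text for el in record.findall(path, MARC_NS) if el.text]


record = ET.fromstring(SAMPLE_MARCXML)

# Field 245 subfield a holds the title; field 260 subfield c the publication date.
titles = get_subfields(record, "245", "a")
dates = get_subfields(record, "260", "c")
print(titles, dates)
```

A dedicated MARC parsing module, like the one the lesson covers, takes care of this bookkeeping and handles many more edge cases; the sketch only shows the shape of the underlying data.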

Author: Roxanne Shirazi

Roxanne is the Dissertation Research Librarian at the Graduate Center, CUNY.
