RESOURCE: Data Mining the Internet Archive Collection

A new lesson by Caleb McDaniel on The Programming Historian focuses on downloading and analyzing records from the Internet Archive.

From the description:

In this lesson, you’ll learn how to download files from such collections using a Python module specifically designed for the Internet Archive. You will also learn how to use another Python module designed for parsing MARC XML records, a widely used standard for formatting bibliographic metadata.
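To give a sense of what parsing a MARC XML record involves, here is a minimal sketch. It is not the lesson's code: the lesson relies on dedicated Python modules, while this example uses only the standard library, skips the download step (which needs network access), and works on a made-up record. The sample data and the helper names `get_subfields` and `summarize` are assumptions for illustration; the field tags (245 for title, 260 subfield c for publication date) follow the MARC 21 bibliographic format.

```python
import xml.etree.ElementTree as ET

# Namespace used by MARC XML ("MARC21 slim") documents.
MARC_URI = "http://www.loc.gov/MARC21/slim"
MARC_NS = {"marc": MARC_URI}

# A made-up record for illustration only; real records carry many more fields.
SAMPLE_MARCXML = """
<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <datafield tag="245" ind1="1" ind2="0">
      <subfield code="a">Example letter :</subfield>
      <subfield code="b">a made-up record for illustration</subfield>
    </datafield>
    <datafield tag="260" ind1=" " ind2=" ">
      <subfield code="c">1850</subfield>
    </datafield>
  </record>
</collection>
"""


def get_subfields(record, tag, code):
    """Return the text of every subfield `code` inside datafields with `tag`."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    return [el.text.strip() for el in record.findall(path, MARC_NS) if el.text]


def summarize(marcxml_text):
    """Extract a title and publication date from each record in the text."""
    root = ET.fromstring(marcxml_text.strip())
    summaries = []
    # iter() also matches the root, so a bare <record> document works too.
    for record in root.iter(f"{{{MARC_URI}}}record"):
        title_parts = get_subfields(record, "245", "a") + get_subfields(record, "245", "b")
        dates = get_subfields(record, "260", "c")
        summaries.append({
            "title": " ".join(title_parts),
            "date": dates[0] if dates else None,
        })
    return summaries


if __name__ == "__main__":
    for item in summarize(SAMPLE_MARCXML):
        print(item["date"], "-", item["title"])
```

A purpose-built MARC module handles details this sketch ignores, such as control fields, repeated fields, and binary MARC, which is one reason the lesson teaches one.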

dh+lib Review

This post was produced through a collaboration among Jefferson Bailey, Jolie Braun, Heather Martin, Jolanda-Pieta van Arnhem, and Krista White (Editors-at-large for the week), Roxanne Shirazi (Editor for the week), Sarah Potvin (Site Editor), and Zach Coble and Caro Pinto (dh+lib Review Editors).