RESOURCE: Mining Large Datasets for the Humanities

Peter Leonard (Yale University) presented a paper, “Mining Large Datasets for the Humanities,” at the International Federation of Library Associations and Institutions (IFLA) World Library and Information Congress in Lyon, France. The abstract states:

This paper considers how libraries can support humanities scholars in working with large digitized collections of cultural material. Although disciplines such as corpus linguistics have already made extensive use of these collections, fields such as literature, history, and cultural studies stand at the threshold of new opportunity.

Libraries can play an important role in helping these scholars make sense of big cultural data. In part, this is because many humanities graduate programs neither consider data skills a prerequisite, nor train their students in data analysis methods. As the ‘laboratory for the humanities,’ libraries are uniquely suited to host new forms of collaborative exploration of big data by humanists. But in order to do this successfully, libraries must consider three challenges:

How to evolve technical infrastructure to support the analysis, not just the presentation, of digitized artifacts.
How to work with data that may fall under both copyright and licensing restrictions.
How to serve as trusted partners with disciplines that have evolved thoughtful critiques of quantitative and algorithmic methodologies.

dh+lib Review

This post was produced through a collaboration among Leigh Bonds, Nickoal Eichmann, Erica Hayes, Mike Hesson, Meredith Levin, Jennifer Millen, Martín Pozzi, Amy Wickner (Editors-at-large for the week), Zach Coble (Editor for the week), Sarah Potvin (Site Editor), and Caro Pinto and Roxanne Shirazi (dh+lib Review Editors).