RECOMMENDED: State of the Union—and Corpus Comparison

In advance of Tuesday’s State of the Union address, Benjamin Schmidt (Northeastern University) and Mitch Fraas (University of Pennsylvania) created a series of interactive graphics for The Atlantic that allow readers to explore the State of the Union addresses of every U.S. president: The Language of the State of the Union and Mapping the State of the Union.

Schmidt has written a blog post pointing to an additional tool designed to “compare and contrast language spoken by Presidents in the State of the Union” side-by-side. He explains why it could be an exciting example of online text analysis that shifts focus away from topic modeling and towards leveraging the rich metadata that libraries (and others) already have:

For the State of the Union, there are all sorts of useful comparisons to make: president vs. president, republican vs. Democrat, lame duck vs recently elected, opposition congress vs. friendly crowd… And for every other corpus, there are just as many. We currently treat these kinds of analytics as things that should be run client side, requiring individuals to obtain digital texts (frequently impossible) and install and run some tools for corpus comparison (a high barrier to entry.) But libraries and other content holders can–and I would argue, should–support these things as a form of exploration out of the box.

Just as libraries have provided search functions across and within collections, Schmidt envisions “real-time, fully customizable in-browser comparison across any facets of a corpus as a service libraries and other content providers can easily offer on medium-sized (c. 20,000 documents) corpora.”
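For readers curious what a comparison "across any facets of a corpus" might involve under the hood, here is a minimal sketch. It is not Schmidt's implementation, which the post does not describe. It assumes each document carries metadata fields alongside its text, and it ranks words by a smoothed log-ratio of their relative frequencies in two facet values. The function names, the field names ("party", "era"), and the tiny sample documents are all hypothetical placeholders, not real State of the Union data.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def facet_counts(documents, facet, value):
    """Count words across all documents whose metadata field `facet` equals `value`."""
    counts = Counter()
    for doc in documents:
        if doc[facet] == value:
            counts.update(tokenize(doc["text"]))
    return counts


def compare_facets(documents, facet, value_a, value_b, smoothing=1.0):
    """Rank words by the log-ratio of their smoothed relative frequencies.

    Positive scores mark words more typical of value_a, negative scores
    words more typical of value_b. Add-one style smoothing keeps words
    that are absent from one side from producing a division by zero.
    """
    a = facet_counts(documents, facet, value_a)
    b = facet_counts(documents, facet, value_b)
    vocab = set(a) | set(b)
    total_a = sum(a.values()) + smoothing * len(vocab)
    total_b = sum(b.values()) + smoothing * len(vocab)
    scores = {}
    for word in vocab:
        rate_a = (a[word] + smoothing) / total_a
        rate_b = (b[word] + smoothing) / total_b
        scores[word] = math.log(rate_a / rate_b)
    # Sort by descending score, breaking ties alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up placeholder documents; the metadata fields are illustrative only.
documents = [
    {"party": "A", "era": "early", "text": "commerce treaty commerce navy"},
    {"party": "B", "era": "late", "text": "jobs economy jobs families"},
    {"party": "A", "era": "late", "text": "economy treaty families"},
]

ranked = compare_facets(documents, "party", "A", "B")
for word, score in ranked:
    print(f"{word:10s} {score:+.2f}")
```

Because the facet is just a metadata field name, the same call works for any other field a collection already records, for example compare_facets(documents, "era", "early", "late"). A production service would likely use a more robust statistic and precomputed counts, but the reliance on existing metadata is the point Schmidt emphasizes.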

dh+lib Review

This post was produced through a cooperation between Nickoal Eichmann, Kevin Gunn, Alix Keener, Amy Rubens, Samuel Russell, Amy Wickner (Editors-at-large for the week), Roxanne Shirazi (Editor for the week), Sarah Potvin (Site Editor), and Zach Coble and Caro Pinto (dh+lib Review Editors).

One comment on “RECOMMENDED: State of the Union—and Corpus Comparison”

  1. Ben Schmidt Jan 22, 2015 4:18 pm

    Thanks so much for featuring this.

    Your readers may also be interested in a fourth site I haven’t explained as fully yet that allows contextual reading of the full texts of State of the Union addresses. Unlike the other ones, this is a place for traditional reading, with search, quantitative comparisons, and contextualizations as a supplement. That reverses the normal order of search engines, where you drill through context in order to read.
