RECOMMENDED: “The Literary”: Digital Humanities Quarterly (Issue 2013 7.1)

The latest issue of Digital Humanities Quarterly, edited by Lisa Swanstrom and Jessica Pressman, is devoted entirely to “The Literary,” and contains several articles of interest to the library and archives community. From the introduction to the issue:

As the essays in this issue demonstrate, the conjunction of the literary and the digital humanities produce a rich set of provocations: What kind of scholarly endeavors are possible when we think of the digital humanities as not just supplying the archives and data-sets for literary interpretation but also as promoting literary practices with an emphasis on aesthetics, on intertextuality, and writerly processes? What kind of scholarly practices and products might emerge from a decisively literary perspective and practice in the digital humanities? The essays in this issue engage with these questions and demonstrate ways of answering them.

A few of the articles most relevant to libraries and archives:

The .txtual Condition: Digital Humanities, Born-Digital Archives, and the Future Literary
Matthew Kirschenbaum, University of Maryland

Whence Feminism? Assessing Feminist Interventions in Digital Literary Archives
Jacqueline Wernimont, Scripps College

Digital Humanities, Copyright Law, and the Literary
Robin Wharton, Independent Scholar

RECOMMENDED: “A Map and Some Pins”: Open Data and Unlimited Horizons

Tim Sherratt (@wragge on Twitter) has published his keynote address to April’s Digisam Conference in blog form, in which he makes an inspired and passionate case for making cultural heritage data openly available. Sherratt reminds us that this data is infused with history and “resists our attempts at reduction,” while calling into question the notion that digital methods of exploring cultural heritage are extractive:

The glories of messiness challenge the extractive metaphors that often characterise our use of digital data. We’re not merely digging or mining or drilling for oil, because each journey into the data offers new possibilities — our horizons are opened, because our categories refuse to be closed. These are journeys of enrichment, interpretation and creation, not extraction.

We’re putting stuff back, not taking it out.

He goes on to say:

What this means for cultural institutions is that the sharing of open data is not just about letting people create new apps or interfaces. It’s about letting people create new meanings. We should be encouraging them to use our APIs and LOD to poke holes in our assumptions to let the power pour out.

RECOMMENDED: Data curation as publishing for digital humanists

Text and slides from a talk delivered by Trevor Muñoz, Assistant Dean for Digital Humanities Research at the University of Maryland Libraries, at the CIC Center for Library Initiatives conference. Muñoz presents an intriguing synthesis of a couple of growing trends in libraries – data curation and publishing. Data curation here is defined as “information work that integrates closely with the disciplinary work practices and needs of researchers in order to ‘maintain digital information that is produced in the course of research in a manner that preserves its meaning and usefulness as a potential input for further research.'” Muñoz argues that “data curation work would also be ‘publishing’ in the sense of ensuring quality and disseminating outputs to interested communities…By recognizing data curation work as a publishing activity, libraries would have a ‘market opportunity’ to address unmet needs in the digital humanities community.” More broadly,

Data curation as a “publishing” activity is increasingly relevant to the working lives of digital humanities scholars. Moreover, articulating connections between “publishing” and data curation is important in the context of strategic decision libraries might make and, in fact, are making about how to participate in “publishing.” Data curation as publishing is publishing work that draws directly on the unique skills of librarians and aligns directly with library missions and values in ways that other kinds of publishing endeavors may not.

RECOMMENDED: The Joy of Topic Modeling

Matt Burton, graduate student at the University of Michigan School of Information, provides an accessible introduction to topic modeling. Aimed at beginners (though useful for everyone), the article unpacks the meaning of the terms used in topic modeling, such as model, word, document, topic, tokenization and stemming. For example,

At the start of any text mining adventure, the natural sequences of words, the sentences and paragraphs of written documents are broken up via a process called tokenization. Individual words become unigrams or individually unique tokens. Tokens are not always equivalent to words because the tokenization process may count two or more words together as a single token, creating what are called bigrams or ngrams. For example, the words “digital humanities” could be a bigram or two individual unigrams, “digital” and “humanities.” Tokenization is more of an art than a science, it requires subjective decisions as well as domain understanding of the texts being processed.

Burton also describes the pros and cons of different types of generative topic models, and grounds his discussion in a topic model built from the text of 10 posts that were featured on Digital Humanities Now.
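For readers who want to see the unigram/bigram distinction in action, here is a minimal Python sketch. It is not taken from Burton’s article: the function names (`tokenize`, `bigrams`, `merge_phrases`), the sample sentence, and the simple lowercase-and-split rule are all assumptions made for illustration, and real projects usually rely on a dedicated text-processing library.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens (unigrams).
    Hypothetical rule: runs of letters/digits, with an optional apostrophe suffix."""
    return re.findall(r"[a-z0-9]+(?:'[a-z]+)?", text.lower())

def bigrams(tokens):
    """Pair each token with the token that follows it."""
    return list(zip(tokens, tokens[1:]))

def merge_phrases(tokens, phrases):
    """Count listed two-word phrases as single tokens.
    `phrases` is a hand-picked set of word pairs (an illustrative assumption)."""
    merged = []
    i = 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in phrases:
            merged.append("_".join(pair))
            i += 2  # skip both words of the merged phrase
        else:
            merged.append(tokens[i])
            i += 1
    return merged

sample = "Digital humanities scholars study digital archives."
unigrams = tokenize(sample)
print(unigrams)
# ['digital', 'humanities', 'scholars', 'study', 'digital', 'archives']
print(bigrams(unigrams)[:2])
# [('digital', 'humanities'), ('humanities', 'scholars')]
print(merge_phrases(unigrams, {("digital", "humanities")}))
# ['digital_humanities', 'scholars', 'study', 'digital', 'archives']
```

Deciding which pairs belong in `phrases` is exactly the kind of subjective, domain-informed choice the quoted passage describes: “digital humanities” is merged into one token here, while the later “digital archives” is left as two.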

RECOMMENDED: The Poetics of Non-Consumptive Reading

Building on the amicus brief filed by Matthew Jockers, et al. in Authors Guild vs. Google, Mark Sample (George Mason University) has written a provocative post urging digital humanists to think critically about what it means to frame non-consumptive use–text-mining, topic modeling, etc.–as “non-expressive.” As the brief’s abstract explains:

The brief argues that, just as copyright law has long recognized the distinction between protection for an author’s original expression (e.g., the narrative prose describing the plot) and the public’s right to access the facts and ideas contained within that expression (e.g., a list of characters or the places they visit), the law must also recognize the distinction between copying books for expressive purposes (e.g., reading) and nonexpressive purposes, such as extracting metadata and conducting macroanalyses.

Sample wants scholars to go further, arguing that the future of digital scholarship depends on the expressive use of such research:

Scholars and students of art, literature, history, and culture ought to transform more of our non-consumptive research into expressive objects. Nonexpressive use of texts is a dead-end for the humanities. A computer model surrounded by a wall of explanatory words is not enough. Make the computer model itself an expressive object. Turn your data into a story, into a game, into art. Call it aesthetic empiricism or empirical aesthetics. Call it whatever you want. But without a poetics of machine reading, there is nothing.

Incidentally, the Authors Guild has appealed the 2012 ruling in favor of non-consumptive use, and the authors of the original amicus brief have issued an appeal for support in drafting a new brief to be submitted to the Appeals Court.

RECOMMENDED: U.S. Open Data Policy

On May 9, 2013, the U.S. government issued Executive Order 13642, declaring that “the default state of new and modernized Government information resources shall be open and machine readable.” The announcement coincided with a memorandum outlining the creation of an open data policy that requires government agencies “to collect or create information in a way that supports downstream information processing and dissemination activities. This includes using machine-readable and open formats, data standards, and common core and extensible metadata for all new information creation and collection efforts.”

Taken with the Open Knowledge Foundation’s release of the CKAN data management system and OKFN’s announcement that CKAN will be used for data.gov, these moves set a strong example for other organizations to support open data initiatives. The release of government data will be a boon for librarians, digital humanists, and many others, and we look forward to seeing new projects that take advantage of this data.

RECOMMENDED: #dhpoco Open Thread

The Digital Humanities as a Historical “Refuge” From Race/Class/Gender/Sexuality/Disability?

Sparked by David Golumbia’s recap of the “Dark Side of the Digital” conference (#c21dsd) at the University of Wisconsin-Milwaukee, Postcolonial Digital Humanities posted an open thread on the issues of race, gender, class, sexuality, and disability in DH. The thread has generated over 150 comments, providing a spectrum of viewpoints and a lively discussion of how the “yack” side of DH influences the “hack” side and vice versa. One of the many resonant points comes from Alan Liu, who acknowledges the broader forces at play in the discussion while addressing how the field of DH can effect change:

What do I as a digital humanists want to teach my students? I want them to come out of university with the intellectual methods and technical skills needed to interoperate across the institutions and professions for which they are headed. But I want them also to have retained enough of a comparative sense of the differences in premises and identities vested in society’s institutions and professions that they can enter that fray as what we used to call “well-rounded” human beings. … The digital humanities can really be a sweet spot for teaching such differences–encoded, as it were, as low in the stack as how databases are appropriately used in different social contexts.

RECOMMENDED: DH Genealogies and the Academy (Weekend Round-Up)

“I just got done with a good twenty-four hours of arguing with people online about digital humanities.”

So begins Stephen Ramsay’s post, “DH Types One and Two,” written in response to a flurry of conversations that took place online over the weekend. In part a reaction to Daniel Allington’s “Managerial Humanities: or, Why the Digital Humanities Don’t Exist” piece that garnered so much attention last week, and perhaps influenced by the conversations taking place at the “Dark Side of the Digital” conference (#c21dsd) at the University of Wisconsin-Milwaukee, Ramsay and others engaged in several discussions relating to the origin of DH, its transformation over the years, and its relation to the future of the academy. (A few have noted that most of the discussion on Twitter took place among men, for what it’s worth.)

Much of the conflict over digital humanities, as the conversations addressed it, can be boiled down to these two tweets by Ramsay:

As often happens, the debate led back to varying definitions of DH, and it is in this context that Ramsay’s post emerges. He lays out a genealogy that labels the early Humanities Computing community as DH Type I, and describes a later formulation (DH Type II) in which DH “became a signifier both for a very broad constellation of scholarly endeavors, and for a certain revolutionary disposition that had overtaken the academy.”

This garnered an interesting response from Michael J. Kramer, playfully titled “Attack of the Alt-Acs,” in which he builds upon Ramsay’s classification by noting the emergence of the alt-ac community, which he names Type 1.5:

I wonder if the disconnects, the talking past, between type 1 and type 2 dh hinge on the historical emergence of type 1.5, which absorbed and cannibalized earlier practices of humanities computing, but also linked the digital to larger, very fraught and vexing struggles over intellectual labor and work under neoliberalism, corporatization, and privitization in the US and beyond.

Kramer’s post, in turn, led to Andrew Prescott’s “Small Worlds, Big Tents,” in which he responds to the discussion from an international perspective, noting that “The ‘alt-ac’ and tenure discussions are an illustration of the way in which local problems in the structure of higher education in the United States are somehow represented as an existential crisis for humanity.”

Prescott goes on to address the role of information professionals and non-academic DH practitioners:

In addition, there is the issue of the digital humanities developer – the person who wants to spend a career creating DH resources, not necessarily pursuing their own scholarly vision or analysing the digital reshaping of scholarship. The developer is a key part of DH, but no one has effectively worked out how good career paths of this sort can be provided in a DH department. In fact, we run a terrible risk in many DH units of imposing precisely the sort of academic/ professional apartheid that DH should be explicitly reacting against.

A lot can happen over the weekend. The conversation continues, and we encourage our readers to add to our round-up–and, perhaps, contribute their perspectives–in the comments section. For now, we’ll leave off with another quote from Prescott’s piece as food for thought:

I have moved between librarianship and academic positions throughout my career. I have generally found librarianship to be a more creative, intellectually stimulating, fast-changing and satisfying activity than conventional academic work.

RECOMMENDED: The Digital Humanities Contribution to Topic Modeling

Elijah Meeks and Scott Weingart are guest editors for the latest issue of the Journal of Digital Humanities, which is devoted entirely to topic modeling (Vol. 2, No. 1 Winter 2012). Topic modeling is a method of textual analysis that has gained popularity in the humanities in the last few years. It uses computer algorithms to find patterns in large corpora of texts, allowing researchers to examine the thematic structure of large sets of documents. As Meeks and Weingart point out in their article introducing the issue, use of topic modeling in the humanities has been increasing since about 2010, but the scholarship around it remains dispersed:

In this additional way topic modeling typifies digital humanities: the work is almost entirely represented in that gray literature. While there is a hefty bibliography for spatial analysis in humanities scholarship, for example, in order to follow research that deploys topic modeling for humanities inquiry you must read blogs and attend conference presentations and workshops. For those not already participating in the conversation, this dispersed discussion can be a circuitous and imposing barrier to entry. In addition to sprawling across blogs, tweets, and comment threads, contributions also span methods and disciplines, employ sophisticated visualizations, sometimes delve into statistics and code, and other times adopt the language of literary critique.

The article goes on to outline the structure of the issue’s contents and provides a useful introduction to concepts and tools along the way.

RECOMMENDED: OA in the USA

The White House, responding to a We the People petition from May 2012, announced Friday that federal agencies with more than $100 million in annual research and development expenditures will make federally funded research and data sets freely available to the public within 12 months of publication. The Fair Access to Science and Technology Research Act (FASTR), which contains many of the same provisions, is still moving through Congress, and its passage is important to make permanent the gains achieved through the White House memo. ACRL Insider has a good post with many helpful links. While many details are still unclear, and the status of FASTR is still up in the air, this is a major step forward for open access in the US. Huzzah!