TaDiRAH: Building Capacity for Integrated Access

In this post, Quinn Dombrowski (UC Berkeley) and Jody Perkins (Miami University in Ohio) introduce the digital humanities taxonomy project known as TaDiRAH, reviewing the motivating factors behind its inception and outlining future goals of the project. Both are members of the TaDiRAH Coordinating Committee.

TaDiRAH, the Taxonomy of Digital Research Activities in the Humanities, is the result of a year-long project undertaken by the DiRT (Digital Research Tools) Directory and DARIAH-DE (Digital Research Infrastructure for Arts and Humanities) to develop a shared taxonomy that can be used to organize the content of sites as diverse as the DARIAH Zotero bibliography ‘Doing Digital Humanities’, the DiRT directory, and the DHCommons project directory.

Motivations

TaDiRAH was developed in part as a response to the evolving needs of the DiRT directory, a longstanding, well-regarded source of information about available tools that support scholarship in the humanities. From its inception, DiRT has sought to engage a broad audience of tool users by limiting the use of jargon, and categorizing tools by the task(s) they perform, rather than using a more abstract taxonomy. A wiki format was originally chosen to ensure a low barrier to entry, providing a great deal of flexibility and allowing the site to develop quickly without a specific source of funding.

As the number of resources grew, the wiki platform became unwieldy. Consequently, DiRT was completely rebuilt in 2011 using Drupal, an open source content management system that provided more structure and enabled each tool to have a unique “profile” page. The platform supports browsing, sorting, and searching the entire directory across a variety of facets, including tool category, cost, license, and developer. As of May 2014, the DiRT directory consists of approximately 800 tool listings, and receives approximately 3,000 unique visitors and 16,000 to 20,000 monthly pageviews. It has received funding from the Mellon Foundation for a new phase of technical development that includes APIs to enable data exchange with DHCommons and Commons In A Box, a new feature for submitting tool reviews, and “recipes” that document how different tools can be combined to address research questions.


Early in 2013, members of the DiRT Steering Committee/Curatorial Board started looking at options for improving the site, which included an examination of the ways that the current taxonomy was being used by contributors. Following an analysis of the existing categories and free-form tags, we began a series of discussions with the DARIAH-DE team that created the Zotero bibliography (Christof Schöch, Matt Munson, Luise Borek). They had already begun work on a taxonomy of digital humanities activities. Recognizing our common goal, we formed a transatlantic collaboration around the task of developing a shared taxonomy. Based in Europe, DARIAH aims to enhance and support digitally-enabled research and teaching across the humanities and the arts. The DARIAH infrastructure will be a connected network of people, information, tools, and methodologies for investigating, exploring, and supporting work across the broad spectrum of the digital humanities. DARIAH-DE represents the German contribution to DARIAH.

How does it work?

Although the motivating factors behind the development of TaDiRAH are pragmatic, TaDiRAH and its antecedents also draw on theoretical and scholarly influences, including the concept of “scholarly primitives”[1. Unsworth, John. 2000. “Scholarly Primitives: What Methods Do Humanities Researchers Have in Common, and How Might Our Tools Reflect This?” London: King’s College London], DARIAH research into modeling the research process[2. See, for example: Benardou, Agiatis, Panos Constantopoulos, Costis Dallas, and Dimitris Gavrilis. “Understanding the Information Requirements of Arts and Humanities Scholarship.” International Journal of Digital Curation 5, no. 1 (June 22, 2010): 18–33. doi:10.2218/ijdc.v5i1.141; Ruth Reiche, Rainer Becker, Michael Bender, Matthew Munson, Stefan Schmunk, Christof Schöch: “Verfahren der Digital Humanities in den Geistes- und Kulturwissenschaften” DARIAH-DE Working Papers Nr.4. Göttingen: DARIAH-DE, 2014. http://webdoc.sub.gwdg.de/pub/mon/dariah-de/dwp-2014-4.pdf], and research on digital scholarly methods in the humanities.[3. See Borgman, Christine. Scholarship in the Digital Age: Information, Infrastructure, and the Internet. Cambridge: MIT Press, 2010; Gasteiner, Martin, and Peter Haber, eds. 2010. Digitale Arbeitstechniken für die Geistes- und Kulturwissenschaften. Vienna: UTB; and Siemens, Ray, John Unsworth, Susan Schreibman, eds. 2004. A Companion to Digital Humanities. Oxford: Blackwell] Unsworth’s “scholarly primitives” were developed with an eye towards practical applications: the “primitives” were functions of scholarship that could be embodied in tools, which could then be combined to achieve “higher order functions” (similar to DiRT’s “recipes”). Later work on articulating and organizing stages and aspects of research activity provides a more process-oriented approach to understanding scholarship. Both ways of breaking down scholarship into its constituent parts, and using those terms to categorize tools, can help a user understand how and when a given tool might apply to their research, and what other tools might complement it.


The taxonomy does not aim to be comprehensive, focusing instead on a subset of relatively broad categories that are widely used and generally understandable. It is expected to be most useful to projects seeking to collect, organize and provide access to information on digital humanities tools, methods, projects, or readings.

The current version of the taxonomy is based upon three primary sources:

  1. the arts-humanities.net taxonomy of DH projects, tools, centers, and other resources, especially as it has been expanded by digital.humanities@oxford in the UK and DRAPIer in Ireland;
  2. the categories and tags originally used by DiRT; and
  3. the DARIAH ‘Doing Digital Humanities’ Zotero bibliography of literature on all facets of DH.

These resources were studied and distilled into their essential parts, producing a simplified taxonomy of two levels: eight top-level goals that are broadly based on the steps of the scholarly research process, and a number of general methods under these goals that are typically used by scholars to achieve these research goals. Guided by the principle of separating research activities from research objects and the experience of managing earlier taxonomies, we created two additional open-ended lists for techniques and digital humanities research objects. Terms from either or both of these lists can be combined with any goal and/or method to further describe the activity. Two rounds of detailed, thoughtful feedback from the digital humanities community played a significant role in shaping the taxonomy, particularly the choice to treat techniques as a separate list, rather than forcing them awkwardly into a third level of the main taxonomy.
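To make the shape of the taxonomy concrete, here is a minimal Python sketch of one way a site might represent it: a closed two-level hierarchy of goals and methods, plus two open-ended sets for techniques and objects. This is an illustration under stated assumptions, not the project's own data model. The names `METHODS_BY_GOAL`, `ActivityTag`, and `is_valid` are hypothetical, the placement of methods under goals is assumed for the example, and the rule that a method must sit under the goal it is paired with is one possible reading of how terms combine.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

# Illustrative sample only. "Editing", "Designing" and "Organizing" are
# categories named in this post; the goals they sit under here are assumed.
METHODS_BY_GOAL = {
    "Enrichment": {"Editing"},
    "Creation": {"Designing"},
    "Storage": {"Organizing"},
}


@dataclass(frozen=True)
class ActivityTag:
    """One classification: a goal and/or a method, refined by open-list terms."""
    goal: Optional[str] = None
    method: Optional[str] = None
    techniques: FrozenSet[str] = frozenset()  # open-ended list, any term allowed
    objects: FrozenSet[str] = frozenset()     # open-ended list, any term allowed


def is_valid(tag: ActivityTag) -> bool:
    """Check only the closed part of the taxonomy (goals and methods)."""
    if tag.goal is None and tag.method is None:
        # Assumption: techniques and objects refine a goal or method,
        # so a tag with neither is rejected.
        return False
    if tag.goal is not None and tag.goal not in METHODS_BY_GOAL:
        return False
    if tag.method is not None:
        if tag.goal is not None:
            # Assumption: a method must belong to the goal it is paired with.
            return tag.method in METHODS_BY_GOAL[tag.goal]
        return any(tag.method in methods for methods in METHODS_BY_GOAL.values())
    return True
```

Under this sketch, a tag such as `ActivityTag(goal="Enrichment", method="Editing", objects=frozenset({"Images"}))` passes, while pairing "Editing" with "Storage" does not. Keeping techniques and objects as free sets mirrors the decision to leave those lists open rather than fixing a third level.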

Acknowledging the impossibility of creating categories that would always be mutually exclusive, we aimed to create groupings that were distinct enough from one another to produce a level of consistency in application that would support interoperability and enhance discovery. We separated compound categories used by DARIAH (e.g. dissemination and storage), collapsed many of DiRT’s more granular categories (e.g. image editing and textual editing both became editing plus an object), and added categories from both that were not easily mapped in either direction (e.g. designing and organizing). Decisions about what would be considered a “method” and what would be treated as a “technique” were sometimes contentious. If more than one activity could be used to achieve the same ends, then those activities were usually classed as techniques. Having an open list of techniques and objects will make it easier for TaDiRAH to keep up with a fast-changing field, as we anticipate those lists evolving far more quickly than goals or methods.
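The collapsing of granular categories described above can be pictured as a simple lookup from a legacy category to a method plus an object. The following Python sketch is hypothetical: the table `LEGACY_TO_TERMS`, the function `remap`, and the object labels "Images" and "Text" are assumptions made for illustration, and only the two editing rows are taken from the example in the text.

```python
# Hypothetical crosswalk: legacy directory category -> (method, object).
# Only the two "editing" rows reflect the example given in the post.
LEGACY_TO_TERMS = {
    "image editing": ("Editing", "Images"),
    "textual editing": ("Editing", "Text"),
}


def remap(legacy_categories):
    """Split legacy categories into mapped (method, object) pairs and leftovers."""
    mapped, unmapped = [], []
    for category in legacy_categories:
        key = category.strip().lower()  # tolerate case and stray whitespace
        if key in LEGACY_TO_TERMS:
            mapped.append(LEGACY_TO_TERMS[key])
        else:
            unmapped.append(category)  # left for a human curator to resolve
    return mapped, unmapped
```

Returning the unmapped categories separately, rather than guessing, reflects the fact that some categories did not map easily in either direction and needed editorial judgment.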

This project represents one of many data streams moving toward a networked integration of related hubs in the DH resource ecosystem. It will help to address the de-contextualization that is an unavoidable consequence of the move away from comprehensive sites that are difficult to sustain. TaDiRAH allows topically-restricted sites like DiRT (tools) and DHCommons (projects and collaborators) to focus on curating one particular kind of content, while still providing a way to identify and connect related information.

Future Steps

This summer, DiRT will undertake a comprehensive review of each tool entry. Terms from the TaDiRAH taxonomy will be added as part of this process. DHCommons staff will, similarly, add TaDiRAH terms to project profiles based on existing free-form metadata. Information from DiRT and DHCommons will be exposed using RDF, making this content available as linked open data, as well as through the APIs that are currently under development as part of the Mellon-funded integration initiative.
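As a rough idea of what a tagged directory entry could look like as linked data, the Python sketch below emits N-Triples lines that attach taxonomy terms to a tool. It is a guess at one plausible shape, not the format DiRT or DHCommons will publish. The function `tool_triples` and every `example.org` URI are placeholders, and the use of the Dublin Core `subject` property is an assumption.

```python
# Every URI below is a placeholder under example.org, except the Dublin Core
# "subject" property, which is a real and widely used predicate.
DCTERMS_SUBJECT = "http://purl.org/dc/terms/subject"
TOOL_BASE = "http://example.org/tool/"     # hypothetical base URI
TERM_BASE = "http://example.org/tadirah/"  # hypothetical base URI


def tool_triples(tool_slug, term_slugs):
    """Return one N-Triples line per taxonomy term attached to a tool."""
    subject = f"<{TOOL_BASE}{tool_slug}>"
    return [
        f"{subject} <{DCTERMS_SUBJECT}> <{TERM_BASE}{term}> ."
        for term in sorted(term_slugs)  # sorted for stable, diffable output
    ]
```

Calling `tool_triples("some-tool", {"editing", "images"})` yields two lines, one per term, each a complete subject, predicate, object statement that any RDF parser can read.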

Applying TaDiRAH to actual directories will provide an opportunity to assess the degree to which it can accommodate real-world data. We anticipate revising TaDiRAH periodically in response to issues that arise during this process, as well as feedback from those who have used it in other ways (e.g. Micah Vandegrift, Scholarly Communications Librarian at Florida State University, made reference to TaDiRAH as a resource for introducing digital humanities to undergraduates, by using it as a guide to the roles within digital humanities projects).

DARIAH-EU has also committed to using this taxonomy as a basis for their development of a more complex ontology of digital scholarly methods, and we are also engaged in ongoing dialog with other ontology initiatives, including NeDiMAH’s work around scholarly methods. NeDiMAH (Network for Digital Methods in the Arts and Humanities), funded by the ESF (European Science Foundation), is a network of scholars involved in various aspects of the Digital Humanities across Europe, including understanding and classifying digital research practices. Our goal is to share at least high-level categories with NeDiMAH’s ontology, so that objects (projects, tools, articles, etc.) classified using our taxonomy can be automatically “mapped” to some level of the NeDiMAH ontology, and vice versa.
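Sharing only high-level categories suggests a crosswalk in which a method is first rolled up to its parent goal and the goal is then matched to a category in the other scheme. The Python sketch below shows that idea in both directions. All identifiers are hypothetical: the `ext:` codes are invented stand-ins rather than NeDiMAH terms, and the method-to-goal placements are assumed for the example.

```python
# Hypothetical identifiers on both sides; neither project's real term IDs appear.
GOAL_OF_METHOD = {"Editing": "Enrichment", "Organizing": "Storage"}
GOAL_TO_EXTERNAL = {"Enrichment": "ext:enrich", "Storage": "ext:store"}
EXTERNAL_TO_GOAL = {ext: goal for goal, ext in GOAL_TO_EXTERNAL.items()}


def to_external(term):
    """Map a goal, or a method via its parent goal, to the external category."""
    goal = GOAL_OF_METHOD.get(term, term)  # methods roll up; goals pass through
    return GOAL_TO_EXTERNAL.get(goal)      # None when no shared category exists


def from_external(external_id):
    """Map an external category back to the shared high-level goal."""
    return EXTERNAL_TO_GOAL.get(external_id)
```

The mapping is lossy by design: an item tagged with a method maps to the external category of its goal, and the reverse direction can only recover the goal, which matches the aim of agreeing on "some level" of the other ontology rather than on every term.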

TaDiRAH (which is pronounced “ta-DEE-rah”, and is almost an anagram of “DARIAH” and “DiRT”) lives on GitHub at http://github.com/dhtaxonomy/TaDiRAH. We encourage readers to use TaDiRAH and submit feedback via the issue tracker on GitHub. We currently have only a human-readable version available, but we’ll be publishing machine-readable versions (linked data, and a Drupal taxonomy feature module to make it easier for others to implement TaDiRAH on Drupal-based sites) in the near future.

This work is licensed under a Creative Commons Attribution 4.0 International License.

 


 

RESOURCE: National Digital Stewardship Alliance Glossary

Checksum? Bagger? Ingest?

The National Digital Stewardship Alliance has released a glossary of digital stewardship terms.

NDSA members have been working on a “Levels of Digital Preservation” activity to provide basic digital preservation guidance on how an organization should prioritize its resource allocation. This glossary provides a common language for NDSA members to communicate about the levels work and should also be useful as a general digital stewardship glossary.