Developing Research Tools via Voices from the Field


Whose voices are missing from the digital humanities (DH) and libraries discussions?

The users. Both DH and librarianship are inherently connected with users, yet user voices, especially those arising from empirical studies, are often missing from planning, developing, and implementing initiatives related to digital scholarship.

Humanists’ data management across the research lifecycle is a recent area of scholarly exploration that illustrates this problem. As Posner (2015) points out, digitization of humanists’ research sources has resulted in a situation where humanists “desperately need some help managing their stuff, and libraries are in a great position to help them.”[1. Posner, M. (2015). Humanities Data: A Necessary Contradiction. Available at; accessed June 26, 2016.] For this much-needed collaboration to succeed, librarians and DH specialists need a nuanced understanding of humanists’ research practices, that is, the insights arising from empirical studies of user behavior and needs.

In this essay we present selected findings from the Digital Scholarly Workflow project, funded by the Andrew W. Mellon Foundation and conducted at the Pennsylvania State University from 2012 to 2016. One of the core project activities was an ethnographic study of the scholarly workflow among the Penn State faculty, aimed at bringing user voices to the center of research and software development.

In the first project phase (2012-2014), we explored the scholarly workflow of the Penn State faculty across disciplines, including the sciences, social sciences, and humanities. This comparative perspective enabled us to identify the specificities of humanists’ workflow, as well as the distinctive features of a software architecture that supports humanities users’ workflows. Through a web survey (N=196) and in-depth interviews (N=23), we studied how scholars engage with digital research tools and resources in different phases of their research process (see Figure 1), as well as their attitudes and needs concerning digital research tools.


In the second project phase (2014-2016), we conducted a set of contextual inquiry sessions, observing humanists’ research process and use of digital research tools in situ (N=14). We then partnered with the Zotero citation management software development team from George Mason University to develop software architecture that supports humanists’ practices, thus enhancing the Zotero software based on our research findings.

In the following, we first summarize the results of our empirical user studies, and we then turn to describing how these studies and user voices informed the development of the Zotero software architecture.


The results of our user studies showed that digital tools and resources have different roles and levels of integration at various phases of the scholarly workflow. The perceived and/or actual influence of these tools thus differed across individual segments of the workflow, as well as across academic disciplines. In this essay we focus on the results related to three research activities that were of particular relevance for the second phase of our project: finding, citing, and archiving research data and materials. For a complete account of the study results across all phases of the workflow, see Antonijević and Stern Cahoy, 2014, and Antonijević, 2015.[2. Antonijević, S. and Cahoy, E. (2014). “Personal Library Curation: An Ethnographic Study of Scholars’ Information Practices.” Portal: Libraries and the Academy, Vol. 14, No. 2. Antonijević, S. (2015). Amongst Digital Humanists: An ethnographic study of digital knowledge production. London, New York: Palgrave Macmillan.]

Finding research data and materials

Finding and accessing research data and materials electronically is a daily practice of our study participants, regardless of their disciplinary background and/or level of technical proficiency. Among the humanities scholars, electronic access to research materials represented one of the key transformations in research practice. An associate professor of French explained, for instance, that for her “online bibliographies have been the major, major, major tool that has completely changed the possibilities for research projects.”

Across disciplines, the path towards finding information online commonly starts with Google Search and Google Scholar, especially for scholars engaged in discovery search. Library databases are more typical access points for humanities scholars engaged in confirmation search, that is, a known item search. Among the humanists, library databases are also a primary point for accessing journal publications, whereas monographs are predominantly accessed and managed in the print form.

The importance of paper-based materials in humanists’ research practices comes to the forefront when we consider primary research materials. Namely, for a significant number of our humanities respondents, the main sites for finding and accessing primary research materials are physical archives and their print holdings. This has important consequences for humanists’ research workflow, including the activities of citing and archiving, as we explain further in the text.

Citing research data and materials

Despite the apparent benefits of automating the citation process, our study revealed a relatively low use of citation managers across disciplines. Humanities scholars reported dissatisfaction with the existing citation managers as a common reason for bypassing those tools, explaining that even though they took classes or tested citation managers such as EndNote, Zotero, or Mendeley, “all of those had issues” that made it easier for them to continue manually managing citations.

Another important reason for low uptake of citation managers stems from the fact that those tools are tailored towards scholarly publications, and not towards archival materials such as letters, maps, photos or diaries that humanists often use in their work. For instance, a professor of history related that none of the citation managers enabled him to store, annotate, and cite archival materials, which made those tools unproductive in his work.  “Engineers think the tools should be efficient, while they should be sufficient,” he explained, adding that his current workflow and use of technology were sufficient for his research needs.

Yet most of the humanists believed that, with some improvements, citation managers would be beneficial for their work. A step towards such an improvement lies in the example of Tropy, a digital tool currently under development by the Zotero team and the Roy Rosenzweig Center for History and New Media, aimed at facilitating researchers’ management of archival materials.[3. Roy Rosenzweig Center for History and New Media (2015). RRCHNM to build software to help researchers organize digital photographs. Available at:; accessed June 30, 2016.]

Archiving research data and materials

The majority of scholars consulted in this study reported actively storing, archiving, and backing up research materials important to them, but actual preservation practices varied along disciplinary lines. Storing information refers to the act of saving a document or other material for later use. For instance, scholars in the humanities and social sciences frequently stored Word documents, while their colleagues in the sciences stored data files. Archiving information, while similar to storing information, is a more finite activity, where the scholar saves a final, stable version of the document for use in the future. An example of archiving in practice is a scholar saving a published version of an article in their institutional repository. Backing up research materials is the practice of creating another copy of important research materials, so that if a primary version is lost, the data remains. All of these activities are similar in that they are inextricably linked with information preservation, and in that respect shed light on a scholar’s commitment to saving information for later use.

The results also showed that the majority of humanists considered archiving a vital element of their workflow, yet they still suffered loss of research materials. Some of them did not archive or back up research materials at all, commonly citing a lack of skill, habit, or both: “It’s insane. I know it is [dangerous]. I don’t have any external storage devices because I don’t know how to use them, and I’m just absolutely ignorant about those [cloud-based services],” said an assistant professor of French and Linguistics.

Yet such a complete absence of preservation activities was rare among our participants. More frequently, scholars’ archiving practices were hampered by a set of challenges corresponding to those identified in Marshall’s (2007) study.[4. Marshall, C.C. (2007). “How People Manage Personal Information over a Lifetime.” In: Jones and Teevan (eds.), Personal Information Management. Seattle, Washington: University of Washington Press: pp. 57-75. Available at; accessed November 18, 2014.] For instance, a significant number of our interviewees reported having inaccessible files, most commonly as a result of not migrating to new data formats, which suggested a need for promoting personal archiving literacy among scholars (see: Zastrow, 2014), as well as for developing self-archiving strategies integrated into scholars’ research workflows.[5. Zastrow, J. (2014). “PIM 101: Personal Information Management.” Computers & Libraries, Vol. 34, No. 2. Available at–PIM-101–Personal-Information-Management.shtml; accessed July 18, 2015.]

Our findings also showed that electronic publications were easier to store, while archival materials, which the humanists consulted in our study usually stored as photo files, required more effort and were more difficult to archive, search, and analytically manipulate. Yet our findings indicated that even with electronic publications humanists had breakdowns between the stages of discovery and organization/storage, due to the persistence of print-based organizational and archiving practices. These findings suggested that both discovery and self-archiving strategies must be integrated into other stages of the humanists’ scholarly workflow. This integration was one of the main tasks in the second phase of our project, which focused on enhancing the Zotero citation manager.

Enhancing Zotero

From 2014 to 2016, our research team partnered with George Mason University to develop enhancements for Zotero software that link to Hydra-based institutional repository services, and that allow discovery of new articles to happen within the Zotero interface. Further software refinement focused on integrating additional scholarly activities with the existing workflow of citation management software.

The new Zotero/Hydra enhancement allows scholars to archive their materials throughout the research process, embedding self-archiving into the workflow. We recently tested new enhancements to the Zotero interface, including native feed support to enable easy discovery, management, and import of relevant scholarly publications from within Zotero. The Zotero enhancements are currently in public beta testing and will be formally available with the release of Zotero 5.0 later in 2016. We conducted preliminary user testing of these enhancements to assess their utility. Users responded very positively to the enhancements, particularly to the ability to archive self-authored works on the Zotero website. Users were also very enthusiastic about the possibility of finding new research articles from within the Zotero interface, although there was a high learning barrier to optimal use of RSS feeds for discovery.

The research lifecycle remains at the center of all digital humanities efforts. Large projects come to life through the efforts of individual digital humanists. Yet, if the scholarly workflow of a digital humanist is not optimized, work is stymied, information is lost, and efforts are not brought to full fruition. Enabling faculty to easily and readily find, store, cite, and manage digital resources is mission critical for libraries and IT professionals. Our study illustrated one of the “ways in which libraries could make meaningful interventions in the humanities research lifecycle” (Posner, 2015), showing that through user-focused and DH-centered research, the voices of the humanities scholars can come center stage and direct need-based software development with impact. These Zotero optimizations came directly from user research findings, which showed that discovery and self-archiving, along with annotation and organization, remain problematic for humanities users to manage. Optimizing the digital humanities requires a focus on the digital humanist as an individual; once the individual’s work is facilitated, the pathway is set for greater and broader contributions. For libraries and IT, this is an essential point of convergence: centering on the user’s workflow as a roadmap for developing services and technologies that facilitate all phases of digital humanities research.

This work is licensed under a Creative Commons Attribution 4.0 International License.

About the authors

Smiljana Antonijević Ubois, PhD, explores the intersection of communication, culture, and technology through research and teaching in the U.S. and Europe. She is currently engaged as a research anthropologist at Penn State University. Smiljana's recent publications include Amongst Digital Humanists: An Ethnographic Study of Digital Knowledge Production (Palgrave Macmillan, 2015), “Personal Library Curation” (The Johns Hopkins University Press, 2014), and “Working in Virtual Knowledge” (MIT Press, 2013). Her latest research projects are Digital Scholarly Workflow, Penn State University; Alfalab: eHumanities Tools and Resources, Royal Netherlands Academy of Arts and Sciences (KNAW); and Humanities Information Practices, a collaboration of the KNAW, Oxford Internet Institute, and University College London. For more information see

Ellysa Stern Cahoy is an Education Librarian and an Assistant Director of the Pennsylvania Center for the Book in the Penn State University Libraries, University Park. A former children’s librarian and school library media specialist, Ms. Cahoy has published research and presented on information literacy, evidence-based librarianship, library instruction, and personal archiving. In 2014, she was awarded a $440,000 grant from the Andrew W. Mellon Foundation to fund the further exploration of faculty’s personal scholarly archiving practices and needs (building upon the work of a 2012 grant). In 2013, Ellysa received the Miriam Dudley Instruction Librarian Award from the Association of College & Research Libraries (ACRL) Instruction Section. She is currently Chair of the ACRL Instruction Section.
