Given the proliferation of the word “digital” among library communities, it would be wise to discuss what the term represents. From an engineering perspective, digital technology, as its name implies, transfers information as digits. These digits may take the form of the ones and zeros that make up binary code, numbers that represent colors on our screens, or flashes of light in fiber optic cables. Regardless of the medium, the use of digits and the math that comes with them means that communications are precise. Before digital technology, we accepted that a transfer yielded only an approximation of its source material: sources were copied, as with the hiss of audio on analog cassette tapes. Now, with the digital, we can assume that assets moved from one place to another in digital form are exactly like their originals: sources are duplicated. At the same time, digital duplicates may become differentiated from the original and from each other through changes in format or resolution, making them substantially different from the original source, or, via manipulation software, they may include dramatic alterations often impossible with physical materials.
Mechanical reproduction has, of course, existed for centuries, long before digital duplication. Even after so much time contending with the analog form, much effort still goes into managing copies. Consider a print book, the creation of which often requires a painstaking process of curating individual, print-quality images and the permissions that come with them. Sometimes, even after a publisher has gone to great lengths to clear the rights to an image, it can’t be used at all because of copyright terms set by far-reaching legislation such as the Sonny Bono Copyright Term Extension Act (CTEA), which extends copyright to 70 years beyond the author’s death. Due in part to the CTEA, publishers now have rigid protocols for dealing with media for their print books. Likewise, libraries have adapted with protocols of their own for archiving materials and making them available to their academic audiences, as well as for training faculty, students, and staff to use and reuse content in ethical ways. These adaptations have taken time, often decades, to introduce new practices into existing workflows. Decades-long timeframes may be acceptable within print contexts, where even copying takes a measurable amount of time. With the onset of the digital, though, it is essential to grasp new protocols quickly, particularly those that address the digital’s duplicative properties, to keep pace with its rapid and exponential growth.
Turning to digital technology, we find that media in digital formats can be copied via analog means, such as printing with an inkjet printer, but they can also be duplicated exactly across digital devices, instantaneously, with the click of a button. Consider JPEG images on a few separate computers, all created from the same source photograph. They might have been processed with tools such as Photoshop to change their sizes, resolutions, or colors, or to turn them into something different altogether. If we are only now adapting workflows to capture the legal implications of producing and storing copies that have noticeable differences, where are our tools to grapple with the many situations in which media find themselves online? At the same time, these JPEGs might have no discernible difference at all, unchanged by human or machine before being duplicated across vast digital distances. What does it mean to have the same exact object spread across various devices, screens, and contexts? World Wide Web originator Tim Berners-Lee warned that the web was filling up with digital material void of context and meaning. Now this problem is compounded by having thousands of duplicates, sourced from untouched originals or inexact copies, each with little or no provenance, propagating on web servers across the Internet.
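The distinction drawn above between exact duplicates and differentiated versions can be made concrete with a cryptographic hash. What follows is only a minimal sketch in Python, not a description of any particular library system: the file names and byte strings are hypothetical stand-ins for JPEG files, and the helper names `fingerprint` and `group_duplicates` are invented for illustration. Files whose bytes are identical share a SHA-256 digest, while a file altered in any way does not.

```python
import hashlib
from typing import Dict, List


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(data).hexdigest()


def group_duplicates(files: Dict[str, bytes]) -> Dict[str, List[str]]:
    """Group file names by digest; names sharing a digest are exact duplicates."""
    groups: Dict[str, List[str]] = {}
    for name, data in files.items():
        groups.setdefault(fingerprint(data), []).append(name)
    return groups


# Hypothetical stand-ins for the bytes of image files (assumption, not real JPEGs).
original = b"\xff\xd8\xff\xe0 example image bytes"
files = {
    "archive/photo.jpg": original,            # the source file
    "laptop/photo-copy.jpg": bytes(original),  # an untouched duplicate
    "blog/photo-resized.jpg": original[:-5],   # an altered version
}

# Sort for a stable, readable result.
duplicate_sets = sorted(sorted(names) for names in group_duplicates(files).values())
for names in duplicate_sets:
    print(names)
```

A matching digest shows only that two files are byte-for-byte identical. It says nothing about which file came first, who made it, or how an altered version relates to its source, which is the gap that provenance metadata is meant to fill.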
In 2008, Catherine Marshall, then of Microsoft Research, published a paper in D-Lib Magazine consolidating a number of studies on so-called “personal digital archives” that define how Internet-connected people interact with their digital files kept on personal and cloud computers. A primary concern for Marshall is the common practice of keeping one’s files “distributed among different stores for a variety of reasons” (emphasis hers), such as for creating backups, sharing with friends, or, importantly, “to use online files locally.” The last point is particularly interesting to Marshall due to its opacity to the user; most people are not aware, or do not care, that duplicates are being created on their local computers, in My Documents folders and elsewhere, each time they interact with online material. Compound these duplicative actions with intentional edits, such as a user resizing an image in Photoshop, and a single digital file may spawn a “dozen versions of a photo she liked, each subtly different from the last.” Whether creating a media-rich document in MS-Word, writing a post with images in WordPress, or, in cases common to librarians, creating assets of different resolutions in Photoshop or establishing a media node in CONTENTdm, chances are that the way to accomplish the task is to download a digital file to one’s local computer, then later upload the local file into a new place online.
Particularly when cultural institutions, such as libraries, are striving to enrich digital material with metadata and provenance records, new tools should seek out existing objects from trusted archives and repositories rather than fall back on more troublesome practices, such as creating yet another version of the same resource. In the event that new assets must be created, these same tools can ease the insertion of provenance and versioning metadata through techniques common in contemporary user interface design. One such tool is Scalar, a digital publishing platform that discourages uploading media directly into Scalar “books” while offering technical linkages directly to partner archives. Scalar’s database is based on the Resource Description Framework (RDF), the standard data model of Berners-Lee’s Semantic Web, which allows Scalar and other “Semantic Web systems” to read and apply metadata to assets across the Internet, including concepts such as versioning history and annotations. Unfortunately, even though software plugins for working with RDF and related technologies are readily available, the Semantic Web is not the first thing that comes to mind when people think of library tools.
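To suggest what provenance and versioning metadata might look like in RDF, here is a small, hedged sketch in Python that writes three statements in the N-Triples syntax. It does not reflect Scalar’s actual data model. The two example URLs are hypothetical, the helper functions are invented for illustration, and the literal escaping is deliberately simplified. The property names come from two published vocabularies, W3C PROV (`wasDerivedFrom`) and Dublin Core Terms (`isVersionOf`, `description`).

```python
# Namespaces of two published vocabularies.
PROV = "http://www.w3.org/ns/prov#"
DCTERMS = "http://purl.org/dc/terms/"


def iri(value: str) -> str:
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value: str) -> str:
    """Quote a plain string literal (simplified: escapes backslashes and quotes only)."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(triples) -> str:
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Hypothetical identifiers for an archived source and a web-sized derivative.
source = "https://archive.example.org/items/photo-001"
derivative = "https://library.example.edu/media/photo-001-web"

triples = [
    (iri(derivative), iri(PROV + "wasDerivedFrom"), iri(source)),
    (iri(derivative), iri(DCTERMS + "isVersionOf"), iri(source)),
    (iri(derivative), iri(DCTERMS + "description"),
     literal("Resized to 1200 px wide for web display")),
]

document = to_ntriples(triples)
print(document)
```

Because each statement names the derivative and points back to its source by identifier, a system that reads RDF could follow the link to the archived original rather than treat the resized file as an object without a history.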
The digital brings with it not just new mediums but also important new considerations about source material, metadata, and distribution. Therefore, as we attach the term “digital” to our various practices, consider the revealing new descriptions that are produced:
- “Digital scholarship” assumes that sources are abundant, so the focus can be on connecting interpretations and insights with both online and offline materials.
- “Digital preservation” contends with new protocols for maintaining cultural assets that might have a single source or many sources, and many versions. It also establishes provenance, determines values of sources, copies, and duplicates, and migrates materials when necessary to ensure longevity and access.
Put another way, how do meanings shift when the term “digital” is conjoined with other words like “scholarship,” “libraries,” and “preservation”? First, consider the nature of the “digital humanities” (DH). Work in DH takes advantage of digitized artifacts and interacts with them in ways that are often impossible or ill-advised with their physical originals. Digital archival, analytical, and annotation tools allow scholars to add content directly to digital files, make duplicates for long-term storage, create versions at different resolutions, extract specific information, and combine it with other digital assets without altering the progenitors. Much of this work would be unmanageable or inefficient without the affordances of digital technology. What is more, DH also entails studying and critiquing born-digital materials that are not copies of anything physical but may themselves be remixes of existing data or duplicated assets. The product of each of these research processes, then, may itself be digital, a physical manifestation of scholarship, or both. But unlike “traditional” scholarship, which generally reads similarly whether published digitally or in print, digital scholarship creates and conveys through rich, layered, linked, and interactive engagements that are only possible in the digital realm.
Digital scholarship encompasses both products and the processes used to create them, as well as methods of preservation, curation, and consideration of access and intellectual property rights that are complicated by actions such as copying, duplicating, and remixing.
The impact of the digital on librarianship is more significant than a semantic analysis may imply. As more and more scholarship, inside and outside the humanities, may reasonably have the term “digital” prepended to it, the digital touches every aspect of librarianship: from collection development, curation, exhibition, and preservation to user services, reference, and instruction. Digital librarianship, therefore, is librarianship that concerns itself with enabling and empowering faculty, students, and staff to discover, engage with, create, and preserve quality content whose properties extend beyond mechanical reproduction into areas that include duplication, manipulation, and remix.