RESOURCE: Sifting the Digital Heap: A Scoping Study of AI for Government Archives

Lise Jaillant, Matthew Kidd, and Lingjia Zhao (all Loughborough University) have shared their report, “Sifting the Digital Heap: A scoping study of AI for government archives – access, backlogs, and responsible practice” via Zenodo. The report details the GLOW scoping study, part of the wider LUSTRE project, which aims to “connect policy makers with Computer Scientists, Digital Humanists and professionals in the GLAM sector (Galleries, Libraries, Archives and Museums).” From the report abstract:

AI can play a decisive role in making digital government records more accessible and manageable, provided that its use is grounded in responsibility and clear purpose. Work is already underway across archives and government, where AI is being used to manage scale, improve accuracy, and enhance public access to digital records – including email, PDFs, spreadsheets, images, scanned documents, audiovisual assets, and social media posts. Building on these foundations, the GLOW study identifies four interlinked priorities for responsible and effective adoption.

First, AI should be applied where it can deliver clear, measurable benefits – particularly in appraisal and selection, sensitivity review, and metadata enrichment for both textual and audiovisual materials. Interviewees repeatedly stressed that the sheer volume of born-digital records now exceeds human capacity, making manual processing unrealistic. As John Sheridan (The National Archives UK) described, effective archival AI workflows will resemble a series of sieves: simple tools handling early filtering (e.g., filetype detection, basic entity extraction), followed by increasingly sophisticated machine learning and language model techniques that surface higher value material for expert review. Used in this layered way, automation can strengthen efficiency and consistency without displacing professional judgement. For example, AI can flag personal or confidential information, identify clusters of potentially significant correspondence, or reveal hidden risks within complex files – reducing the time specialists spend on mechanical triage and enabling them to focus on interpretive and high-stakes decision making.
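
As an illustration only (this sketch is not taken from the report), the "series of sieves" idea can be pictured as a chain of increasingly selective filters. Everything below is an assumption made for the example: the `Record` type, the file extensions, the email regex standing in for "basic entity extraction", and the keyword count standing in for a machine learning or language model scorer.

```python
import re
from dataclasses import dataclass, field
from pathlib import PurePath


@dataclass
class Record:
    """Hypothetical stand-in for a born-digital record."""
    name: str
    text: str
    flags: list[str] = field(default_factory=list)


# Assumed set of text-bearing file types for the first, cheapest sieve.
TEXT_TYPES = {".txt", ".eml", ".pdf", ".docx"}
# Crude email pattern used as a placeholder for basic entity extraction.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def filetype_sieve(records):
    """Sieve 1: keep only records whose extension is in TEXT_TYPES."""
    return [r for r in records if PurePath(r.name).suffix.lower() in TEXT_TYPES]


def entity_sieve(records):
    """Sieve 2: flag (but do not drop) records that contain an email address."""
    for r in records:
        if EMAIL_RE.search(r.text):
            r.flags.append("possible-personal-data")
    return records


def score_sieve(records, keywords=("minister", "decision"), threshold=1):
    """Sieve 3: keyword counting as a placeholder for an ML or LLM scorer."""
    kept = []
    for r in records:
        hits = sum(r.text.lower().count(k) for k in keywords)
        if hits >= threshold:
            kept.append(r)
    return kept


def run_sieves(records):
    """Apply the sieves in order; what survives goes to expert review."""
    for sieve in (filetype_sieve, entity_sieve, score_sieve):
        records = sieve(records)
    return records


if __name__ == "__main__":
    sample = [
        Record("a.exe", "minister"),
        Record("b.eml", "Contact jane@example.org about the minister's decision"),
        Record("c.txt", "lunch menu"),
    ]
    for r in run_sieves(sample):
        print(r.name, r.flags)
```

In this toy run only `b.eml` reaches the end of the chain, carrying a flag for a human reviewer. The point of the layering is that cheap checks shrink the heap before costlier models run, and the final decision still rests with a specialist.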

Second, ChatGPT and other generative AI tools have modified users’ expectations. Users increasingly expect instant, conversational access to archival information; AI‑generated summaries rather than raw documents; cross‑collection synthesis; intelligent handling of poor metadata; and personalised research support – expectations that exceed the capabilities of many archival systems. To adapt to these changing expectations, archival institutions are experimenting with new techniques and protocols such as MCP (Model Context Protocol) and RAG (Retrieval-Augmented Generation). Used together, MCP and RAG align GenAI discovery with archival values of provenance, authenticity, accountability, and user trust.

Third, progress towards responsible AI depends on implementing a clear and accountable framework. Automation must operate within systems that guarantee transparency, traceability, and security, supported by training and governance that ensure ethical use. Many institutions are already developing policies and guidance, but a unified framework would help align practice across departments and institutions, and safeguard the integrity of public records. A clear and accountable framework for responsible AI should answer questions of purpose, transparency, human oversight, risk/security, and accountability/auditability.

Fourth, the study calls for a coordinated national strategy that connects these efforts. The National Archives (UK), working with ministerial departments and other administrations, is well placed to lead this work in partnership with the wider GLAM and academic sectors. International collaboration with bodies such as the National Archives and Records Administration (NARA) in the United States and the European Archives Group could extend these principles globally. Through this joined-up approach, which would integrate technology, policy, and human expertise, AI can strengthen public trust and ensure that the digital record remains secure and accessible.

dh+lib Review

This post was produced through a cooperation between Mimosa Shah, Claire Burns, Carrie Pirmann, Melissa Horak-Hern, and Taylor Faires (Editors-at-Large), Caitlin Christian-Lamb and Rachel Starry (Editors for the week), Claudia Berger, Ruth Carpenter, Nickoal Eichmann-Kalwara, Linsey Ford, Pamella Lach, Molly McGuire, Hillary Richardson, and Christine Salek (dh+lib Review Editors), and Tom Lee (Technical Editor).
