WADL 2022 Homepage

    Web Archiving and Digital Libraries -- a Virtual Workshop in conjunction with JCDL 2022

    Date: June 24, 2022

    We welcome broad attendance; please contact the co-chairs with any questions you may have.

    Please see the approved WADL 2022 workshop description from the JCDL proceedings as well as the workshop page hosted by the conference.

    Please also refer to past WADL homepages: 2020, 2019, 2018, 2017, and 2016. Past workshop proceedings can be found at: WADL 2017-19, Pre 2016.
    Prior workshops have led in part to a special issue of the International Journal on Digital Libraries.


    Registration

    Since WADL is hosted by JCDL, at least one author per paper must register, at minimum for the workshop, via the JCDL registration page.
    JCDL 2022 is a hybrid event: you can attend the conference on-site or remotely, or register for the workshop only (see Satellite Events).

    Schedule (using EDT)

    == Opening Session (Moderator Martin Klein) ==
    9:00 Welcome, Introductions, Tech Ironing (everyone speaks!)
    
    == Talks 1 (Moderator Martin Klein)  ==
    9:30 Invited Talk 1: Karolina Holub (see below for details)
    10:10 Discussion
    
    == Talks 2 (Moderator Martin Klein)  ==
    10:30 Where are the Datasets? A case study on the German Academic Web
    Archive
    Yousef Younes, Sebastian Tiesler, Robert Jäschke and Brigitte Mathiak
    
    10:45 Comparison of Access Patterns of Robots and Humans in Web
    Archives
    Himarsha Jayanetti, Kritika Garg, Sawood Alam, Michael Nelson and
    Michele Weigle
    
    10:50 Wayback Machine Video Archiving Insights
    Sawood Alam, Bill O'Connor and Mark Graham
    
    10:55 Optimizing Archival Replay by Eliminating Unnecessary Traffic to
    Web Archives
    Kritika Garg, Himarsha Jayanetti, Sawood Alam, Michele Weigle and
    Michael Nelson
    
    11:00 Discussion
    
    == Talks 3 (Moderator Zhiwu Xie) ==
    11:20 Emulation-based long-term Access to Complex Web-sites
    Marcel Tschöpe, Rafael Gieschke and Klaus Rechert
    
    11:35 Web Archiving as Entertainment
    Travis Reid, Michael Nelson and Michele Weigle
    
    11:40 First steps in Identifying Academic Migration using Memento and
    Quasi-Canonicalization
    Mat Kelly, Deanna Zarrillo, Christopher Jackson and Erjia Yan
    
    11:45 Discussion
    
    12:05 (Lunch) Break
    
    == Talks 4 (Moderator Mat Kelly) ==
    13:00 Invited Talk 2: Carrie Pirmann and Erica Peaslee (see below for
    details)
    13:40 Discussion
    
    == Talks 5 (Moderator Mat Kelly) ==
    14:00 CDX Summary for Web Archival Collection Insights
    Sawood Alam and Mark Graham
    
    14:15 Russia-Ukraine News on the Dark Web
    Grant Atkins, Aaron Buehne, Abby Mabe, Zak Zebrowski and Justin
    Brunelle
    
    14:20 Archiving Source Code in Scholarly Content: One in Five Articles
    References GitHub
    Emily Escamilla, Talya Cooper, Vicky Rampin, Martin Klein, Michele
    Weigle and Michael Nelson
    
    14:25 Discussion
    
    == Talks 6 (Moderator Ed Fox)  ==
    14:45 15m Arch-It
    Helge Holzmann, Nick Ruest, Jefferson Bailey, Alex Dempsey, Samantha
    Fritz, Ian Milligan and Kody Willis
    
    15:00 WACZ
    Ed Summers, Ilya Kreymer and Cade Diehm
    
    15:05 Moving the End of Term Web Archive to the Cloud to Encourage
    Research Use and Reuse
    Mark Phillips and Sawood Alam
    
    15:20 Discussion
    
    == Closing Session (Moderator Ed Fox) ==
    15:40 Closing Discussion (publication and other collaboration
    opportunities, next event planning)
    16:30 end
    

    Invited Talks

    1. Karolina Holub, Library Adviser, Croatian Digital Library Development Centre, Croatian Institute for Librarianship
    Title: A history of web archiving at the National and University Library in Zagreb
    Abstract: The National and University Library in Zagreb (NSK), as a memory institution responsible for collecting all types of resources, recognized early the significance of collecting and preserving web resources as part of its core activities. In 2004, the NSK developed, in collaboration with the University of Zagreb University Computing Centre (Srce), the Croatian Web Archive (HAW). The NSK uses three different approaches and tools to archive the Croatian web. At the beginning, only selective archiving of web resources was conducted. In order to build a more comprehensive national collection, crawls of the whole national domain (.hr), thematic crawls, and event crawls followed a few years later. This talk will present the chronology of working processes and the diverse ways the NSK attempts to preserve the Croatian web as a contemporary part of the cultural and scientific heritage.
    Bio: Karolina Holub is a coordinator of the Croatian Digital Library Development Centre at the Croatian Institute for Librarianship in the National and University Library in Zagreb. Her field of work includes developing, implementing and maintaining digital library systems (Croatian Web Archive, Digital Collections of the National and University Library in Zagreb, Croatian electronic theses and dissertations repositories etc.) as well as taking care of metadata harmonization and interoperability with other systems for all types of resources. She is involved in managing and participating in the development of the Library’s digitization projects and thematic portals, and is involved in several national and international projects.

    2. Carrie Pirmann (Bucknell University) and Erica Peaslee (Centurion Solutions LLC)
    Title: Building a Community of Web Archivers: The Race to Save Ukrainian Cultural Heritage Online
    Abstract: In response to Russia’s invasion of Ukraine on 24 February 2022, over 1300 cultural heritage professionals (librarians, archivists, researchers, programmers) came together to archive the web presence of Ukraine’s cultural heritage. In the four months since, SUCHO (Saving Ukrainian Cultural Heritage Online) has digitally preserved over 40 TB of websites, databases, and other digitized cultural property to hold in trust for Ukrainian colleagues while they are working to preserve their heritage on the ground. This talk will cover the basics of coming together in a distributed grassroots response, the evolution to collaborating with heritage responders and using open-source information to guide efforts, implementing a workflow across 14+ time zones, and utilizing the Webrecorder suite of tools developed by Ilya Kreymer. We hope that the processes and lessons learned from this path-breaking project can be used to assist with responses to similar archiving emergencies and help institutions preemptively establish similar methods for future use.
    Bio: Carrie Pirmann is the Social Sciences Librarian at Bucknell University (USA), working at the intersections of information literacy instruction, research support, and digital scholarship in the social sciences. She holds a master’s degree in library science from the University of Illinois, and has put her years of experience as a librarian to use for SUCHO by conducting extensive research to locate cultural heritage sites online that need to be archived, and working with the Situation Monitoring team to keep abreast of situations in critical areas of Ukraine.
    Bio: Erica Peaslee is the Administrative Operations Coordinator at Centurion Solutions LLC, a Disaster and Emergency Management consultancy in Texas (USA) where she also provides subject matter expertise regarding cultural heritage. Using her background in museum collections and her graduate education in Museum Studies (Harvard), she is particularly interested in centering cultural property in emergency planning and resilience, and promoting communication between the two communities. Erica currently serves as Situation Monitoring Coordinator for SUCHO, leading the observation and coordination of using real-time information from Ukraine to direct efforts to the most at-risk areas. In addition, she also works with other professionals at the intersection of cultural heritage, crime, and emergency response to coordinate and facilitate working towards similar goals.

    Submissions:

    • Paper length: 3-5 pages for a 15-minute presentation
    • Paper length: 1 page for a 5-minute lightning talk

    • Due date: May 1, 2022 AoE - CLOSED
    • Notifications: mid-May

    • Submit to: Easychair submission system
    • Please use the ACM Proceedings template

    Description:

    Due to the current state of the world, WADL 2022 will be held entirely online.

    Please note though that JCDL 2022 is currently planned as a hybrid event and we encourage all WADL attendees to also register and attend JCDL.

    WADL 2022 will continue the WADL tradition to provide a forum and collaboration platform for international leaders from academia, industry, and government to discuss challenges, and share insights, in designing and implementing concepts, tools, and standards in the realm of web archiving. Together, we will explore the integration of web archiving and digital libraries, over the complete digital resource life cycle: creation/authoring, uploading, publishing on the web, crawling/collecting, compressing, formatting, storing, preserving, analyzing, indexing, supporting access, etc.

    WADL 2022 will cover all topics of interest and specifically invites contributions from practitioners. Topics include but are not limited to:

    • Event archiving and collection building
    • National and international perspectives on web archiving
    • Social media archiving
    • Community building
    • Ethics in web archiving
    • Archival metadata, description, classification
    • Archival standards, protocols, systems, tools
    • Crawling of dynamic, online art, and mobile content
    • Discovery of archived resources
    • Diversity in web archives
    • Extraction and analysis of archival records
    • Interoperability of web archiving systems

    Objectives:

    • Continue to build the diverse community of people integrating web archiving with digital libraries
    • Help attendees learn about useful methods, systems, tools, and software in this area
    • Help chart future research and practice in this area, to enable more and higher quality web archiving
    • Promote synergistic efforts including collaborative projects and proposals
    • Produce an archival publication that will help advance technology and practice

    Workshop Co-chairs:

    • Chair: Martin Klein, Scientist, Los Alamos National Laboratory Research Library, mklein@lanl.gov
    • Co-chair: Mat Kelly, Assistant Professor, Drexel University, College of Computing and Informatics, mkelly@drexel.edu
    • Co-chair: Zhiwu Xie, Professor & Chief Strategy Officer, Virginia Tech Libraries, zhiwuxie@vt.edu
    • Co-chair: Edward A. Fox, Professor and Director, Digital Library Research Laboratory, Virginia Tech, fox@vt.edu, http://fox.cs.vt.edu

    Program Committee:

    • Brunelle, Justin F., The MITRE Corporation, jbrunelle@mitre.org
    • Duncan, Sumitra, Frick Art Reference Library, duncan@frick.org
    • Finnell, Joshua, Colgate University, jfinnell@colgate.edu
    • Goethals, Andrea, National Library of New Zealand, Andrea.Goethals@dia.govt.nz
    • Jones, Shawn, Los Alamos National Laboratory, smjones@lanl.gov
    • Ko, Lauren, UNT Libraries, lauren.ko@unt.edu
    • Lyon, Meghan, Library of Congress, mlyon@loc.gov
    • McCown, Frank, Harding University, fmccown@harding.edu
    • Nelson, Michael, Old Dominion University, mln@cs.odu.edu
    • Nwala, Alexander, Indiana University, alexandernwala@gmail.com
    • Rampin, Vicky, New York University, vicky.rampin@nyu.edu
    • Risse, Thomas, University Frankfurt, University Library J. C. Senckenberg, t.risse@ub.uni-frankfurt.de
    • Taylor, Nicholas, Los Alamos National Laboratory, ntay@lanl.gov
    • Weber, Matthew, Rutgers University, matthew.weber@rutgers.edu
    • Weigle, Michele, Old Dominion University, mweigle@cs.odu.edu
    • Wrubel, Laura, Stanford University, lwrubel@stanford.edu