Wednesday, 4 April 2012

Forensic workstation pt 4

Earlier parts of this series touched on identifying our needs and requirements for a workstation (see part 1), re-purposing an old PC into our first workstation (see part 2) and our early experiences of write-blockers and FTK Imager (see part 3).

Our experiments with our write-blockers have been limited, but each time we get them out of their boxes they seem a little less scary. The recent deposit of some born-digital audio and video material totalling over 200GB has thrown up a number of new issues for us to consider.

The files came to us on an external hard drive, formatted for a Mac, that we needed to return to the depositor. We knew the files would be deleted at that point, which placed greater emphasis on getting the capture process right, as we wouldn't be able to go back to the depositor and try again!

We were unable to browse the files in Windows Explorer, but were able to see them using FTK Imager and our USB write-blocker. The sheer size of the files is something we are going to have to get used to: a 45-minute QuickTime film is 10.1GB and a 43-minute WAV file is 671MB.
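
To get a feel for what sizes like these mean for storage planning, the arithmetic can be sketched in a few lines of Python. This is only a rough illustration: it assumes decimal units (1GB = 1,000,000,000 bytes), and the function names are made up for this example rather than taken from any tool we use.

```python
def average_rate_mbit_per_s(size_bytes, duration_minutes):
    """Return the average data rate of a file in megabits per second."""
    bits = size_bytes * 8
    seconds = duration_minutes * 60
    return bits / seconds / 1_000_000


def projected_size_gb(rate_mbit_per_s, hours):
    """Estimate the storage (decimal GB) needed for `hours` of material at a given rate."""
    bytes_per_second = rate_mbit_per_s * 1_000_000 / 8
    return bytes_per_second * hours * 3600 / 1_000_000_000


# Figures from the deposit described above (decimal units assumed).
film_rate = average_rate_mbit_per_s(10.1e9, 45)
wav_rate = average_rate_mbit_per_s(671e6, 43)

print(f"QuickTime film: {film_rate:.1f} Mbit/s")
print(f"WAV audio:      {wav_rate:.1f} Mbit/s")
print(f"One hour of film at that rate: {projected_size_gb(film_rate, 1):.1f} GB")
```

On these assumptions the film works out at roughly 30 Mbit/s and the audio at roughly 2 Mbit/s, so an hour of similar film would need somewhere around 13.5GB.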

The workstation already has PaintShop Photo Pro for viewing and converting image files but only the standard viewers for audio and video content. So we started to look for open source software for viewing and converting the audio and film files. I wanted something with a graphical rather than a command line interface, as I was keen for other staff to develop skills and experience in handling this type of content.

As with our earlier use of tools like Karen's Directory Printer and DROID, once we have become familiar with a piece of software we document our use of it by creating a simple 'Idiot's Guide' - this allows us to record both the issues we have encountered and their solutions.

A bit of browsing and a few recommendations later, we have now installed Audacity v2 and WinFF, but we will also take a look at others, including HandBrake and FFmpeg, before making a final decision.

We are keenly awaiting the forthcoming release of the DPC Technology Watch Report on 'Preserving Moving Pictures and Sound' and revisiting the FutureArch blog entries on media formats.

Thursday, 22 March 2012

UKAD Archives Discovery Forum 2012


A few reflections on the UKAD Archives Discovery Forum 2012, which I attended yesterday.

The day started with a really interesting keynote from Bill Thompson (BBC) about his role in using the archives to forge partnerships with a range of organisations. He highlighted the range and diversity of this work by talking about several projects, including one this summer with the Arts Council, the centenary of the First World War and an exciting initiative called Digital Public Space (diagram on collectionslink website).

Joy Palmer (MIMAS) gave a talk about the JISC Discovery programme and the ongoing work to demystify aspects like APIs, persistent URIs, user interfaces and measuring both impact and value.

Teresa Doherty (The Women's Library) then spoke about name authority records and how you can help make your collections more discoverable by adding links to the archives from relevant biographical pages on Wikipedia - something we started to do at Hull several years ago but this was a useful reminder to revisit this simple approach that can have a huge impact on the visibility of your collections.

After a great networking opportunity called lunch, Lindsay Ould (King's College London) talked us through the JISC funded FIDO project (Forensic Information in Digital Objects) and their experiences, highlighting a range of technical, skills-based and ethical issues and also their use of OSForensics software.

I then gave a presentation about born-digital archives, but took a different approach - instead of focussing on the work we have undertaken at Hull I presented a very brief SWOT analysis to highlight many of the issues we have experienced.

There was then a series of short presentations. Sam Velumyl (The National Archives) gave an overview of the TNA Finding Archives project, and Teresa Dixon (West Yorkshire Archives Service) spoke about the History to Herstory website, which features over 80,000 images including the Amy Johnson letters held at the Hull History Centre.

Kimberly Kowal (British Library) spoke about a crowd-sourcing map project which saw 725 maps geo-referenced in a week (see a blog entry about this project) and Alison Cullingford spoke about the Research Libraries UK Unique and Distinctive Collections project.

The final session was from Bill Stockting (British Library) about the completion of the Integrating Archives and Manuscripts System, which brings together a vast number of legacy data sources and systems and over 1.5m records, and now sits behind the http://searcharchives.bl.uk/ site.

Despite all of this there were a host of other sessions I would like to have attended, including linked data, The National Archives' new catalogue and the Old Maps Online project.

Update 2nd April - slides from the sessions have been added to The National Archives website, see the Documenting Collections page.

Monday, 19 March 2012

Archives and Society

Two weeks ago I spoke at the Archives and Society series of talks held at the Institute of Historical Research about the progress and work at Hull as a result of the AIMS project. Whilst highlighting the AIMS White Paper, the bulk of the talk was about the practical steps we had taken at Hull with born-digital archives, starting with a simple survey of collections, followed by photography of media and the creation of a forensic workstation (a tale told in multiple parts - see part 1, part 2 and part 3).

I sought to encourage those present to download software like Karen's Directory Printer and DROID and to have a go - using a few test files will help increase your familiarity with many of the issues associated with digital preservation.

I managed to stop in time for questions. These included the fact that the issues I raised were not "new" and whether we would still be making the same case in 5 years' time (I hope not), the need for automated tools to help us cope with the sheer volume of material (an obvious need), and the associated risk of releasing material that you haven't explicitly checked because of the sheer volume of files.

A PDF version of the slides is available - the talk was also recorded and I will add a link to the podcast when it is available.

Friday, 20 January 2012

AIMS White Paper now available


After a huge amount of effort the AIMS White Paper has finally been finished and is now available online.

The White Paper is intended as a framework to guide good practice in terms of the archival tasks and objectives necessary for success. It builds upon the experiences of the four partner institutions - the universities of Hull, Stanford, Virginia and Yale - in processing a range of collections with an array of format and media issues and using different software – we were keen to make this software agnostic and have gone back to the archival principles at the heart of the processes.

In many areas we found similarities with existing practice for paper records, and for some aspects we found there were multiple ways of achieving certain goals and we didn't want to be prescriptive in any way. So instead it highlights key decision points and aspects of policy that may be determined at an institutional level, and is intended to help people making the same journey that we have made – finding out about projects, tools & case studies and starting to build knowledge, skills & infrastructure.

Although the publication of the White Paper officially marks the end of the AIMS project the institutions intend to continue collaborating and sharing their experiences on this blog.

We welcome your comments and feedback to the White Paper on this blog – whether you have implemented the framework or just found the guidance useful.

Thursday, 6 October 2011

Day of Digital Archives – some personal reflections

To mark the Day of Digital Archives I thought I would add a personal note about the “journey” I have made in the last two years. It was about this time in 2009 that it was announced that the AIMS Project was being funded by the Andrew W. Mellon Foundation and that I would be seconded from my post as Senior Archivist to that of Digital Archivist for the project.

At the time I had considerable experience of digitisation but very little of digital archives. So I began reading a few texts and following references and links to other sources of information until I had a pile of paper several inches high of things to read. At first there was a huge amount to take in – new acronyms, especially from the frightening OAIS, and plenty of projects like the EU funded Planets initiative. It seemed that the learning would never stop – there was always another link to follow or another article to read, and it was really difficult to determine how much was making sense.

Talking to colleagues who were already active in this field also revealed how little digital media we actually had at the University of Hull Archives – just over two years ago we literally had a handful of digital media whilst others were already talking about terabytes of stuff. Fortunately the AIMS project sought to break down the workflow into four distinct sub-functions and placed emphasis on understanding the process compared to ‘traditional’ paper archives, which reduced the sense of being overwhelmed by it all.

Since then I feel I have come a long way – I have attended a large number of events, spoken at a fair few and quickly become both familiar and comfortable with the language. I do appreciate the time I have been able to dedicate solely to the issue of digital archives, and I know that many colleagues are embracing this “challenge” without this luxury.

The biggest recommendation I can make is to start having a play with the software – many of the tools that we use at Hull University are free – Karen’s Directory Printer for creating a manifest, including checksums, of the records that have been received; FTK Imager for disc images; and so on. Nor do you have to wait for digital archives or risk changing key metadata whilst you are experimenting – you can use any series of digital files or old media that are lurking at the back of a drawer. We have also created a forensic workstation and shared our experiences via this blog.
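
For anyone curious about what a manifest with checksums actually involves, the idea can be sketched in a short Python script. This is not how Karen’s Directory Printer works internally, just a minimal illustration of the same principle; the function names, the choice of SHA-256 and the CSV column headings are all assumptions made for this example.

```python
import csv
import hashlib
import os


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash a file in chunks so that very large files never sit wholly in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root, algorithm="sha256"):
    """Return a sorted list of (relative path, size in bytes, checksum) rows."""
    rows = []
    for folder, dirs, files in os.walk(root):
        dirs.sort()  # walk sub-folders in a predictable order
        for name in sorted(files):
            full_path = os.path.join(folder, name)
            rows.append((
                os.path.relpath(full_path, root),
                os.path.getsize(full_path),
                file_checksum(full_path, algorithm),
            ))
    return rows


def write_manifest(rows, out_path):
    """Save the manifest rows as a CSV file with a header line."""
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(["path", "size_bytes", "checksum"])
        writer.writerows(rows)
```

Running build_manifest on a folder of test files and then again after copying them elsewhere gives two lists that can be compared to confirm nothing has changed. The manifest itself should be saved outside the folder being listed, so that it does not end up describing itself.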

Once we had started to experiment, we created draft workflows and documentation and refined these as we experimented further – all tasks, from photography of media to using write-blockers, do become less daunting the more frequently you do them. Having learnt from many colleagues, we have started to add content to the born-digital archives section of the History Centre website. I have also used some of my own email to play with the MUSE visualisation tool to understand how it might allow us to provide significantly enhanced access to this material in the future.

Although the project funding has now finished and I have returned to my “normal” job, I do think that digital archives have now become part of my normal work: each depositor is now specifically asked about digital archives, and in public tours of the building we explicitly mention the challenges and the opportunities they present. We don’t have all of the answers yet – archiving e-mail in particular still scares me – but I don’t feel as daunted as I did two years ago.

Sunday, 4 September 2011

A Tale of Two conferences

Last week I was fortunate to be part of the AIMS team presenting our work at the SAA Conference in Chicago. Despite the Saturday 8am start of our session and the impending threat of Hurricane Irene, well over 150 delegates turned out to hear our presentation, which included both an introduction to the AIMS framework and reports of our practical experiences through case studies. If you missed it or want to relive it, the presentations are available online via Slideshare.

On Friday I spoke at the ARA conference in Edinburgh – the theme of which was advocacy and as part of a Data Standards Group I spoke about the skill set that I had acquired during my change of role from archivist to digital archivist as a result of the AIMS project.

Although the two presentations were different in content and context, they both included the same message – an attempt to break down the perceptions and myths surrounding born-digital archives. In talking about skills in Edinburgh I sought to highlight the relevance of traditional archive skills in the digital age and to encourage more individuals to do something.

It also raised the question – something that arose in the AIMS unconference in Charlottesville and the UK workshop in London – of when digital archives will become “the norm”. We don’t know the exact answer to this, but I do know that it needs to happen if we are to successfully manage the challenges of born-digital archives and strive to meet the increasing expectations of our users.

Friday also marked the end of a six-month contract during which Nicola Herbert has helped us with the practical elements of digital preservation at Hull. I would like to thank Nicola for her hard work and direct readers to her guest blogs on photography of media and write-blockers.

Friday, 2 September 2011

AIMS@SAA Part Two: SAA Session 502

SESSION 502 - Born-Digital Archives in Collecting Repositories: Turning Challenges into Byte-Size Opportunities
SAA 2011
Chicago, IL 
Aug 27, 2011

As the endnote to their foray into the SAA 2011 Annual Meeting, the AIMS Digital Archivists delivered a presentation on the AIMS project on Saturday morning. Although we were competing with Hurricane Irene’s effect on travel schedules, an 8 a.m. Saturday timeslot, and presentations from our colleagues Michelle Light, Dawn Schmitz and John Novak on delivering born-digital materials online, as well as from the Grateful Dead Archivist and a member of the band Phish, attendance was pretty darn good! We were pleased to be able to speak with some colleagues after the session and facilitate a few discussions during the question and answer portion of the session.

The presentation itself gave a brief overview of the project and then focused on the AIMS framework, or the four areas we’ve identified as key functions of stewardship for born-digital materials: Collection Development, Accessioning, Arrangement and Description, and Discovery and Access.

We’re very happy to share our slides here through slideshare. Remember, this is just a taste of what’s to come in the white paper this fall, so keep checking the blog for updates!

Slides are posted here after the jump!