Wednesday 24 October 2012

Practical First Steps

Last week I helped organise a training day on born-digital archives for the East of England Regional Archive Council. I was joined by Chris Hilton from the Wellcome Library, Ellie Robinson from LSE and Grant Young from Cambridge University Library. The day followed a similar pattern to an event hosted in Hull last November. There were four main elements to the day:

Institutional Overview
The four of us gave a brief overview of the development of digital preservation in our respective institutions and included Chris’s now legendary simplification of OAIS to "Get Stuff - Put stuff somewhere - Keep stuff safe & Show stuff to people". Ellie talked through the development at LSE, from using a risk analysis perspective to get institutional backing to then moving on to actually doing it - the latter sentiment being one of the mantras for the day. Grant talked about his work with digital content - much of it digitised rather than born-digital but now occupying an eye-watering 67TB (both LSE and Hull have about 120GB of born-digital material).

Practical First Steps
The four of us then each gave a short presentation offering some practical tips. I looked at conducting a survey to identify material already held in the archives and how this often meant the media had been accessioned but not the contents! Chris shared the experiences at Wellcome of 'Dealing with depositors', Ellie looked at 'Handling born-digital material' including accessioning, virus checking and other stages at LSE, and Grant talked about 'Issues around File Formats', highlighting a number of challenges and suggesting strategies that could be adopted.

Questions and Answers

The day also included two question and answer sessions designed to get delegates talking about the particular aspects and issues of concern to them. Questions touched on a range of topics including depositors, DRAMBORA, how to approach hybrid collections and depositor agreements. We also heard of work being conducted in a number of local authority archives and hopefully they will share their work and experiences with colleagues in the near future.

Delegates were split into four groups and given demonstrations of Karen's Directory Printer, DROID, and FTK Imager used with a write-blocker to read a PC hard drive (from my garage). The fourth diversion was a look at two different born-digital scenarios, for delegates to consider how they might respond.

There was common agreement on the need to do something, and widespread acknowledgement that there wasn't a single solution or approach. Wellcome, LSE and Hull were all looking at the issue of bulk-ingest into repositories whilst retaining the relationships between files as represented through an often complex series of folders. It so happens that at Hull one of our developers is looking at this very issue so I hope to have an update on this in the next few weeks.
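To make the folder-relationship point concrete, here is a minimal sketch in Python of one way to record where each file sat in its original folder structure before a bulk ingest. It is not the approach being developed at Hull, Wellcome or LSE; the function names and the manifest fields (relative_path, parent_folder, size_bytes, sha256) are my own assumptions for illustration.

```python
import hashlib
import os
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 checksum of a file, read in chunks to cope with large files."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Walk `root` and return one record per file, keeping its folder context.

    The field names are hypothetical; a real repository would define its own.
    """
    root = Path(root)
    records = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit sub-folders in a predictable order
        for name in sorted(filenames):
            path = Path(dirpath) / name
            records.append({
                # Path relative to the top of the deposit, so the original
                # hierarchy can be rebuilt after ingest.
                "relative_path": path.relative_to(root).as_posix(),
                "parent_folder": path.parent.relative_to(root).as_posix(),
                "size_bytes": path.stat().st_size,
                "sha256": sha256_of(path),
            })
    return records
```

Files at the top level of the deposit get a parent_folder of "."; everything else keeps its full folder path, which is the relationship that is easy to lose when files are ingested one by one.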

A key theme of the day was collaborating and helping colleagues, and in this spirit all of the presentations are now available on the Hull History Centre born-digital archive pages - thanks to all of the speakers for making this an interesting and informative day.

Friday 12 October 2012

Not a typical week

At the end of the AIMS project I returned to my post as Senior Archivist with digital archives added to my todo list alongside public searchroom duty, working with paper collections, responsibilities for maintaining our website and online catalogue, managing staff and volunteers etc etc.

This week has not been typical.

On Monday I accessioned two recent deposits, including a small set of floppy disks created between 1995 and 1999 using a Psion (I think, judging by some of the data visible using FTK Imager). The other item was a CD with minutes created in the last couple of years by a charity – so nothing to worry about in terms of formatting but it did highlight issues around filename consistency. I contacted the depositor and they were happy to receive suggestions about future naming conventions which will be a great help. I was also able to ask about material that reflected the complete range of activities of the charity and hope that further material will be forthcoming.

One of the outcomes following the publication of the AIMS White Paper has been to share experiences with colleagues in other institutions. On Tuesday our guests were Nancy McGovern and Kari Smith from MIT and it was a great opportunity to share experiences and discuss aspects surrounding processes, workflows and tools. As always I came away with a list of other tools to try and research papers to look out for! We were joined by my colleague Chris Awre who talked about the work at Hull using Fedora for our institutional repository and in particular Hydra and the opportunity this offered for sharing development work.

Spent some of Wednesday preparing for a one day workshop at Cambridge about born-digital archives next week. The day is designed to encourage colleagues to take the first steps and will include colleagues from LSE and the Wellcome Library and will feature demonstrations of write-blocker hardware and tools including Karen’s Directory Printer and DROID.

Received an email out of the blue from a colleague working in Vancouver, which was really nice – they had been following the AIMS Blog and wanted to ask some questions and I was happy to clarify a few aspects that had been mentioned. In replying I also sought more information about their own experiences and whether we had tackled email. Whilst we haven’t tackled this explicitly (yet) I have had a play with the MUSE tool which gives a unique perspective on the stuff within an 'mbox' file and offers a sentiment graph that instantly grabs you.
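For anyone curious about what sits inside an 'mbox' file, the following is a minimal sketch using Python's standard mailbox module to count messages per sender. It says nothing about how MUSE works internally, and the function name summarise_mbox is simply an assumption for this illustration.

```python
import mailbox
from collections import Counter
from email.utils import parseaddr


def summarise_mbox(path):
    """Count the messages in an mbox file, grouped by sender address."""
    senders = Counter()
    # create=False means a mistyped path raises an error instead of
    # silently creating an empty mailbox file.
    box = mailbox.mbox(path, create=False)
    try:
        for message in box:
            # parseaddr splits "Name <address>" into its two parts.
            _, address = parseaddr(str(message.get("From", "")))
            senders[address.lower() or "(unknown)"] += 1
    finally:
        box.close()
    return senders
```

Calling summarise_mbox on a working copy of the file (never the original) and then most_common(10) on the result gives a quick first impression of who the main correspondents are.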

What better for a Friday afternoon than a quick spell of taking photographs of the floppy disks I accessioned on Monday. It took longer than it should have done due to lack of practice and the need to find something to prop up the disk so we could capture the information written on the edge of the disk – our conservator Christine found a small clear display stand that is ideal and this has been requisitioned for future photographic needs.

This hasn't been a typical week – I have probably done more in the last five days than the preceding two months - but then things rarely are in archives – and for many working in the profession the range and variety is one of the best parts of the job.

Saturday 9 June 2012

Brith Gof

Last week I was fortunate to be invited to a two day workshop hosted by the National Library of Wales (NLW) at Aberystwyth. The workshop brought together archivists, academics, library and ICT staff and students to discuss the “challenges” surrounding the Brith Gof and Clifford McLucas collections. The primary focus was on aspects of digital preservation – with a range of media and carriers including floppy disks, zip drives, SyQuest cartridges (a new one to me) and a Mac Book G4 amongst these two collections.

Discussions touched a wide range of topics and issues including digital forensics, cataloguing hybrid collections, digital curation, emulation and access. There was also a reminder of the complexity of intellectual property rights for performance material where different rights might be held for the set design, the score etc etc.

There was also a public event where a number of items from the collections were used as prompts by Professor Mike Pearson, co-founder of Brith Gof, to stimulate his recollections of Brith Gof. The range of items selected highlighted the rich nature of the collections, which include photographs, videos, set designs and huge banners (as well as the more traditional paper archives). One aspect that is also being considered is how to capture/record the impact that Brith Gof had on those watching the performance.

I talked about the AIMS project and highlighted a number of the questions and issues that had arisen from our shared experiences that we have captured in our White Paper. I really enjoyed the discussions, the informal nature and the reminder about making a deliberate effort to engage and attract a range of audiences.

Today is International Archives Day - follow #archday12 on Twitter to see what everybody has been saying.

Wednesday 4 April 2012

Forensic workstation pt 4

Earlier parts of this series touched on identifying our needs and requirements from a workstation (see part 1), re-purposing an old PC into our first workstation (see part 2) and our early experiences of write-blockers and FTK Imager (see part 3).

Our experiments with our write-blockers have been limited, but each time we get them out of their boxes they seem a little less scary. The recent deposit of some born-digital audio and video material totalling over 200GB has thrown up a number of new issues for us to consider.

The files came to us on an external hard drive formatted for a Mac that we needed to return to the depositor at which point we knew the files would be deleted - placing greater emphasis on the need to get the capture process correct as we wouldn't be able to return to the depositor and try again!

We were unable to browse the files in Windows Explorer, but were able to see the files using FTK Imager and our USB write-blocker. The sheer size of the files is something we are going to have to get used to: a 45 minute QuickTime film is 10.1GB and a 43 minute wav file is 671MB.
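To put those sizes in context, a little arithmetic helps: at these figures the film works out at roughly 224MB per minute and the wav file at about 15.6MB per minute. The short Python sketch below shows the calculation and how it could be used to estimate storage for similar material. It assumes decimal units (1GB = 1000MB), and the function names are hypothetical.

```python
def mb_per_minute(size_mb, minutes):
    """Average data rate of a recording in MB per minute."""
    return size_mb / minutes


def estimated_size_gb(rate_mb_per_minute, hours):
    """Storage needed for `hours` of material at the given rate (1GB = 1000MB assumed)."""
    return rate_mb_per_minute * 60 * hours / 1000


# Figures from the deposit described above.
film_rate = mb_per_minute(10.1 * 1000, 45)   # QuickTime film, 10.1GB in 45 minutes
audio_rate = mb_per_minute(671, 43)          # wav file, 671MB in 43 minutes

print(f"Film:  {film_rate:.1f} MB per minute")
print(f"Audio: {audio_rate:.1f} MB per minute")
print(f"10 hours of similar film: {estimated_size_gb(film_rate, 10):.0f} GB")
```

The estimate only holds for material captured with similar settings; a different codec, resolution or sample rate would change the rate considerably.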

The workstation already has PaintShop Photo Pro for viewing and converting image files but only the standard viewers for audio and video content, so we started to look for open source software for viewing and converting the audio and film files. I wanted something that had a graphical and not a command line interface, as I was keen for other staff to develop skills and experience in handling this type of content.

As with our earlier use of tools like Karen's Directory Printer and DROID, once we have become familiar with software we then document our use by creating a simple 'Idiot's Guide' - this allows us to record both the issues and the solutions that we have encountered.

A bit of browsing and a few recommendations later and we have now installed Audacity v2 and WinFF but we will also take a look at others including Handbrake and FFmpeg before making a final decision.

We are keenly awaiting the forthcoming release of the DPC Technology Watch Report on 'Preserving Moving Pictures and Sound' and revisiting the FutureArch blog entries on media formats.

Thursday 22 March 2012

UKAD Archives Discovery Forum 2012

A few reflections on the UKAD Archives Discovery Forum 2012 that I attended yesterday.

The day started with a really interesting keynote from Bill Thompson (BBC) about his role in using the archives to forge partnerships with a range of organisations. He highlighted the range and diversity by talking about several projects, including one this summer with the Arts Council, the centenary of the First World War and an exciting initiative called Digital Public Space (diagram on collectionslink website).

Joy Palmer (MIMAS) gave a talk about the JISC Discovery programme and the ongoing work to demystify aspects like APIs, persistent URIs, user interfaces and measuring both impact and value.

Teresa Doherty (The Women's Library) then spoke about name authority records and how you can help make your collections more discoverable by adding links to the archives from relevant biographical pages on Wikipedia - something we started to do at Hull several years ago but this was a useful reminder to revisit this simple approach that can have a huge impact on the visibility of your collections.

After a great networking opportunity called lunch, Lindsay Ould (King's College London) talked us through the JISC funded FIDO project (Forensic Information in Digital Objects) and their experiences, highlighting a range of technical, skills-based and ethical issues and also their use of OS Forensics software.

I then gave a presentation about born-digital archives, but took a different approach - instead of focussing on the work we have undertaken at Hull I presented a very brief SWOT analysis to highlight many of the issues we have experienced.

There was then a series of short presentations: Sam Velumyl (The National Archives) gave an overview of the TNA Finding Archives project, and Teresa Dixon (West Yorkshire Archives Service) spoke about the History to Herstory website, which features over 80,000 images including the Amy Johnson letters held at the Hull History Centre.

Kimberly Kowal (British Library) spoke about a crowd-sourcing map project which saw 725 maps geo-referenced in a week (see a blog entry about this project) and Alison Cullingford spoke about the Research Libraries UK Unique and Distinctive Collections project.

The final session was from Bill Stockting (British Library) about the completion of the Integrating Archives and Manuscripts System - bringing a vast number of legacy data sources and systems and over 1.5m records together and this now sits behind the site.

Despite all of this there were a host of other sessions I would like to have attended including linked data, the National Archives new catalogue and the Old Maps online project.

Update 2nd April - slides from the sessions have been added to The National Archives website, see the Documenting Collections page

Monday 19 March 2012

Archives and Society

Two weeks ago I spoke at the Archives and Society series of talks held at the Institute of Historical Research about the progress and work at Hull as a result of the AIMS project. Whilst highlighting the AIMS White Paper, the bulk of the talk was about the practical steps we had taken at Hull with born-digital archives: starting with a simple survey of collections, followed by photography of media and creating a forensic workstation (a tale told in multiple parts - see part 1, part 2 and part 3).

I sought to encourage those present to download software like Karen's Directory Printer and DROID and to have a go - using a few test files will help increase your familiarity with many of the issues associated with digital preservation.

I managed to stop in time for questions. These touched on the fact that the issues I raised were not "new" and whether we would still be making the same case in 5 years' time (I hope not), the need for automated tools to help us cope with the sheer volume of material (an obvious need), and the associated risk of releasing material that you haven't explicitly checked because of the sheer volume of files.

A PDF version of the slides is available - the talk was also recorded and I will add a link to the podcast when it is available.

Friday 20 January 2012

AIMS White Paper now available

After a huge amount of effort the AIMS White Paper has finally been finished and is now available online.

The White Paper is intended as a framework to guide good practice in terms of the archival tasks and objectives necessary for success. It builds upon the experiences of the four partner institutions - the universities of Hull, Stanford, Virginia and Yale - in processing a range of collections with an array of format and media issues and using different software – we were keen to make this software agnostic and have gone back to the archival principles at the heart of the processes.

In many areas we found many similarities with existing practice with paper records and for some aspects we found there were multiple ways of achieving certain goals and we didn't want to be prescriptive in any way.  So instead it highlights key decision points and aspects of policy that may be determined at an institutional level and is intended to help people making the same journey that we have made – finding out about projects, tools & case studies and starting to build knowledge, skills & infrastructure.  

Although the publication of the White Paper officially marks the end of the AIMS project the institutions intend to continue collaborating and sharing their experiences on this blog.

We welcome your comments and feedback to the White Paper on this blog – whether you have implemented the framework or just found the guidance useful.