born digital archives: accession

Friday, 12 October 2012

Not a typical week

At the end of the AIMS project I returned to my post as Senior Archivist, with digital archives added to my to-do list alongside public searchroom duty, working with paper collections, maintaining our website and online catalogue, managing staff and volunteers, and so on.

This week has not been typical.

Monday

Accessioned two recent deposits. The first was a small set of floppy disks created between 1995 and 1999 using a Psion (I think, judging by some of the data visible using FTK Imager). The other was a CD with minutes created in the last couple of years by a charity, so nothing to worry about in terms of file formats, but it did highlight issues around filename consistency. I contacted the depositor and they were happy to receive suggestions about future naming conventions, which will be a great help. I was also able to ask about material that reflected the complete range of the charity's activities and hope that further material will be forthcoming.

Tuesday

One of the outcomes of publishing the AIMS White Paper has been the chance to share experiences with colleagues in other institutions. On Tuesday our guests were Nancy McGovern and Kari Smith from MIT, and it was a great opportunity to compare notes on processes, workflows and tools. As always I came away with a list of other tools to try and research papers to look out for! We were joined by my colleague Chris Awre, who talked about the work at Hull using Fedora for our institutional repository, and in particular about Hydra and the opportunity it offers for sharing development work.

Wednesday

Spent some of Wednesday preparing for a one-day workshop at Cambridge about born-digital archives next week. The day is designed to encourage colleagues to take the first steps. It will include colleagues from LSE and the Wellcome Library, and will feature demonstrations of write-blocker hardware and of tools including Karen’s Directory Printer and DROID.

Thursday

Received an email out of the blue from a colleague working in Vancouver, which was really nice. They had been following the AIMS Blog and wanted to ask some questions, and I was happy to clarify a few aspects that had been mentioned. In replying I also sought more information about their own experiences and whether we had tackled email. Whilst we haven’t tackled this explicitly (yet), I have had a play with the MUSE tool, which gives a unique perspective on the contents of an 'mbox' file and offers a sentiment graph that instantly grabs you.

Friday

What better for a Friday afternoon than a quick spell of taking photographs of the floppy disks I accessioned on Monday? It took longer than it should have done, due to lack of practice and the need to find something to prop up each disk so we could capture the information written on its edge. Our conservator Christine found a small clear display stand that is ideal, and this has been requisitioned for future photographic needs.

This hasn't been a typical week (I have probably done more in the last five days than in the preceding two months), but then weeks rarely are in archives, and for many working in the profession the range and variety is one of the best parts of the job.

Friday, 2 September 2011

AIMS@SAA Part One: CREW Workshop

CREW: Collecting Repositories and E-Records Workshop
SAA 2011
Chicago, IL 8/23/2011

The AIMS partners hosted a workshop in the run-up to the 2011 SAA Annual Meeting in August. Forty-five participants from the US and Canada joined us in exploring the challenges, opportunities and strategies for managing born-digital records in collecting repositories.

The workshop was organized around the four main functions of stewardship that the AIMS project has focused on: Collection Development, Accessioning, Arrangement and Description, and Discovery and Access. In addition to the AIMS crew (no pun intended) presenting on the research done through the AIMS project, several guest presenters showcased case studies from their own hands-on approaches to managing born-digital materials. Seth Shaw, from Duke University, discussed the evolution of electronic record accessioning at Duke and his development of the Duke Data Accessioner. Gabriela Redwine discussed work done in arrangement and description at the Harry Ransom Center at the University of Texas at Austin. Finally, Erin O’Meara showcased work done at the University of North Carolina at Chapel Hill to facilitate access to born-digital records through finding aid interfaces.

In between presentations the participants engaged in lively discussions around provocative questions and hypothetical scenarios. At the end of the event, the AIMS partners felt they had gained just as much from the day’s activities as they hoped the participants had. Ideas that were discussed and case study examples will help strengthen the findings of the white paper due out this fall.

Monday, 4 July 2011

Curators Workbench workshop

I was fortunate enough to attend the Curator’s Workbench workshop at the British Library last week. It was a chance to see the tool, have a play with it, and discuss it with its developers Greg Jansen and Erin O’Meara from the University of North Carolina. The tool is designed to aid with the accession, arrangement, description and staging of material prior to ingest into a digital repository. Essentially, it has an interface that archivists can use.

The session featured a walk-through and a chance to have a play, with experts on hand if you had a problem. That was only necessary because we had the latest ‘unstable’ release, which includes the most recent enhancements to functionality and the GUI. Stable versions are available for download via GitHub. I am especially smitten with the crosswalk feature, which provides a drag’n’drop interface for mapping the metadata with METS. There is also the date recogniser, which allows you to map the date format to the ISO standard, though there could be issues if the data is in a variety of formats, e.g. 1984 would be transformed to 01/01/1984.
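The year-only problem can be illustrated with a minimal sketch. This is not how Curator's Workbench implements its date recogniser; the function name `to_iso` and the list of accepted formats are assumptions made up for illustration. The idea is to keep only the precision the source value actually carries, so a bare year stays a year rather than gaining an invented day and month.

```python
from datetime import datetime

# Hypothetical (input format, ISO 8601 output format) pairs, tried in order.
# Each output format keeps only the precision present in the input.
# Note: %B matches month names in the current locale (English assumed here).
FORMATS = [
    ("%d/%m/%Y", "%Y-%m-%d"),  # day/month/year, e.g. 23/08/2011
    ("%d %B %Y", "%Y-%m-%d"),  # e.g. 4 March 2011
    ("%B %Y", "%Y-%m"),        # month and year only, e.g. March 2011
    ("%Y", "%Y"),              # year only, e.g. 1984
]

def to_iso(value):
    """Return an ISO 8601 string for value, or None if no format matches."""
    text = value.strip()
    for fmt_in, fmt_out in FORMATS:
        try:
            parsed = datetime.strptime(text, fmt_in)
        except ValueError:
            continue  # this format does not fit, try the next one
        return parsed.strftime(fmt_out)
    return None  # leave unrecognised values for a human to review

for raw in ["1984", "March 2011", "23/08/2011", "circa 1990"]:
    print(raw, "->", to_iso(raw))
```

Returning `None` for anything unrecognised is a deliberate choice in this sketch: with data in a variety of formats it seems safer to flag a value for review than to guess.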

It takes a different view of where arrangement and description occur in the workflow from that intended for Hypatia in the AIMS workflow, but it does raise some interesting questions that I hope to explore in more detail over the next few months.

It was also interesting to hear about the features and functionality on their wish-list, including disc images, multiple users, recording processing notes, PREMIS, and so the list goes on!

The discussion that followed was really enlightening, as it highlighted the different approaches that archives are currently adopting to the preservation of born-digital archives.

I picked up some useful pointers to software and tools I haven’t used before, such as Bulk Extractor and Google Refine. I came away determined to throw more stuff at Curators Workbench, to join the users' discussion list (done), and to figure out some of the aspects we have avoided so far, like PREMIS and METS!

Tuesday, 21 June 2011

Photographing the digital: creating images of Hull University Archives’ digital media

A guest posting from Nicola Herbert, Digital Project Preservation Assistant at Hull University Archives

Over the last few months I have been working with the AIMS team at Hull University. My role entails getting stuck into some practical processing of the born-digital collections in the Hull University Archives as well as planning aspects of digital preservation. A lot of our work so far has been to discover and document the material that we already hold in what we thought were purely paper collections and I have written a workflow for the discovery of these items and their preparation for ingest into Fedora. As part of this workflow we decided to photograph all of the removable media we currently have and create a process for photography of new deposits when they arrive.

Why bother?
By retaining photographs of the original media alongside content we will be able to provide an image of the appearance of the original media to researchers if they request it. For the foreseeable future we are storing the image files on a shared drive, but they will eventually be stored as an element of metadata with the digital files in our Fedora Repository. We will be dealing with large numbers of media items so need to ensure consistency in the way the media is photographed and information recorded from those images.

Process
Having not previously numbered the discs, we decided on a simple running number within each accession. Despite our familiarity with labelling paper material, it seemed more complicated with digital. Our conservator advised against sticking labels (even conservation grade) onto the plastic casing of a floppy or Amstrad disc. Though a specialist CD marker can be used to label CDs, we were reluctant to permanently mark the items! After a worryingly long thought process we decided to stick to the old faithful method of writing in pencil on the existing label or case.

I then started planning the process. Despite trying to anticipate the different elements of information to include for each media type, it was only trial runs photographing actual media that gave the full picture: for example, Amstrad discs have three aspects to photograph (Side A, Side B and the edge). Lots of seemingly trivial questions arose, like whether to photograph the case, or whether to photograph a label if it is blank. Getting the process right from the start will save time in the long run.

We decided to create a ‘clapperboard’ to photograph with the items for a failsafe way to ensure easy identification. I decided on a reusable form printed on a transparency which we can label with a drywipe marker. Putting theory into practice needed several trial runs; after each one I adapted the form and the procedure.

In addition I wrote up detailed notes describing the procedure for each type of media we anticipate encountering. We worked out a sensible image quality, so as to ensure legibility of the labels without clogging up our servers with unnecessarily large images. Once the photographs have been taken they are renamed and filed. We also maintain an inventory of the items and record the media and label information alongside it. This ensures that if we send items (like our Amstrad discs) away to a third party we can match them to our records when they return.
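As a rough illustration of the renaming and inventory step, here is a minimal sketch. The filename pattern, the accession number `2011-14`, the column names and the sample label are all invented for the example; they are not the scheme actually used at Hull. It assumes only what is described above: a running disc number within each accession, and one image per photographed aspect.

```python
import csv
import io

def image_filename(accession, disc_number, aspect, extension="jpg"):
    """Build a predictable image filename from accession, disc number and aspect."""
    aspect_slug = aspect.strip().lower().replace(" ", "-")
    return f"{accession}_{disc_number:03d}_{aspect_slug}.{extension}"

def inventory_csv(accession, discs):
    """Return inventory rows as CSV text.

    discs is a list of (disc_number, media_type, label_text, aspects) tuples.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["accession", "disc", "media", "label", "images"])
    for disc_number, media_type, label_text, aspects in discs:
        images = ";".join(
            image_filename(accession, disc_number, aspect) for aspect in aspects
        )
        writer.writerow([accession, disc_number, media_type, label_text, images])
    return out.getvalue()

# Hypothetical accession with one Amstrad disc photographed from three aspects.
print(inventory_csv(
    "2011-14",
    [(1, "Amstrad disc", "Minutes 1987", ["Side A", "Side B", "Edge"])],
))
```

Deriving the filenames from the same values that go into the inventory means the two cannot drift apart, which helps when matching returned discs to the records.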

This process has been satisfying to complete and enables us to tick at least one thing off our to-do list. Anyone can get this part of the process completed – even for material which is stored on a shared drive, photography of the original media is a useful process.

Friday, 4 March 2011

File type categories with PRONOM and DROID

In order to assess a born digital accession, the AIMS digital archivists expressed a need for a report on the count of files grouped by type. The compact listing gives the archivist an overview that is difficult to visualize from a long listing. The category report supplements the full list of all files, and helps with a quick assessment after creation of a SIP via Rubymatica. (In a later post I’ll point out some reasons why pre-SIP assessment is often not practical with born digital.)

At the moment we have six categories. Below is a small example ingest:

Category summary for accession ingested files

data	3
moving image	1
other	2
sound	2
still image	26
textual	12
Total	46
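A summary like the one above can be produced by mapping each file's PRONOM PUID to a category and counting. The following is a minimal sketch in Python, not the Rubymatica code; the `CATEGORY_BY_PUID` table and the function names are hypothetical, and the three PUIDs are only examples of what an institution's own mapping might contain.

```python
from collections import Counter

# Hypothetical PUID-to-category table; each institution would maintain its own.
CATEGORY_BY_PUID = {
    "fmt/43": "still image",   # a JPEG format
    "fmt/18": "textual",       # a PDF version
    "x-fmt/111": "textual",    # plain text
}

def category_summary(puids):
    """Count files per category; unmapped or unidentified files become 'other'."""
    return Counter(CATEGORY_BY_PUID.get(puid, "other") for puid in puids)

def format_report(counts):
    """Render the counts as tab-separated lines with a final total."""
    lines = [f"{name}\t{count}" for name, count in sorted(counts.items())]
    lines.append(f"Total\t{sum(counts.values())}")
    return "\n".join(lines)

# One PUID per ingested file; None stands for a file DROID could not identify.
ingest = ["fmt/43", "fmt/43", "x-fmt/111", "fmt/18", None]
print(format_report(category_summary(ingest)))
```

Falling back to "other" keeps the total equal to the number of files even when the mapping is incomplete.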

Some time ago we decided to use DROID exclusively as our file identification software. It works well to identify a broad variety of files, and is constantly being improved. We were initially using file identities from FITS, but the particular identity was highly variable. FITS gives a “best” identity based on metadata returned by several utility programs. We wanted a consistent identification, as opposed to some files being identified by DROID, some by the “file” utility and some by Jhove. We are currently using the DROID identification by pulling the DROID information out of the FITS XML for each file. This was easy and required very little change to Rubymatica.
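Pulling the DROID identification out of FITS output might look something like the sketch below. It is written in Python rather than taken from Rubymatica, and the element and attribute names (`identification`, `identity`, `externalIdentifier`, `toolname`, `type="puid"`) reflect my understanding of the FITS output format; the sample document is invented, so check the names against real FITS output before relying on them.

```python
import xml.etree.ElementTree as ET

# Namespace assumed for FITS output documents.
NS = {"fits": "http://hul.harvard.edu/ois/xml/ns/fits/fits_output"}

# Invented sample, trimmed to the part that matters for identification.
SAMPLE = """<fits xmlns="http://hul.harvard.edu/ois/xml/ns/fits/fits_output">
  <identification>
    <identity format="JPEG File Interchange Format" mimetype="image/jpeg">
      <tool toolname="Jhove" toolversion="1.5"/>
      <tool toolname="Droid" toolversion="3.0"/>
      <externalIdentifier toolname="Droid" toolversion="3.0" type="puid">fmt/43</externalIdentifier>
    </identity>
  </identification>
</fits>"""

def droid_identity(fits_xml):
    """Return (puid, format name) as reported by DROID, or None if absent."""
    root = ET.fromstring(fits_xml)
    for identity in root.findall("fits:identification/fits:identity", NS):
        for ext in identity.findall("fits:externalIdentifier", NS):
            if ext.get("toolname") == "Droid" and ext.get("type") == "puid":
                return ext.text.strip(), identity.get("format")
    return None  # DROID gave no identification for this file

print(droid_identity(SAMPLE))
```

Ignoring identities that carry no DROID identifier is what gives the consistency described above: a file is either identified by DROID or treated as unidentified.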

PRONOM has the ability to have “classifications” via the XML element FormatTypes. However, there are a couple of issues. The first problem is that the PRONOM team is focused primarily on building new signatures (file identification configurations) and doesn’t have time to focus on low priority tasks such as categories. Second, the categories will almost certainly be somewhat different at each institution.

Happily, I was able to create an easy-to-use web page to manage DROID categories. It only took one day to create this handy tool, and the tool is built into Rubymatica. The Rubymatica file listing report now has three sections: 1) an overview using the categories; 2) a list of donor files in the ingest, with the PRONOM PUID and human-readable format name; 3) the full list of all files (technical and donor) in the SIP.

This simple report seems anticlimactic, but processing born digital materials consists of many small details, which collectively can be a huge burden if not properly managed and automated. Adding this category feature to Rubymatica was a pleasant process, largely because the PRONOM data is open source, readily available, and delivered in a standard format (XML). My thanks and gratitude to the PRONOM people for their continuing work.

http://www.nationalarchives.gov.uk/PRONOM/Default.aspx

http://droid.sourceforge.net/

As I write this I notice that DROID v6 has just been released! The new version certainly includes a greatly expanded set of signatures (technical data for file identifications). We look forward to exploring all the new features.

Friday, 12 October 2012

Not a typical week

Friday, 2 September 2011

AIMS@SAA Part One: CREW Workshop

Monday, 4 July 2011

Curators Workbench workshop

Tuesday, 21 June 2011

Photographing the digital: creating images of Hull University Archives’ digital media

Friday, 4 March 2011

File type categories with PRONOM and DROID
