Friday, 30 November 2012

ALiCat - 8 out of 10 archivists...?

ALiCat (Archival Linked-data Cataloguer) is one of the outputs of the step-change project. It is an editing tool for collection-level records. Of course AIM25 has its own web-based editing tool; AIM25 archivists are also able to upload EAD, and so can use desktop tools such as CALM, or whatever else they choose, to produce their records. There are other web-based EAD editing tools, such as the Archives Hub's excellent EAD editor.

Once the tortured acronym is expanded we can see that ALiCat is an attempt to allow archivists the ability to assign and amend the linkages between the resources and the pertinent terms both within the body of the record and those "access points" that are used for indexing the record.

So initially ALiCat presents a reasonably straightforward tabbed form for inputting those ISAD(g) elements that archivists know and love.

One is alerted to ALiCat's USP by the fact that the index terms are ever present on the page and colour coded according to their type (AIM25 uses a subset of those EAD elements that can be contained within <controlaccess>).

The aim of the broader project was to assign persistent URIs and (the beginnings of) consumable semantic representations for both the AIM25 index terms and the records that they are associated with. Much of this was done retrospectively on the existing data. The role of ALiCat was to provide an interface for tweaking these enhancements (which were not always perfect) and to provide a possible method for archivists to include linked data in their records at the point of creation.

On a technical level ALiCat allowed the developers at ULCC (me) to demonstrate the use of the data service. The architecture of ALiCat is completely reliant on the RESTful output (and input) of that service. ALiCat uses the jQuery JavaScript framework to manage requests to the service and display the results.

Do archivists want the extra tasks associated with finding and assigning meaning, or at least that meaning expressed at the end of a LOD URI? Well, there are obvious benefits to recording these links, which I'm sure have been extolled at length in this blog and others. One of the problems that ALiCat is trying to solve is how to assimilate this process into the workflow of the archivist in an efficient manner. The method ALiCat uses to try to solve this problem is to provide some useful and well-integrated UI tools for suggesting and searching for URIs as the archivist edits.

Terms can be highlighted by an editor, who is then offered the opportunity to look up the term using a selection of LOD data services. The selection is relatively small at the moment, but the exercise of consuming data formatted according to common and open formats (XML, JSON), standards (RDF), and vocabularies (including SKOS, GeoNames, FOAF) should mean that the task of adding more third-party look-up services is well within reach.
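To give a flavour of why common formats make adding another look-up service cheap, here is a minimal Python sketch (ALiCat itself does this in the browser with jQuery). It assumes, purely for illustration, that a service returns a JSON list of SKOS-style concepts with "@id" and "skos:prefLabel" keys. That response shape, the function name and the example.org URIs are hypothetical, not the format of any particular service.

```python
import json


def candidates_from_skos_json(payload, service):
    """Turn a JSON list of SKOS-style concepts into look-up candidates.

    The "@id" / "skos:prefLabel" keys are an assumed response shape.
    """
    candidates = []
    for concept in json.loads(payload):
        uri = concept.get("@id")
        label = concept.get("skos:prefLabel")
        # Skip anything that cannot be offered to the editor as a link.
        if uri and label:
            candidates.append({"label": label, "uri": uri, "service": service})
    return candidates


# Hypothetical reply from a look-up service.
sample = json.dumps([
    {"@id": "http://example.org/concept/1", "skos:prefLabel": "Anthroposophy"},
    {"skos:prefLabel": "No URI, so ignored"},
])
results = candidates_from_skos_json(sample, "example-service")
```

Each new service then only needs a small adapter that produces the same label/URI pairs for the editor to choose from.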

Of course one of the services used to look up terms was our own data service. This helped us to make sure that the search service provided was more or less in line with others in similar fields.

If no suitable results can be found, the editor is given the opportunity to define a term according to the data structures used by AIM25-UKAT and coin a new URI in the domain.

In addition to looking up individual terms, ALiCat offers editors an option that will send all the text in a given ISAD(g) element to a suggestion service. This is triggered when an archivist moves on from editing a given field (that is, when the field loses focus). The suggestion service returns the text with 'found' terms tagged and highlighted and the associated URI(s) assigned. Some terms may need a few extra steps to disambiguate them; the process is much the same as when looking up individual terms. The editor can then select terms by dragging them into the access point list on the right. Suggested terms that are not selected are stripped out when saving. The suggestion services available for use are openCalais and a match service (though this lacks the linguistic analysis of openCalais).
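The tag-then-strip cycle described above can be sketched roughly as follows. This is an illustrative Python sketch and not ALiCat's actual code: the term-to-URI dictionary stands in for a suggestion service reply, and the "suggested" class name, the function names and the example.org URIs are all assumptions.

```python
import html
import re


def tag_suggestions(text, matches):
    """Wrap each known term found in `text` in a <span> carrying its URI.

    `matches` maps a term to a URI (a stand-in for a suggestion service reply).
    """
    if not matches:
        return html.escape(text)
    # Longest terms first, so "Rudolf Steiner" wins over a bare "Steiner".
    terms = sorted(matches, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(t) for t in terms))
    out, pos = [], 0
    for m in pattern.finditer(text):
        out.append(html.escape(text[pos:m.start()]))
        uri = html.escape(matches[m.group(0)], quote=True)
        term = html.escape(m.group(0))
        out.append(f'<span class="suggested" resource="{uri}">{term}</span>')
        pos = m.end()
    out.append(html.escape(text[pos:]))
    return "".join(out)


def strip_unselected(tagged, selected_uris):
    """Unwrap suggestion spans whose URI the editor did not select."""
    def keep_or_unwrap(m):
        selected = html.unescape(m.group(1)) in selected_uris
        return m.group(0) if selected else m.group(2)

    span = r'<span class="suggested" resource="([^"]*)">(.*?)</span>'
    return re.sub(span, keep_or_unwrap, tagged)


text = "Founded by Rudolf Steiner in London."
reply = {
    "Rudolf Steiner": "http://example.org/person/1",
    "London": "http://example.org/place/2",
}
tagged = tag_suggestions(text, reply)
# The editor drags only the person into the access point list.
saved = strip_unselected(tagged, {"http://example.org/person/1"})
```

Here `saved` keeps the span around the selected person and returns "London" to plain text, mirroring the way unselected suggestions are dropped on save.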

Once the record has been saved, the linked data URIs are included in the access points, rendered in the EAD thus:

<persname source="AIM25-UKAT" uri="">Rudolf Steiner</persname>
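For anyone wanting to produce an access point like the one above programmatically, a minimal Python sketch using the standard library's ElementTree might look like this. The helper names and the example.org URI are placeholders of mine; the element and attribute names simply follow the snippet above.

```python
import xml.etree.ElementTree as ET


def access_point(tag, label, uri, source="AIM25-UKAT"):
    """Build one access point element, e.g. <persname>, with its URI."""
    element = ET.Element(tag, {"source": source, "uri": uri})
    element.text = label
    return element


def render_controlaccess(points):
    """Serialise (tag, label, uri) tuples inside a <controlaccess> wrapper."""
    wrapper = ET.Element("controlaccess")
    for tag, label, uri in points:
        wrapper.append(access_point(tag, label, uri))
    return ET.tostring(wrapper, encoding="unicode")


# Placeholder URI, for illustration only.
xml_text = render_controlaccess(
    [("persname", "Rudolf Steiner", "http://example.org/id/rudolf-steiner")]
)
```

Building the element with an XML library rather than string concatenation means labels containing characters such as "&" are escaped correctly.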

They are also embedded in the text of the record as RDFa:

in 1913. It had its origins in the spiritual philosophy of <span property="" resource="">Rudolf Steiner</span> (1861-1925).
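A consumer of the record could pull those embedded links back out again. The following is a rough Python sketch using only the standard library, not a full RDFa processor: it simply collects the text and resource attribute of each span. The class and function names, the property value and the example.org URIs are assumptions for illustration.

```python
from html.parser import HTMLParser


class ResourceSpans(HTMLParser):
    """Collect (text, resource) pairs from <span resource="..."> elements."""

    def __init__(self):
        super().__init__()
        self.found = []
        self._open = []  # one [resource, text parts] entry per open span

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            self._open.append([dict(attrs).get("resource"), []])

    def handle_data(self, data):
        # Text belongs to every span that is currently open.
        for entry in self._open:
            entry[1].append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._open:
            resource, parts = self._open.pop()
            if resource:
                self.found.append(("".join(parts), resource))


def extract_resources(fragment):
    parser = ResourceSpans()
    parser.feed(fragment)
    parser.close()
    return parser.found


# Placeholder property and resource values.
fragment = (
    'the spiritual philosophy of <span property="http://example.org/mentions" '
    'resource="http://example.org/id/rudolf-steiner">Rudolf Steiner</span> '
    "(1861-1925)."
)
links = extract_resources(fragment)
```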