General aims of prototype
- 1) To improve on the usability of the existing offering
  - a) Eliminate the need for archivists to use mark-up
  - b) Integrate the indexing process with the metadata recording process
  - c) Reduce page scrolling and generally improve usability
- 2) To make use of semantic annotation within the tool
  - a) Analyse the textual input against existing authoritative sources, both external and internal
  - b) Suggest and record indexing terms derived from the analysis
  - c) Record the semantic properties of terms
- 3) To use linked data to enhance the user experience of the AIM25 website
  - a) Mark up the semantic properties of indexed terms, both within the ISAD(G) display and within the “Access points” lists
  - b) Provide links to related services based on the semantic properties of the terms
For the prototype we took a snapshot of the AIM25 database and put this on the OMP test server (http://data.aim25.ac.uk).
The access points are displayed on the right. Terms are colour coded according to the four term types:
The prototype workflow uses one external service, OpenCalais, directly to analyse text and suggest useful terms for indexing. AIM25's existing index, which includes the UK Archival Thesaurus, is also interrogated, and an AIM25 text analysis service was developed for this purpose.
For rapid development this service runs boring old SQL on the existing AIM25 data tables, but because there is already a mechanism to transpose this data into RDF (and more), a more robust semantic solution is theoretically a short hop away.
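To give a flavour of what "boring old SQL" can look like for this kind of lookup, here is a minimal sketch in Python using an in-memory SQLite database. It is not the AIM25 service itself: the `terms` table, its columns, the sample rows and the term type labels are all assumptions made up for illustration, and the real schema will differ.

```python
import sqlite3


def build_demo_index():
    """Create an in-memory stand-in for the term index (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE terms (id INTEGER PRIMARY KEY, label TEXT, term_type TEXT)"
    )
    # Sample rows are invented for the example, not taken from AIM25.
    conn.executemany(
        "INSERT INTO terms (label, term_type) VALUES (?, ?)",
        [
            ("Weaving", "subject"),
            ("Textile industry", "subject"),
            ("London", "place"),
        ],
    )
    return conn


def suggest_terms(conn, text):
    """Return (label, term_type) rows whose label occurs anywhere in the text.

    Matching is a case-insensitive substring test done in SQL, so it is crude:
    "London" would also match inside "Londonderry".
    """
    return conn.execute(
        "SELECT label, term_type FROM terms "
        "WHERE instr(lower(?), lower(label)) > 0 "
        "ORDER BY label",
        (text,),
    ).fetchall()


if __name__ == "__main__":
    index = build_demo_index()
    for label, term_type in suggest_terms(index, "Papers on weaving in London"):
        print(label, term_type)
```

A plain substring test keeps the query simple, at the cost of false positives. Anything smarter (word boundaries, stemming, preferred versus non-preferred thesaurus terms) would need more than this sketch shows.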
Archivists can use check-boxes by the side of each textarea in the workflow form to select ISAD(G) elements for analysis. The selected text is sent for analysis by one or both of the services, and the results are displayed in two ways:
- As embedded mark-up in the "textarea"
- As term lists in the Analysis/Indexing area
Above is an example of a list of terms returned from the AIM25 service. The term "Weaving" is in the process of being added as an access point for this record.
Here we see the same results embedded in the text. These are a smaller set as they only include the exact matches. When saved, terms not added to the access points are stripped out. Those that remain can be represented in context as RDFa.
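The embed, strip and save cycle described above could be sketched roughly as follows. This is an illustration under assumptions rather than the prototype's code: the `suggested` class name, both function names and the choice of `dcterms:subject` as the RDFa property are hypothetical, and the prototype may well mark terms up differently.

```python
import re
from html import escape


def embed_matches(text, terms):
    """Wrap whole-word, case-insensitive matches of each term in a span."""
    if not terms:
        return escape(text)
    # Try longer terms first so a multi-word term wins over a shorter one.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in ordered) + r")\b",
        re.IGNORECASE,
    )
    out, pos = [], 0
    for match in pattern.finditer(text):
        out.append(escape(text[pos:match.start()]))
        out.append('<span class="suggested">' + escape(match.group(0)) + "</span>")
        pos = match.end()
    out.append(escape(text[pos:]))
    return "".join(out)


SUGGESTED_SPAN = re.compile(r'<span class="suggested">(.*?)</span>')


def finalise(marked_up, accepted):
    """On save, keep accepted terms as RDFa spans and unwrap all the others."""
    accepted_keys = {escape(term).lower() for term in accepted}

    def replace(match):
        inner = match.group(1)
        if inner.lower() in accepted_keys:
            # The property name is an assumption; any suitable vocabulary would do.
            return '<span property="dcterms:subject">' + inner + "</span>"
        return inner  # suggestion not accepted: strip the mark-up, keep the text

    return SUGGESTED_SPAN.sub(replace, marked_up)


if __name__ == "__main__":
    marked = embed_matches("Records of weaving in London", ["Weaving", "London"])
    print(marked)
    print(finalise(marked, ["Weaving"]))
```

In this sketch only "Weaving" has been added to the access points, so the span around "London" is removed on save while the word itself stays in the text.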
Here the results returned by OpenCalais are embedded, and below they are displayed as a list so that they can be added to the access points. Also below are the results of a direct lookup on the AIM25 service, so that archivists can add access points for terms that do not appear in the text.
Did we achieve any of aim 3? More to follow on this soon...