CIG16 Write Up

Below is a summary of our recent conference on Innovation and Discovery written by Emma Booth from Metadata Services at LSE.  Many thanks to Emma for giving us permission to include this on our blog.

Earlier this month, the biennial conference of the Cataloguing and Indexing Group took place at Swansea University’s Bay Campus, focusing upon metadata innovation and discovery.

The conference demonstrated how libraries, archives and museums are all striving to improve the quality of their metadata in order to enhance resource-discovery for their users. Papers and presentations covered a range of interesting and innovative metadata enrichment and quality-improvement projects, including collaborations between libraries, archives and special collections.

Several of the presentations revealed how refinements in metadata standards and the adoption of Linked Open Data formats such as BIBFRAME are enabling librarians to acquire new skills in metadata creation and manipulation, whilst simultaneously improving the discoverability of library-resources on external systems via the web. This is due to the fact that Linked Open Data standards allow bibliographic metadata to become compatible with web-data standards, and so be indexed by web-based search engines, rather than being hidden away in the library’s local catalogue or repository.

Furthermore, Linked Open Data standards enable users to explore the relationships and links between different works, individuals, events and places, which can open up new avenues for cross-disciplinary research. This means that library collections can expand their discoverability from local to global audiences and have a wider impact upon research and learning communities. As such, Linked Data projects enable an institution to shift towards a more ‘user-centric’ approach to resource discoverability, acknowledging the fact that researchers often choose to use external systems, tools and platforms to search for information, rather than just using a library catalogue.

Throughout the conference there were examples of the fundamental work that cataloguers and metadata librarians are doing on a daily basis in order to ensure that collections are made discoverable and accessible. Many libraries are investing time and staff resources in upgrading their legacy metadata records from old standards, and are steadily FRBRising their library catalogue in order to make its content more discoverable to users.

Many of the papers also expressed the view that, whilst the work of the metadata team is often hidden away from public view, cataloguing and metadata practices and workflows, together with systems and discovery layers, ultimately determine the user experience and, therefore, the user’s impression of a library’s quality. Without good quality, standardised bibliographic metadata it is impossible for a library-user to know what resources are in a library’s collections, whether they are relevant to their research, how they relate to materials they have already accessed, or how to gain physical or electronic access to those resources. In essence, without bibliographic metadata there is no library!

The overall feeling of the conference was that metadata librarianship is in an exciting place, with great opportunities for expansion and innovation opening up through projects involving Linked Data. However, there was a feeling that cataloguers and metadata specialists need to be more vocal advocates for the work that they do, and for the importance of metadata enrichment projects at their institutions as a means of enhancing the user-experience and improving the discoverability of library collections.

Slides, workshop materials and posters from the conference can be found here.

RDA in a Day

We are pleased to announce a repeat of the successful RDA in a Day training course – and this time, we’re offering it in two locations (Sheffield and London) to make it available to a wider audience!

RDA in a Day is a practical introduction to cataloguing with RDA: Resource Description and Access, led by two RDA specialists from the British Library. The course will cover the FRBR model and RDA terminology. This is an interactive, hands-on course in which trainees will learn by using RIMMF and the RDA Toolkit to create RDA records. The day also covers creation of RDA records in MARC 21. Prior cataloguing experience and knowledge of AACR2 and MARC 21 will be an advantage.

Registration
Closing date for bookings: 30 September 2016

Sheffield

Date: Wednesday 12th October 2016
Time: 10.00-17.00 (lunch included)
Venue: The University of Sheffield, Sir Robert Hadfield Building, Mappin Street, Sheffield, S1 3JD.
Cost: £95 + VAT (CIG members), £120 + VAT (non-CIG members)
Closing date for sponsored place applications: 25th September (applicants will be notified by 3rd October)
Closing date for bookings: 30th September
Contact: Emily Bogie (e.berrisford@sheffield.ac.uk, tel: +44 (0)1142 220534)

London – Please note that this event is now fully booked. Contact Nicky to be added to the wait list.

Date: Tuesday 25th October 2016
Time: 10.00-17.00 (lunch included)
Venue: CILIP HQ, 7 Ridgmount Street, London, WC1E 7AE.
Cost: £95 + VAT (CIG members), £120 + VAT (non-CIG members)
Closing date for sponsored place applications: 2nd October (applicants will be notified by 10th October)
Closing date for bookings: 14th October
Contact: Nicky Ransom (nransom2@ucreative.ac.uk, tel: +44 (0)1252 892739)

Sponsored place
We are pleased to be able to offer a sponsored place at each of these events. Applicants must be CIG members (though CILIP membership is not required), and the application (ca. 200 words) should demonstrate why they would like to attend, how they would use their attendance to highlight or promote CIG’s area of interest, and if/why they would not be able to attend without CIG sponsorship. We would like the sponsored delegates to write a report/summary to be publicised on the CIG blog and/or journal.

Please submit your application to Emily Bogie for the Sheffield event or Nicky Ransom for the London event by 25 September 2016.

Applicants will be notified whether they have been successful by 3 October 2016.

Cancellation policy
Please note our cancellation policy: once your place is confirmed, we are unable to arrange refunds if you are subsequently unable to attend the event. Should this circumstance arise we are happy for someone else to attend in your place, but please notify us in advance if at all possible.

CIG Conference 2016 – Feedback

The CIG Conference is over for another year. Although we had a great time, and hope you did too, we are always keen to hear how we could make the experience even better.
If you attended CIG 2016 – Innovation and Discovery – let us know what you thought by taking a quick survey here.
We are also interested in hearing from you if you didn’t attend. Please complete the survey here.

Informed Peer Recognition Award

The Informed Peer Recognition Award is seeking volunteer judges to participate in the inaugural award.  No special experience or knowledge is required, and colleagues from across the information professions are encouraged to apply to take part in the judging process. In particular, they would like to invite participants from the public sector, school, and specialist library areas, to try and ensure a good mix of professional experience and knowledge in the judging teams.
If you are looking for a way to become more involved with professional activities, then this could be an excellent CPD opportunity.  Training and administrative support will be provided to all judges, and more detailed information on the judging process and stages is available here and here.  FAQs on other elements of the process, including planned timescales, are available on the Informed website here.
 
Applications to take part in the judging process can be made via a simple expression of interest sent to nominations@theinformed.org.uk, by the 3rd of October, 2016.
 
LOD2a

Metadata & Linked Data seminar – live blog

Hashtag: #cigslod16. Follow @cigscot.

Welcome one and all to the Cataloguing and Indexing Group in Scotland’s 5th linked data event. We’re in Edinburgh on the 12th of September.

This year we are delighted to welcome speakers from the British Library, the Bibliothèque Nationale de France, the Universities of Edinburgh and St Andrews, the RDA Steering Committee, and the National Library of Scotland.

Speakers will describe the practice and challenges of implementing linked data in a library and information environment, from local pilots, projects and experiments, to national services.  The opportunities and challenges that linked data presents to cataloguers, libraries and the wider information landscape will be explored, with speakers describing their organisation’s experience, as well as providing an insight into national metadata strategies.

Kicking the day off we have Janet Aucock (JA), St Andrews University Library, providing a cataloguer’s linked data perspective.
Follow St Andrews University Library

Janet is treating us to some gorgeous St Andrews pics and a quick description of some of the special collections and archives that St Andrews manages and preserves. The idea of linked data, and the technical infrastructure it needs, could be a barrier: how will it fit within the context of libraries and special collections?

Senior managers are not at the point where they are considering the strategy and how to move linked data forward within the libraries.  JA is thinking about how the discovery service and research data might fit into the practicalities of linked data and how to manage and exploit access to thesis data, enrich the cataloguing space and collections such as rare books.  We need a framework to look at how to link data throughout the institutions.

St Andrews, like most other organisations, has pots of data throughout the organisation: digital collections, photographic collections, a research repository, digitised collections, research publications and a research data system.

JA is considering some use cases:

  • Names and naming authorities might be the area that could best be developed… or ORCID for the living and name authorities for the dead! A method is needed for determining which naming convention to use for which sets of data.
  • Biographical register that also links to geographical data and to borrowing register information.
  • Repository and ChemSpider: text mining of chemistry data to pull out chemical compounds. The authority is not controlled, but there is the possibility of the compound information being fed back to our repository.
  • SAULcat alchemy collections: the records cover provenance, binding, names and subject headings, but this is flat data that is not linked to anything, and nothing can be done with it within the LMS. If there were ways to link this data to other useful data sets, that would be great.

Also looking at what LMS suppliers are considering for linked data, at name authorities and at identifying things at library level. Linked data is now on the suppliers’ horizon, and hopefully this will help push forward the development of linking data in a library setting. There is an opportunity to be had in linking data to what is available out there, and hopefully managers will realise the benefit.

Alasdair MacDonald & Ruby Wilkins, Edinburgh University Library, describing a project to link authorities across local datasets

Follow Edinburgh University Main Library

The library team was asked to put forward ideas for a project for the innovative fund, to look at personal name authorities across catalogues and at how to use these through many data sets in the university. They selected a number of significant people and focused on linking data within the image archives, on historical individuals, and on others who may have links to the university. The first project was the Edinburgh Seven, the first seven women who matriculated to study medicine at the Uni. Also included was James Miranda Barry, who lived as a man, fought more than one duel and was a physician with a good bedside manner. There is some debate around James actually being a woman!

Once the subjects were determined, the team started on a scoping methodology, looking at digital data sets online, Wikipedia, and names across all data sets in the university: the Alma LMS, Vernon (archives database), the Pure research system and the discovery system Primo. The historical figures chosen were then researched to find them in each of the systems, checking against LCNAF, VIAF and ISNI.

Looked at any potential for linking data further in their main catalogues. The system could pull the LCCN, but only at the point of cataloguing the item in hand. Could we restore batch update? But what would you do about differentiated names? There are issues with linking theses to the wrong names, so batch update is not the answer.

URIs can be put into the Vernon system, but only as a reference. ArchivesSpace does allow both EAD and ‘authorised’ names to co-exist. The discovery system Primo can hold URI data, but there are issues in distinguishing whether a person has written something or is the subject of a book. Looking at the potential of using the LCCN as a matching point rather than the authority.

Moving forward, EUL is looking at adding URIs to authority records in Vernon and ArchivesSpace, investigating the use of the $0 subfield in MARC format, and also looking at exporting and editing Luna metadata.
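To make the matching problem concrete, here is a minimal sketch of why a blind batch update on name strings goes wrong and how an LCCN can act as the matching point. It is not the Edinburgh team's method: the authority entries, LCCNs, URIs, record ids and function names below are all invented placeholders, and records are plain dictionaries rather than real MARC.

```python
from collections import defaultdict

# Hypothetical authority file: LCCN -> (heading, URI). All values are placeholders.
AUTHORITIES = {
    "n00000001": ("Example, Alex, 1800-1870", "http://example.org/auth/n00000001"),
    "n00000002": ("Example, Alex, 1901-1980", "http://example.org/auth/n00000002"),
    "n00000003": ("Sample, Jo, 1840-1912", "http://example.org/auth/n00000003"),
}


def index_by_plain_name(authorities):
    """Group LCCNs by name without dates, to expose names shared by several people."""
    index = defaultdict(list)
    for lccn, (heading, _uri) in authorities.items():
        plain = ", ".join(heading.split(", ")[:2]).lower()
        index[plain].append(lccn)
    return index


def add_uri(record, authorities, name_index):
    """Return a copy of the record with a "0" (URI) entry, or unchanged if ambiguous."""
    lccn = record.get("lccn")
    if lccn is None:
        candidates = name_index.get(record["name"].lower(), [])
        # Only trust a name match when exactly one authority carries that name.
        lccn = candidates[0] if len(candidates) == 1 else None
    if lccn not in authorities:
        return dict(record)
    return {**record, "0": authorities[lccn][1]}


records = [
    {"id": "thesis-1", "name": "Example, Alex"},                      # two candidates
    {"id": "image-7", "name": "Example, Alex", "lccn": "n00000002"},  # LCCN known
    {"id": "book-3", "name": "Sample, Jo"},                           # unique name
]
name_index = index_by_plain_name(AUTHORITIES)
enriched = [add_uri(r, AUTHORITIES, name_index) for r in records]
```

The thesis record is left without a URI because two authorities share the name, which is the wrong-person risk described above. The record that already carries an LCCN is linked directly.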

Visit http://images.is.ed.ac.uk/

Alexandra De Pretto, National Library of Scotland, describing experiments with linked data at the national library.

NLS has programmes of digitisation which are increasing and, of course, more resources means more metadata! NLS uses many interfaces, datasets and systems, and hopes to connect anyone to relevant library resources, to enable search over the whole of the library space, and also to look at data beyond the control of the NLS.

NLS doesn’t have a metadata strategy but does use internationally recognised open standards, though possibly not in a consistent way. The NLS has published the metadata element set for its digital objects (the DOD element set) on the Open Metadata Registry.

Alexandra asks: is linked data the solution to enable search over disparate datasets for the NLS? For linked data, every resource described should be identified with its own URI. You can learn more about linked data at Library Juice Academy.

Alex describes how to get started: you need triples and you need URIs, and this can be achieved by looking at your own datasets as a starter. A good example of a large linked data set is DBpedia.

Looking at their images archive, she can begin to determine triples such as who of [resource1], who type [photographer] and who depicted in [woodcutter], working with these RDF triples alongside descriptive metadata held in different schemas, and using linked data technologies to then query and present them. A triples data repository can store and index triples and also provides a means of managing and accessing them with SPARQL. Using SPARQL, the NLS should be able to present results drawn from more than one data set.
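As a rough illustration of the idea (not the NLS implementation), the sketch below holds triples as plain Python tuples and answers one question across two datasets with a simple wildcard pattern match, which is the basic operation a SPARQL query performs over a triple store. Every URI, vocabulary term and function name here is a made-up placeholder.

```python
# Each triple is (subject, predicate, object). All URIs are invented placeholders.
EX = "http://example.org/"

images_dataset = [
    (EX + "image/1", EX + "vocab/photographer", EX + "person/a"),
    (EX + "image/1", EX + "vocab/depicts", EX + "concept/woodcutter"),
    (EX + "image/2", EX + "vocab/photographer", EX + "person/b"),
]

catalogue_dataset = [
    (EX + "book/9", EX + "vocab/author", EX + "person/a"),
    (EX + "person/a", EX + "vocab/name", "Example, Person A."),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def resources_linked_to(person_uri, *datasets):
    """Find every subject, in any dataset, that points at the given person."""
    merged = [t for ds in datasets for t in ds]
    return sorted({t[0] for t in match(merged, o=person_uri)})


linked = resources_linked_to(EX + "person/a", images_dataset, catalogue_dataset)
```

Because both datasets use the same URI for the person, one query finds the photograph and the book together. That shared identifier is what makes searching across separate datasets possible.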

NLS is at the beginning of this experiment: it needs to define vocabularies for datasets, work out URIs, consider the mappings and develop a publishing platform, and teams need the right skills and experience to achieve this. It is worth checking out the Library of Congress Linked Data Service and the following links for further support and guidance:

  • RDA registry
  • RDFS: data-modelling vocabulary for RDF data
  • OWL: Web Ontology Language
  • SKOS: Simple Knowledge Organization System

After our coffee refreshments Torsten is joining us via Skype.

Dr Torsten Reimer, Imperial College London (ICL) will be providing an overview of ORCID and the benefits for global scholarly communication systems.
Follow Torsten Reimer

ORCID offers a unique researcher ID that allows humans and machines to reliably identify the authors of scholarly outputs. Within just a few years ORCID has had rapid uptake with over 2.4m researchers registered globally. Publishers, funders and research institutions are supporting, and in some cases even mandating, the use of ORCID. Torsten is neither a librarian nor a cataloguer (shock!) but works between the researcher and the university space.

🙂 Torsten is discussing the issue with identifiers and how he has been confused with another Torsten Reimer who works within the realms of psychology research, so names are not a useful unique identifier! ORCID provides a persistent digital identifier, offers member integration and connects researchers within an institution. ORCID also provides a hub for machine-readable connections.

ORCID is a not-for-profit, membership-based organisation. Once registered you receive a randomly assigned number, and individuals control their own iDs and profiles. Profiles can include information on works, grants, employment history and publications. Once you publish, you share your ORCID iD with the publisher and can add the iD to the metadata for your content.
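For anyone adding iDs to metadata by hand, a quick sanity check can catch typos. This was not covered in the talk. The sketch assumes the published ORCID iD structure of 16 characters whose last character is an ISO 7064 MOD 11-2 check digit, and uses the sample iD from ORCID's own documentation. The function names are ours.

```python
def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    remainder = total % 11
    result = (12 - remainder) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the shape and check digit of an iD such as 0000-0002-1825-0097."""
    compact = orcid.replace("-", "").upper()
    base = compact[:15]
    # Require exactly 16 characters, the first 15 being plain ASCII digits.
    if len(compact) != 16 or not (base.isascii() and base.isdigit()):
        return False
    return orcid_check_digit(base) == compact[15]


sample_ok = is_valid_orcid("0000-0002-1825-0097")
sample_typo = is_valid_orcid("0000-0002-1825-0098")
```

This only shows that an iD is well formed. It says nothing about whether the iD is registered or belongs to the right person, which still needs a lookup against ORCID itself.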

ORCID is helping ICL to keep track of their data and traffic over the Janet network. Publications tracking is available and helps with data flows between systems; however, some issues still arise with current workflows:

  • Requires academic to login and add sources and articles
  • Authorising of articles not always recognised reliably
  • Pre-publication information would be useful to help document and track

UK funders have specific controls in place to meet policy requirements. Some of the requirements can be helped using services such as the Jisc Publications Router https://pubrouter.jisc.ac.uk/about/institutions/, which can link via the iD to the CRIS and CrossRef, and which shares the ORCID iD with the publisher.

Tracking research data can use similar workflows using ORCID, by sharing the iD with a repository or embedding it within the content. ICL started a project in 2014 to raise awareness of ORCID and to encourage academics to self-register, update their profiles and continue to manage their iDs.

The ORCID project identified 764 existing iDs linked to College staff and created 3,226 new ones. ORCID is becoming the new research identifier although not all the systems are ready or integrated.
ORCID can improve interoperability and aid the transfer of information about researchers and their outputs when they move organisation.

Read more about some of the work that Imperial College London has completed looking at ORCID.

https://spiral.imperial.ac.uk/bitstream/10044/1/19271/2/Imperial%20College%20ORCID%20project.pdf

https://repository.jisc.ac.uk/5876/1/Imperial_College_ORCID_project.pdf

Visit the Jisc ORCID consortium at https://www.jisc.ac.uk/orcid

Alan Danskin (AD), British Library, describing linked data initiatives and BL metadata strategy
Follow Alan Danskin

BL has created its metadata strategy. AD reckons the future is bright for metadata, and asks where linked data sits in the vision for the British Library. BL has three main sites.

The British Library Act of 1972 records the BL’s role as the national centre for bibliographic and information services. The BL metadata service originally offered priced services and evolved through many technologies; it began to offer open data in 2010, when the BNB was made available as linked data. 2015 saw the publication of the first metadata strategy for the British Library.

Many challenges for the BL but in 2013 regulations changed so that BL can now collect digitally formatted content. 100,000 new printed books received by legal deposit compared to 50,000 electronic books coming into legal deposit from about 10 publishers.  A lot of the content received is back catalogue content and not just a UK imprint but international. The challenge is how to catalogue such large amounts of content.

There are some challenges to contend with, such as hidden metadata, obsolete formats, printed catalogues and legacy metadata from catalogues that have not been digitised anywhere. Legacy metadata challenges are about data having been recorded in ways that are not necessarily easy to translate into machine-readable and linked data; examples are the publisher details and the language of the content.

People are now interested in the ‘bigger picture’ questions such as discovery and research of collections development. Being able to facet this by language or country would be useful, but it is not possible to do this with legacy data.

Another challenge in legacy data is the silos within the organisation between MARC and the Aleph LMS, archives and manuscripts using IAMS, XML variants such as ETOC & AMED, sound and recorded sound archives using internal SAMIMARC, and web content using Dublin Core.

So these are some of the issues that the BL would like to address and have championed progression by showing staff what they could and couldn’t do without metadata and a strategy.

Collection metadata identifies attributes & relationships, location & availability and the status & rights that allow you access to content. It requires stewardship and leadership ensuring its preservation and continued management over time and to achieve this it requires resourcing which can aid efficiency and improve services.

The BL has put in place a structure for how metadata is used and managed within the library: a member of senior management who ensures the metadata strategy is being delivered, an advisory group that supports work towards achieving the strategy, and a working group that can flag changes and updates to the metadata in use and can review and agree anything that is being proposed.

BL has looked at business cases and representation of that metadata and how it can be used within the organisation and there is now a Head of Collection metadata who has overarching responsibility for metadata developments.

BNB has 3.7m entries for UK books; it’s reusable and open as it has a permissive CC0 licence. BL is currently looking at going out to tender for a new open data platform. Over 1,500 users are using BL open metadata, and the BL hopes to increase access to and reuse of this. BL is hoping to break down silos, converge the standards and exploit synergies with other data sets. Linked data is a potential solution but, for many, not an objective.

Check out the data services of the BL at:

http://www.bl.uk/bibliographic/datafree.html

http://bnb.data.bl.uk/

Metadata strategy 2015-2018

https://www.bl.uk/bibliographic/pdfs/british-library-collection-metadata-strategy-2015-2018.pdf

Mélanie Roche, Bibliothèque Nationale de France, describing linked data initiatives at the BnF.
Follow Melanie Roche

The National Library of France has successfully developed linked data applications that have received worldwide consideration. Melanie was inspired by a presentation called ‘Let’s make it happen, linked data in libraries’, which left her feeling energised by its call to arms for librarians, but she has since felt a little disappointed that this has possibly not been achieved as much as she’d hoped, with most initiatives still at project level.

BnF has a main catalogue of almost 19 million records dealing with general collections, and a separate manuscripts database. BnF has linked authority files and bibliographic info back to 1975 that can aid linking between the main catalogue and the archives database.

The BnF wanted to give users the opportunity not to have to come to the catalogue to search for content, but instead to use the data.bnf.fr service to find all information and content held in both the main and archives collections. BnF used an algorithm to bring together all data from the digital library and the main catalogue for any given controlled authority.

BnF are using these algorithms to automate and to help them FRBR-ise their catalogue. They can automatically generate work records for the open data site http://data.bnf.fr, and that data can also be fed back into the catalogue, generating over 100,000 records.
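The BnF algorithm itself was not described in detail, so the following is only a naive sketch of the general idea of FRBR-ising: grouping manifestation-level records that share an author authority and a normalised title into candidate work records. The record structure, identifiers and function names are invented for illustration.

```python
import re
from collections import defaultdict


def normalise_title(title: str) -> str:
    """Lower-case, strip punctuation and collapse whitespace for comparison."""
    cleaned = re.sub(r"[^\w\s]", " ", title.lower())
    return " ".join(cleaned.split())


def cluster_into_works(manifestations):
    """Group manifestation records that share an author authority and title."""
    works = defaultdict(list)
    for record in manifestations:
        key = (record["author_id"], normalise_title(record["title"]))
        works[key].append(record["id"])
    return dict(works)


# Invented sample records; the author ids stand in for controlled authorities.
manifestations = [
    {"id": "m1", "author_id": "auth:1", "title": "Les Misérables"},
    {"id": "m2", "author_id": "auth:1", "title": "Les misérables."},
    {"id": "m3", "author_id": "auth:1", "title": "Notre-Dame de Paris"},
    {"id": "m4", "author_id": "auth:2", "title": "Les Misérables"},
]
works = cluster_into_works(manifestations)
```

Keying on the authority identifier rather than the author's name string is what keeps m4 out of the first cluster. A production process would also need to handle translations, variant titles and records with no authority link.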

The other area that BnF is working on is a triple store called SPAR (Scalable Preservation and Archiving Repository), for long-term preservation of digitally native documents. SPAR is a modular OAIS-compliant repository. The data is stored in an RDF format so that librarians, rather than an IT department, continue to curate it.

Another linked data project is the Doremus project, which covers open data for music material. It is currently in the modelling phase, looking at the model for music data using RDF. BnF hope to use all of these projects to help develop a nationally facing open data house, and are considering many other types of format and content. For further information, read Doremus: aligning value vocabularies.

Melanie discusses whether we should upgrade MARC to accommodate open data, as it’s not currently fit for purpose… MARC is dead… long live MARC! 🙂

For further info visit http://data.bnf.fr/about

Gordon Dunsire, RDA Steering Committee, describing the work carried out within the RDA community.

Follow Gordon Dunsire
rscchair@rdatoolkit.org

Gordon is using examples of RDA data from the RDA Toolkit to show how it is transformed into linked data, and discusses the benefits for users. For more of his work or other presentations visit http://www.gordondunsire.com/presentations.htm

http://www.rda-rsc.org/

RDA Toolkit http://www.rdatoolkit.org/

All examples of layout are available at the RDA Registry http://www.rdaregistry.info/
and others are available in the rballs service (RDA data, Jane-athons, etc.) http://www.rballs.info/

RIMMF (RDA in Many Metadata Formats) is a free service available to create your own RDA sets and allows you to view the WEM relationship of the full record. RDA doesn’t have an element for authorised access point, it expects other data to express this.

Example data sources: http://www.rdatoolkit.org/sites/default/files/rsc_rda_complete_examples_bibliographic_april2016.pdf

http://www.rdaregistry.info/Examples/exRSCFullScore.html

http://rballs.info/topics/m/rdaex/rdaexScore.html

Our live blogger had to leave before the end of Gordon’s presentation, and the open discussion.  The day ended with a lively chat regarding the way forward with linked data in libraries, and how we can move from experiments and projects to more fully fledged services and infrastructure, with national libraries and other bodies needing to fulfil leadership and enabling roles.

Audience feedback was very positive, with rapturous applause for speakers, and the discussion carried on in a nearby hostelry, where current and future solutions and ideas for future events were mulled over.  All in all, a very successful event!

(ORCID remote presentation from Dr Torsten Reimer – which went without a glitch!)

C&I Call for Papers

Issue 184 (Sept) will feature current and recent research on cataloguing and metadata issues.  If you have conducted some research and would like to share your findings, or perhaps have recently completed a research degree or project and would like to summarise your methods and results, we will be pleased to hear from you.  We welcome contributions from students as well as practitioners.

Articles should be about 1500-2000 words, and the final date of submission is 12th September 2016. Please see our website for further details or contact the editors, Helen Garner and Karen Pierce, for more information: H.J.Garner@shu.ac.uk and PierceKF@Cardiff.ac.uk