Building a Productive Service for Automated Subject Indexing at ZBW

A Status and Experience Report

DOI:

https://doi.org/10.5282/o-bib/5903

Keywords:

Subject indexing, Automation, Machine learning, Metadata, IT infrastructure, Human resources, Human in the loop

Abstract

Since 2016, ZBW – Leibniz Information Centre for Economics has been conducting its own research in machine learning with the goal of developing viable in-house solutions for automated or machine-assisted subject indexing. In 2020, a team at ZBW started designing and implementing a suitable software architecture in order to transfer these prototypical solutions into a productive service and to integrate that service into the existing metadata systems and workflows. Both the applied research and the software development necessary for this endeavour (dubbed “AutoSE”) are carried out by an organizational unit of ZBW’s library department. The work is continually advanced in line with the state of the art and benefits from close communication with the staff responsible for intellectual subject indexing. This article reports on the milestones that the AutoSE team has reached over the last two years with respect to the implementation and integration of the software, and outlines those that are still to be delivered by the end of the pilot phase (2024). The architecture is based on open source software, and its machine-learning-based components are developed in close communication with the National Library of Finland (NLF) and, where possible, adapted for integration into NLF’s open source toolkit Annif. The operating model of the AutoSE service includes periodic reviews of individual components and of the productive workflow as a whole, and allows for continuous improvement of the architecture. One of the results to be delivered by the end of the pilot phase is documentation of the requirements for running the productive service on a permanent basis, so that the necessary resources can be secured. This practical example shows which conditions an institution has to meet in order to successfully use machine learning solutions such as the ones offered in Annif for subject indexing.

Published

2023-02-28

Section

Conference proceedings

How to Cite

Aufbau eines produktiven Dienstes für die automatisierte Inhaltserschließung an der ZBW: Ein Status- und Erfahrungsbericht. (2023). O-Bib. Das Offene Bibliotheksjournal Herausgeber VDB, 10(1), 1-13. https://doi.org/10.5282/o-bib/5903