open imageSearch

Artificial intelligence for art collections

Together with the Collection of Prints and Drawings ETH Zurich, the ETH Library Lab is developing the application “open imageSearch” for content-based image searches. Positioned at the intersection of artificial intelligence and digital art history, the app aims to make the cataloguing and digitisation of graphic prints more efficient, while also promoting the publication of research data relevant to art history.

Background
Art history and visual studies are not possible without continued basic research, especially in the age of global art history. In concrete terms, this means that indexing works according to author, date and pictorial theme is crucial for interpreting their content.
As several prints are usually made from one printing plate, cataloguing has a significant advantage here: once scientifically verified metadata has been created for an object, it is also valid for further prints of the same plate and can therefore be adopted by other collections. This process requires the metadata for the works to be easily accessible in the corresponding collection catalogues. But as long as the search options there are limited to text and character fields, it is difficult or even impossible to find prints of works that have not yet been clearly identified or that have few specific identifiers. Therefore, an automated, content-based image search represents a valuable alternative as an additional search method.
As the number of digitised records in collections grows and institutions share more useful data online, the probability of identifying an unknown work rises considerably. In the long term, these sources can be used for indexing unprocessed holdings.

Objectives
The open imageSearch application is being developed to assist museums, collections and research institutions that own graphic prints and want to digitise their holdings. Using a content-based image search, the application will return similar works from the partner institutions’ collections whose metadata has already been scientifically assessed, so that the search results are reliable. It can thus contribute to the partial automation of indexing and digitisation processes in art collections.

Using the open imageSearch web app to find metadata and related images in a series (B.Sunderland, CC BY-SA 4.0)

The web app does not require installation and runs in your web browser. To start a search, you upload a picture of the work you are looking for as the search query. The responsive user interface works on both mobile and stationary devices, so users can also photograph a graphic print directly on site with their smartphones and search with that photo.

Behind the scenes, a convolutional neural network (CNN) evaluates the image and a k-nearest neighbours (KNN) algorithm checks its similarity to already known works. Works displayed in the search results can be identified by the accompanying summary metadata. Each search result links to the detailed information for the work online, which is provided by the owners or by a subject-specific search portal. This gives users access to the further information and reliable source references needed for cataloguing, and allows research work that has already been performed to be reused more efficiently. In addition, users can easily share interesting search queries via a permanent link to make them accessible to others or to save them for later use. This focus on metadata access and exchange differentiates open imageSearch from similar projects working in this area and makes it a useful addition to them.
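The similarity step described above can be illustrated with a minimal Python sketch. It is not the project's actual implementation: the `Record` class, the example titles and URLs, the three-dimensional feature vectors and the use of cosine similarity are all assumptions made for illustration. In a real system the feature vectors would be produced by the CNN and would have far more dimensions; here they are small hand-written stand-ins so that the nearest-neighbour lookup can be run on its own.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Record:
    """A catalogued print: summary metadata plus a feature vector."""
    title: str             # hypothetical summary metadata
    url: str               # hypothetical link to the owner's detailed record
    features: List[float]  # stand-in for an embedding computed by a CNN


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_records(query: List[float], records: List[Record],
                    k: int = 3) -> List[Tuple[float, Record]]:
    """Return the k records most similar to the query, best match first."""
    scored = [(cosine_similarity(query, r.features), r) for r in records]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Assumed example data: three known works with made-up feature vectors.
catalogue = [
    Record("Print A", "https://example.org/a", [0.9, 0.1, 0.0]),
    Record("Print B", "https://example.org/b", [0.0, 1.0, 0.2]),
    Record("Print C", "https://example.org/c", [0.8, 0.2, 0.1]),
]

# Assumed feature vector of the uploaded query image.
query_features = [1.0, 0.1, 0.0]

results = nearest_records(query_features, catalogue, k=2)
for score, record in results:
    print(f"{score:.3f}  {record.title}  {record.url}")
```

Each result carries its summary metadata and a link, which mirrors how a search hit in the app leads on to the detailed record held by the owning institution. A brute-force comparison like this is only practical for small catalogues; larger collections would typically rely on an indexed nearest-neighbour search.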

Overview of the image retrieval process (print: Paul Klee (1879–1940), Rechnender Greis, 1928, https://doi.org/10.16903/ethz-grs-D_001318, animation: B.Sunderland, CC BY-SA 4.0)

To achieve the highest possible success rate for a query, the integration of further datasets is planned. This will improve the precision of the algorithm and continuously increase the benefits of metadata and image exchange. Thanks to an automated data processing pipeline, a large number of digital images can be added to the application in a very short time. In this way, collections can easily contribute their records to the network and quickly begin to help themselves and other groups save the time, staff and financial resources associated with identifying and cataloguing works. We are currently planning a collaboration with the “Deutsches Dokumentationszentrum für Kunstgeschichte – Bildarchiv Foto Marburg” (German Documentation Centre for Art History – Marburg Photo Image Archive) (graphics portal) as well as other institutions.

Project Duration

1 January 2021 to 31 December 2021

Project Team

Mengqi Wang

Master's student in Data Science, University of Zurich

Ann-Kathrin Seyffer

Team Coordinator Collection Online, Graphische Sammlung ETH Zürich

Barry Sunderland

Technical Engineer ETH Library Lab, Alumnus Innovator Fellowship Program

Collaborators

Collection of Prints and Drawings ETH Zurich

References

[1] © Francine Mury (b. 1947), Ohne Titel [Zweige], sheet from “Il Giardino di Livia”, 2009 [detail, edited]
https://doi.org/10.16903/ethz-grs-2009_0079_19