The International Image Interoperability Framework (IIIF) 2017 Conference in the Vatican is intended for a wide range of participants and interested parties, including digital image repository managers, content curators, software developers, scholars, and administrators at libraries, museums, cultural heritage institutions, software firms, and other organizations working with digital images and audiovisual materials. The conference will consist of two events with separate registration:
IIIF Conference, June 7-9 (three days of plenary and parallel sessions). The pre-conference Mirador Viewer and Universal Viewer group meetings will take place on Monday, June 5, prior to the Showcase event and conference.
Google has revolutionized the world of large-scale digital libraries by providing page-level content analysis through optical character recognition, giving users the ability to search for and retrieve individual page images. However, the Google approach, like similar efforts, has required direct access to and hosting of the digital images, which in turn demands expensive data storage and management solutions. In this talk we will describe our efforts to build distributed document image analysis systems. This approach uses the IIIF Image API to access and retrieve remote images from several institutions, then applies document recognition techniques, such as optical character recognition, to them. The results of these processes are stored in search and retrieval systems, with the IIIF Image URL serving as the locator for each image. These retrieval systems can then perform content-based search across the collections, and the IIIF Image URL is used to retrieve the original image from the host institution. With this approach, the intellectual contents of image collections across institutions can be brought together in purpose-built retrieval systems. Use cases include optical character recognition, but could also include manuscript illumination identification, scribal identification, woodcut similarity, or any number of other image analysis techniques. Using distributed document image recognition, bespoke search engines for institutional collections can be built and supported by third parties, giving broader cross-institutional content-level access to collections.
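The workflow described in the abstract can be illustrated with a minimal sketch. It is not the speakers' implementation: the class name `DistributedIndex`, the `recognize` callable, and the `example.org` hosts are all hypothetical, and the recognizer is injected so that any OCR engine (or other image analysis step) could be substituted. The URL builder follows the IIIF Image API 2.x pattern `{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}`; note that Image API 3.0 uses `max` rather than `full` for the size parameter.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set
from urllib.parse import quote


def iiif_image_url(base: str, identifier: str, region: str = "full",
                   size: str = "full", rotation: str = "0",
                   quality: str = "default", fmt: str = "jpg") -> str:
    """Build an Image API 2.x request URL for an image on a remote server."""
    # Identifiers must be URI-encoded, including any slashes they contain.
    ident = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"


class DistributedIndex:
    """Toy inverted index: recognized word -> IIIF Image URLs (the locators).

    Only the recognized text and the URL are stored; the image itself
    stays with the host institution.
    """

    def __init__(self, recognize: Callable[[str], str]) -> None:
        # `recognize` is an assumed hook: given an image URL, it fetches the
        # image and returns recognized text (e.g. by calling an OCR engine).
        self._recognize = recognize
        self._postings: Dict[str, Set[str]] = defaultdict(set)

    def add(self, image_url: str) -> None:
        """Run recognition on one remote image and index the result."""
        for token in self._recognize(image_url).lower().split():
            self._postings[token].add(image_url)

    def search(self, query: str) -> List[str]:
        """Return URLs of images whose text contains every query word."""
        tokens = query.lower().split()
        if not tokens:
            return []
        hits = set.intersection(*(self._postings.get(t, set()) for t in tokens))
        return sorted(hits)


# Usage with a stand-in recognizer (no network or OCR engine required).
page_a = iiif_image_url("https://library-a.example.org/iiif", "ms-12/f001")
page_b = iiif_image_url("https://museum-b.example.org/iiif", "codex7_p3")
fake_text = {page_a: "In principio erat verbum", page_b: "Initium evangelii"}

index = DistributedIndex(recognize=fake_text.__getitem__)
for url in (page_a, page_b):
    index.add(url)

for hit in index.search("verbum"):
    print(hit)  # each hit is the URL to request from the host institution
```

Because the index keeps only text and locators, the same structure could hold the output of other analyses mentioned above, such as scribal or illumination identification, by swapping the recognizer.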