OCR Report
Date
07/2023
Author
Charlton, Ash
Abstract
Optical Character Recognition (OCR) is the most widely known method of extracting text from digitised documents in the cultural heritage sector. It transforms images of text into a machine-readable format. Traditionally, OCR software analyses a scanned image and identifies letters individually, recognising one character at a time. More recent advancements introduce aspects of machine learning into OCR, which changes this dynamic slightly; these are explored in more detail later in this report. The report surveys OCR software options broadly, as well as past and current OCR processes and workflows and proposed future ones that the University of Edinburgh library may introduce.