Transforming Humanities Texts into Data: Optical Character Recognition (OCR) for Starters

October 03
2-3pm
Online

Hosted by NYU Libraries

This workshop is intended for researchers who want to analyze literary or historical texts in print format but first need to create a computer-readable version of the text using Optical Character Recognition (OCR). The class will demonstrate the essentials of this process, with hands-on exercises using free, open-source software (Tesseract). It is ideal for beginners, particularly those interested in the Digital Humanities. Topics covered will include photographic capture for OCR, selecting the best OCR software for a variety of applications, how to use OCR software, and special topics such as non-English languages. Participants will need a laptop with a basic web browser to take part in the hands-on portion.

Registration is required. There are 90 seats available.