GLOBALISE Ground Truth for Handwritten Text and Layout Recognition

This dataset contains Ground Truth PageXML files that were used to finetune the GLOBALISE Handwritten Text Recognition, baseline detection and region detection models (see Related Publications). This collection includes a datasheet with comprehensive details about the motivation for creating this dataset, the files it comprises, and their potential uses. Additionally, it contains guidelines for creating text region Ground Truth. The transcription Ground Truth files were created in accordance with the guidelines of the Dutch National Archives.

Registration