GLOBALISE Ground Truth for Handwritten Text and Layout Recognition

This dataset contains the Ground Truth PageXML files used to fine-tune the GLOBALISE handwritten text recognition, baseline detection, and region detection models (see Related Publications). The collection includes a datasheet that describes the motivation for creating the dataset, the files it comprises, and their potential uses, as well as guidelines for creating text region Ground Truth. The transcription Ground Truth files were created in accordance with the guidelines of the Dutch National Archives.
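As a rough illustration of what a PageXML Ground Truth file carries for the three tasks named above (regions, baselines, transcriptions), the sketch below reads text regions, baselines, and line transcriptions from one page. It assumes the files follow the common PAGE layout (`TextRegion` > `TextLine` > `Baseline` / `TextEquiv` / `Unicode`); the function name `read_page` and the `SAMPLE` page are made up for the example and are not taken from the dataset.

```python
import xml.etree.ElementTree as ET

# Invented one-line page, for illustration only (not from the dataset).
SAMPLE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="example.jpg" imageWidth="100" imageHeight="100">
    <TextRegion id="r1">
      <Coords points="0,0 100,0 100,50 0,50"/>
      <TextLine id="r1l1">
        <Baseline points="5,40 95,40"/>
        <TextEquiv><Unicode>example line text</Unicode></TextEquiv>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>"""


def local(tag):
    """Strip the XML namespace so the code works across PAGE schema versions."""
    return tag.rsplit("}", 1)[-1]


def read_page(xml_text):
    """Return one dict per TextLine: region id, line id, baseline points, text."""
    root = ET.fromstring(xml_text)
    rows = []
    for region in root.iter():
        if local(region.tag) != "TextRegion":
            continue
        # Only direct children, so lines of nested regions are not counted twice.
        for line in region:
            if local(line.tag) != "TextLine":
                continue
            baseline, text = None, None
            for child in line:
                name = local(child.tag)
                if name == "Baseline":
                    baseline = child.get("points")
                elif name == "TextEquiv":
                    for item in child:
                        if local(item.tag) == "Unicode":
                            text = item.text or ""
            rows.append({
                "region": region.get("id"),
                "line": line.get("id"),
                "baseline": baseline,
                "text": text,
            })
    return rows


for row in read_page(SAMPLE):
    print(row)
```

Matching on local tag names rather than a fixed namespace is a deliberate choice here, since the schema version used by the files is not stated in this record.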

Publisher
GLOBALISE project
Temporal coverage
1610/1796
Languages
nl, en
Issued
2 May 2024
Modified
2 May 2024

Distributions

application/zip
Validation_General_Missives_B1_V_0_5_(17-03-2023).zip
https://datasets.iisg.amsterdam/api/access/datafile/26244
File size: 96.0 KB
application/zip
Training_General_Missives_B1_0_5_(17-03-2023).zip
https://datasets.iisg.amsterdam/api/access/datafile/26239
File size: 439.9 KB
application/zip
Training_Regions_Standard_Layout_B4_V3.zip
https://datasets.iisg.amsterdam/api/access/datafile/33326
File size: 499.7 KB
application/zip
Training_Regions_1001_B5_V1_26-6-23.zip
https://datasets.iisg.amsterdam/api/access/datafile/26247
File size: 802.3 KB
application/pdf
Guidelines_Text_Region_GT.pdf
https://datasets.iisg.amsterdam/api/access/datafile/33327
File size: 895.9 KB
application/zip
Training_Limited2_B2_v_1_1_(17-3-2023).zip
https://datasets.iisg.amsterdam/api/access/datafile/26241
File size: 511.3 KB
application/zip
Training_Baselines_1-1500_B6_V1_03-07-23.zip
https://datasets.iisg.amsterdam/api/access/datafile/26243
File size: 130.2 KB
application/zip
Validation_All_Random_B2_v1_1_(17-3-2023).zip
https://datasets.iisg.amsterdam/api/access/datafile/26246
File size: 411.5 KB
application/pdf
Datasheet.pdf
https://datasets.iisg.amsterdam/api/access/datafile/33328
File size: 171.8 KB
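The archives listed above can be fetched directly from their access URLs. The following is a minimal sketch, assuming each URL returns the zip file itself without authentication and that the PageXML files inside carry an `.xml` extension; neither point is confirmed by this record, and the helper names `fetch` and `xml_members` are hypothetical.

```python
import io
import urllib.request
import zipfile


def fetch(url, timeout=60):
    """Download a distribution and return its raw bytes (assumes open access)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def xml_members(zip_bytes):
    """List the .xml members of a zip archive held in memory, sorted by name."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.lower().endswith(".xml")  # assumption: PageXML files end in .xml
        )
```

For example, `xml_members(fetch("https://datasets.iisg.amsterdam/api/access/datafile/26244"))` would list the XML files in the smallest validation archive, if the assumptions above hold.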

Registration

Registered on
24 July 2024 at 09:04
Last read
19 hours ago
Quality score
100%