GLOBALISE Ground Truth for Handwritten Text and Layout Recognition
This dataset contains the Ground Truth PageXML files that were used to fine-tune the GLOBALISE Handwritten Text Recognition, baseline detection, and region detection models (see Related Publications). The collection includes a datasheet with comprehensive details about the motivation for creating the dataset, the files it comprises, and their potential uses. It also contains guidelines for creating text region Ground Truth. The transcription Ground Truth files were created in accordance with the guidelines of the Dutch National Archives.
- Publisher
- GLOBALISE project
- Creator
- GLOBALISE project
- License
- CC BY-SA 4.0 International
- Temporal Coverage
- 1610/1796
- Languages
- nl, en
- Keywords
- handwriting recognition, ground truth
- Issued
- May 2, 2024
- Modified
- May 2, 2024
Distributions
Distributions provide access to the data.
application/zip
Validation_General_Missives_B1_V_0_5_(17-03-2023).zip
https://datasets.iisg.amsterdam/api/access/datafile/26244 File Size: 96.0 KB
application/zip
Training_General_Missives_B1_0_5_(17-03-2023).zip
https://datasets.iisg.amsterdam/api/access/datafile/26239 File Size: 439.9 KB
application/zip
Training_Regions_Standard_Layout_B4_V3.zip
https://datasets.iisg.amsterdam/api/access/datafile/33326 File Size: 499.7 KB
application/zip
Training_Regions_1001_B5_V1_26-6-23.zip
https://datasets.iisg.amsterdam/api/access/datafile/26247 File Size: 802.3 KB
application/pdf
Guidelines_Text_Region_GT.pdf
https://datasets.iisg.amsterdam/api/access/datafile/33327 File Size: 895.9 KB
application/zip
Training_Limited2_B2_v_1_1_(17-3-2023).zip
https://datasets.iisg.amsterdam/api/access/datafile/26241 File Size: 511.3 KB
application/zip
Training_Baselines_1-1500_B6_V1_03-07-23.zip
https://datasets.iisg.amsterdam/api/access/datafile/26243 File Size: 130.2 KB
application/zip
Validation_All_Random_B2_v1_1_(17-3-2023).zip
https://datasets.iisg.amsterdam/api/access/datafile/26246 File Size: 411.5 KB
application/pdf
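The ZIP distributions listed above are described as containing Ground Truth PageXML files. The following is a minimal sketch of how the transcribed lines in such an archive might be read, assuming the archive members are PAGE XML files with an `.xml` extension and that each `TextLine` carries its transcription in a `TextEquiv/Unicode` child. The internal layout of the archives is not documented on this page, and the helper names `local_name` and `iter_text_lines` are hypothetical. Tags are matched by local name because the PAGE schema namespace differs between schema versions.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Reduce a namespaced tag such as '{http://...}TextLine' to 'TextLine'."""
    return tag.rsplit("}", 1)[-1]


def iter_text_lines(zip_bytes: bytes):
    """Yield (member_name, line_id, text) for each TextLine in each .xml member.

    Assumption: members are PAGE XML files; other members are skipped.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in sorted(archive.namelist()):
            if not name.lower().endswith(".xml"):
                continue
            root = ET.fromstring(archive.read(name))
            for element in root.iter():
                if local_name(element.tag) != "TextLine":
                    continue
                text = ""
                # Inspect only direct children, so the TextEquiv of nested
                # Word elements is not mistaken for the line transcription.
                for child in element:
                    if local_name(child.tag) != "TextEquiv":
                        continue
                    for grandchild in child:
                        if local_name(grandchild.tag) == "Unicode":
                            text = grandchild.text or ""
                yield name, element.get("id"), text
```

The function takes the raw bytes of an archive, for example the contents of a downloaded ZIP file opened in binary mode, and yields an empty string for lines that have no transcription, as may be the case in layout-only Ground Truth.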
Registration
- Registered URL (URL where the dataset is described)
- Registered on (date on which the URL was registered with the Dataset Register)
- July 24, 2024 at 09:04 AM
- Last read (when the URL was last read by the Dataset Register)
- 20 hours ago
- Quality Rating (indicates how complete the dataset description is, based on the metadata properties provided, such as title, description, license, and publisher)
- 100%