Overview
I created a program that takes an IIIF manifest file and generates a TEI/XML file containing OCR results for the images it references. This article explains how to use it.

How It Works
Given the URL of an IIIF manifest file, the program creates a TEI/XML file containing the OCR results produced by NDL Kotenseki OCR-Lite:
https://github.com/ndl-lab/ndlkotenocr-lite
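
As a rough illustration of the first step, the sketch below reads an IIIF manifest and collects one image URL per canvas, which is the input an OCR engine would need. It is not the notebook's actual code: the function names load_manifest and canvas_image_urls are hypothetical, and the sketch assumes a manifest that follows the IIIF Presentation API 2.x or 3.0 layout with one painted image per canvas.

```python
import json
from urllib.request import urlopen


def load_manifest(manifest_url: str) -> dict:
    """Download a IIIF manifest and parse it as JSON."""
    with urlopen(manifest_url) as response:
        return json.load(response)


def canvas_image_urls(manifest: dict) -> list:
    """Return the first image URL of every canvas, in reading order.

    Assumption: each canvas carries at least one image.
    """
    urls = []
    if "sequences" in manifest:
        # Presentation API 2.x: sequences -> canvases -> images -> resource
        for canvas in manifest["sequences"][0]["canvases"]:
            urls.append(canvas["images"][0]["resource"]["@id"])
    else:
        # Presentation API 3.0: items (canvases) -> annotation pages -> annotations
        for canvas in manifest.get("items", []):
            annotation = canvas["items"][0]["items"][0]
            urls.append(annotation["body"]["id"])
    return urls


# Usage (requires network access; replace the URL with a real manifest):
# manifest = load_manifest("https://example.org/iiif/manifest.json")
# for url in canvas_image_urls(manifest):
#     print(url)
```

Each returned URL would then be downloaded and passed to the OCR engine page by page.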

Usage
Access the following notebook:
https://colab.research.google.com/github/nakamura196/000_tools/blob/main/IIIFマニフェストファイルからTEI_XMLファイルを作成するプログラム.ipynb
Then run the first cell by pressing its play button.

Once that cell has finished, set the manifest_url and output_dir values in the “Execute” section to your own manifest URL and output directory, then run that cell.
The TEI/XML file containing the OCR results is written to output_dir.

Output Example
The output is a TEI/XML file containing the OCR results for each page and each line.
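
The notebook's exact markup is not reproduced here. As a hedged sketch of what a per-page, per-line TEI structure could look like, the code below builds a small TEI tree with Python's standard xml.etree.ElementTree. The function name build_tei, the choice of pb, ab, and lb elements, and the minimal header are all assumptions for illustration, and the result has not been validated against the TEI schema.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def q(tag: str) -> str:
    """Qualify a tag name with the TEI namespace."""
    return f"{{{TEI_NS}}}{tag}"


def build_tei(title: str, pages: list) -> ET.ElementTree:
    """Build a TEI tree from OCR text.

    pages is a list of (image_url, [line_text, ...]) tuples, one per page.
    """
    ET.register_namespace("", TEI_NS)  # serialize TEI as the default namespace
    tei = ET.Element(q("TEI"))

    # Minimal header holding only the title (a full TEI header needs more).
    header = ET.SubElement(tei, q("teiHeader"))
    file_desc = ET.SubElement(header, q("fileDesc"))
    title_stmt = ET.SubElement(file_desc, q("titleStmt"))
    ET.SubElement(title_stmt, q("title")).text = title

    body = ET.SubElement(ET.SubElement(tei, q("text")), q("body"))
    for number, (image_url, lines) in enumerate(pages, start=1):
        # One page break per image, pointing back to the source image.
        ET.SubElement(body, q("pb"), {"n": str(number), "facs": image_url})
        block = ET.SubElement(body, q("ab"))
        for line in lines:
            # Each OCR line follows a line-break milestone.
            ET.SubElement(block, q("lb")).tail = line
    return ET.ElementTree(tei)


# Usage with placeholder data:
# tree = build_tei("Sample", [("https://example.org/p1.jpg", ["line one", "line two"])])
# tree.write("output.xml", encoding="utf-8", xml_declaration=True)
```

Recording each page as a pb element with a facs attribute keeps the link between the transcription and the IIIF image it came from.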

Summary
The program may still have some rough edges, but I hope it serves as a useful reference.