Overview

The official description of NDLTSR (NDL Table Structure Recognition) is as follows.

A program for recognizing the structure of tables in document images has been released. Combined with OCR text data that includes coordinates, it can be used to structure the text contained in tables. Reference: "Addition of new functionality (table structuring) to the Next Generation Digital Library and publication of source code and dataset for the new functionality." The program can infer table structures using a machine learning model trained on the NDLTableSet published by the National Diet Library, and it can also be retrained on user-provided datasets using the same method as LORE-TSR.

Notebook

I prepared the following Google Colab notebook for trying it out.

https://colab.research.google.com/github/nakamura196/000_tools/blob/main/NDLTSR.ipynb

To try it, select “Runtime” > “Run all cells” from the Colab menu.

Results

Running the notebook outputs the recognized table structure for the input image.

Reference

Taiwan Government-General Agricultural Experiment Station, ed. “Taiwan Government-General Agricultural Experiment Station Bulletin” No. 197: Effects of Irrigation on Growth, Yield, and Quality of Sweet Potatoes, Taiwan Government-General Agricultural Experiment Station, 1940-1944. National Diet Library Digital Collection https://dl.ndl.go.jp/pid/1046122 (accessed 2024-04-26)

https://dl.ndl.go.jp/pid/1046122/1/1

Note

Note that this program provides only table structure information, as the official explanation below states.

This program infers only table structure information (numerical data representing the rectangular coordinates of each cell and the relationships between cells), so it cannot be used on its own as an OCR program. Processing similar to the Next Generation Digital Library functionality requires a separate OCR program that outputs text data with coordinates.
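To make the division of labor concrete, here is a minimal sketch of how cell rectangles might be combined with OCR text that carries coordinates. It is not NDLTSR's actual output format or API: the `Cell` class, the `assign_text_to_cells` and `to_grid` functions, the dictionary layout of the OCR items, and the sample values are all assumptions made up for illustration. Each OCR item is simply placed in the cell that contains the center of its bounding box.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    """Hypothetical cell record: grid position plus a pixel rectangle."""
    row: int
    col: int
    x1: float
    y1: float
    x2: float
    y2: float
    texts: list = field(default_factory=list)


def assign_text_to_cells(cells, ocr_items):
    """Attach each OCR item to the first cell containing its box center."""
    for item in ocr_items:
        x1, y1, x2, y2 = item["bbox"]
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
        for cell in cells:
            if cell.x1 <= cx <= cell.x2 and cell.y1 <= cy <= cell.y2:
                cell.texts.append(item["text"])
                break  # items outside every cell are silently skipped
    return cells


def to_grid(cells):
    """Build a row-major list of lists from the cell positions."""
    n_rows = max(c.row for c in cells) + 1
    n_cols = max(c.col for c in cells) + 1
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for c in cells:
        grid[c.row][c.col] = " ".join(c.texts)
    return grid


# Made-up sample data: a 2 x 2 table and four OCR text boxes.
cells = [
    Cell(0, 0, 0, 0, 100, 50),
    Cell(0, 1, 100, 0, 200, 50),
    Cell(1, 0, 0, 50, 100, 100),
    Cell(1, 1, 100, 50, 200, 100),
]
ocr_items = [
    {"text": "Plot", "bbox": (10, 10, 60, 40)},
    {"text": "Yield", "bbox": (110, 10, 180, 40)},
    {"text": "A", "bbox": (20, 60, 40, 90)},
    {"text": "12.5", "bbox": (120, 60, 170, 90)},
]

grid = to_grid(assign_text_to_cells(cells, ocr_items))
print(grid)  # [['Plot', 'Yield'], ['A', '12.5']]
```

Matching on the box center is the simplest rule that works for clean input. Real pages may need an overlap-based rule and handling for cells that span several rows or columns, which this sketch ignores.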

Summary

I hope this serves as a useful reference for trying NDLTSR (NDL Table Structure Recognition).