Important Usage Notice
The system introduced in this article may place load on external servers. Please exercise caution when using it.
- Server Load: Parallel requests place load on the target servers
- DoS Attack Risk: A large number of simultaneous requests may be mistaken for a DoS attack
- Recommended Approach: Download the images locally in advance and parallelize only the OCR processing
- Check Terms of Use: Always check the terms of use of the target servers and obtain prior permission if necessary
- Appropriate Rate Limiting: For production use, conservative concurrency settings (around 5-10 parallel requests) are strongly recommended
- Responsible Usage: Be considerate of server administrators and other users
This article is a record of a technical proof of concept. We ask readers to use it responsibly.
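The recommended approach in the notice can be sketched as follows. This is a minimal illustration, not the article's implementation: `fetch_sequentially` and `process_in_parallel` are hypothetical helper names, and the `fetch` and `worker` callables stand in for whatever download and OCR functions you actually use.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def fetch_sequentially(
    urls: Iterable[str],
    fetch: Callable[[str], R],
    delay_seconds: float = 1.0,
) -> List[R]:
    """Fetch URLs one at a time, pausing between requests to spare the server."""
    results: List[R] = []
    for index, url in enumerate(urls):
        if index > 0:
            time.sleep(delay_seconds)  # polite gap between consecutive requests
        results.append(fetch(url))
    return results


def process_in_parallel(
    items: Iterable[T],
    worker: Callable[[T], R],
    max_workers: int = 5,
) -> List[R]:
    """Run worker over items with at most max_workers running at once.

    Results are returned in the same order as the input items.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, items))
```

With this split, the external image server only ever sees one request at a time, while the concurrency cap applies solely to the OCR step that runs against your own infrastructure.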
Introduction
This article presents a case study in building a scalable OCR processing system on Azure Container Apps using NDL Classical Japanese OCR Lite, developed by the National Diet Library (NDL). It explains the design and implementation of a system that achieves pay-as-you-go billing and auto-scaling through a cloud-native architecture.
System Overview
Architecture
Key Components
- OCR Engine: NDL Classical Japanese OCR Lite (specialized for Japanese classical texts)
- Infrastructure: Azure Container Apps (serverless containers)
- API Design: REST API (takes an image URL and returns the OCR result)
- Output Format: TEI P5 compliant XML
- Scaling: Automatic scaling based on demand
Features of NDL Classical Japanese OCR Lite
OCR Optimized for Japanese Classical Texts
- Vertical Layout Support: Handles the vertical text layout typical of classical texts
- Reading Order Optimization: Orders text in the traditional Japanese reading order, with columns running right to left and each column read top to bottom
- Classical Character Recognition: Support for cursive script and variant kana
- Lightweight Implementation: Cloud deployment ready through Docker containerization
Reasons for Choosing Azure Container Apps
Benefits of Serverless Containers
Cost Optimization
- Pay-as-you-go: Charged only for what you use
- Scale to Zero: With 0 replicas while idle, no compute charges accrue
- Auto-scaling: Resource adjustment based on demand
System Implementation
Server-Side Implementation
Reading Order Algorithm
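As an illustration of right-to-left, top-to-bottom ordering, here is a minimal sketch. It is not the algorithm used by NDL Classical Japanese OCR Lite or by this system: the `Box` type, the `reading_order` function, and the `column_tolerance` parameter are all assumptions made for the example. It groups detected text boxes into columns by horizontal center, orders the columns from right to left, and sorts each column from top to bottom.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Box:
    """A detected text region (hypothetical shape; pixel coordinates)."""
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height
    text: str

    @property
    def cx(self) -> float:
        """Horizontal center of the box."""
        return self.x + self.w / 2


def reading_order(boxes: List[Box], column_tolerance: float = 20.0) -> List[Box]:
    """Order boxes right to left by column, then top to bottom within a column."""
    columns: List[List[Box]] = []
    # Walk boxes from the rightmost center to the leftmost.
    for box in sorted(boxes, key=lambda b: b.cx, reverse=True):
        # Join the current column if the center is close to that column's first box.
        if columns and abs(columns[-1][0].cx - box.cx) <= column_tolerance:
            columns[-1].append(box)
        else:
            columns.append([box])
    ordered: List[Box] = []
    for column in columns:
        ordered.extend(sorted(column, key=lambda b: b.y))
    return ordered
```

A fixed tolerance is the simplest possible grouping rule; pages with skewed scans, interlinear notes, or multi-tier layouts would need something more robust.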
TEI XML Output
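The sketch below shows one way recognized lines could be wrapped in a minimal TEI document using only the Python standard library. The element layout (a single `<p>` with one `<lb/>` per recognized line), the `build_tei` function name, and the placeholder publication statement are assumptions for illustration; the structure this system actually emits is not shown in this article.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional

TEI_NS = "http://www.tei-c.org/ns/1.0"


def build_tei(title: str, source_url: str, lines: List[str]) -> str:
    """Return a minimal TEI document holding OCR lines, one <lb/> per line."""
    ET.register_namespace("", TEI_NS)  # serialize TEI as the default namespace

    def el(parent: ET.Element, tag: str, text: Optional[str] = None) -> ET.Element:
        node = ET.SubElement(parent, f"{{{TEI_NS}}}{tag}")
        node.text = text
        return node

    root = ET.Element(f"{{{TEI_NS}}}TEI")
    file_desc = el(el(root, "teiHeader"), "fileDesc")
    el(el(file_desc, "titleStmt"), "title", title)
    el(el(file_desc, "publicationStmt"), "p", "Generated by OCR (placeholder).")
    el(el(file_desc, "sourceDesc"), "p", source_url)

    paragraph = el(el(el(root, "text"), "body"), "p")
    for line in lines:
        # Each line break marker is followed by the text of that line.
        el(paragraph, "lb").tail = line
    return ET.tostring(root, encoding="unicode")
```

Because the text is set through ElementTree rather than string concatenation, characters such as `&` and `<` in the OCR output are escaped automatically.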
Processing Results Example
Small-Scale Test Processing (Kiritsubo)
- Target: “Kiritsubo” held by the University of Tokyo
- Pages: 32 pages
- Processing Time: Approximately 30 seconds
- Success Rate: 100%
- Concurrency: 10 parallel requests
- Cost: Approximately $0.05
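To put these figures in perspective, the following short calculation derives throughput and unit cost from the numbers above. The inputs are the approximate values reported in this test, so the results are rough estimates only.

```python
pages = 32
elapsed_seconds = 30.0  # "approximately 30 seconds"
concurrency = 10        # parallel requests
cost_usd = 0.05         # "approximately $0.05"

# Overall pages completed per second of wall-clock time.
throughput = pages / elapsed_seconds

# Average cost attributed to a single page.
cost_per_page = cost_usd / pages

# Wall-clock seconds each parallel slot spent per page, on average.
seconds_per_page_per_slot = elapsed_seconds * concurrency / pages

print(f"{throughput:.2f} pages/s")
print(f"${cost_per_page:.4f} per page")
print(f"{seconds_per_page_per_slot:.1f} s per page per slot")
```

This works out to roughly 1.07 pages per second and about $0.0016 per page. The per-slot figure of about 9.4 seconds covers the whole request, including any cold start, image download, and scheduling overhead, so it is not directly comparable to a pure OCR time.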
Performance Characteristics
Technical Features of the System
1. Cold Start Handling
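When an app scales from zero replicas, the first request may be slow or fail while a container starts. One common client-side mitigation is to retry with exponential backoff. The sketch below assumes that approach; it is not necessarily how this system handles cold starts, and `call_with_retry` and its parameters are hypothetical names.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    request: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request(), retrying on timeouts and connection errors.

    The wait doubles after each failed attempt: base_delay, 2x, 4x, ...
    The last failure is re-raised once max_attempts is exhausted.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return request()
        except (TimeoutError, ConnectionError):
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the backoff schedule can be verified without actually waiting.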
2. Externalized Configuration
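A typical way to externalize configuration in a containerized service is to read settings from environment variables with safe defaults. The following is a minimal sketch of that pattern; the variable names (`OCR_API_URL`, `OCR_MAX_CONCURRENCY`, `OCR_REQUEST_TIMEOUT`) and the default values are hypothetical, not the ones used by this system.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    api_url: str
    max_concurrency: int
    request_timeout: float  # seconds


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, falling back to defaults."""
    return Settings(
        api_url=env.get("OCR_API_URL", "http://localhost:8000"),
        max_concurrency=int(env.get("OCR_MAX_CONCURRENCY", "5")),
        request_timeout=float(env.get("OCR_REQUEST_TIMEOUT", "120")),
    )
```

Keeping the default concurrency conservative means a deployment that forgets to set the variable still behaves politely toward external servers.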
3. Swagger UI Integration
Deployment
Azure Container Apps Deployment
Dockerization
Operations and Monitoring
Performance Metrics
- Response Time: Average 2-3 seconds/image
- Throughput: 10-15 images/second (with 20 replicas)
- Success Rate: Over 99%
- Cost Efficiency: $0 when idle, charged only during processing
Log Monitoring
Future Prospects
Technical Improvements
- Image Caching: Reduction of duplicate processing
- Batch Processing: Efficient large-scale processing
- GPU Support: Faster OCR processing
- Enhanced Metrics: Detailed performance analysis
Application Possibilities
- Digital Archives: Utilization in libraries and museums
- Research Support: Digitization for humanities research
- Education: Creating teaching materials from classical literature
- Cultural Preservation: Digital preservation of valuable materials
Summary
By combining NDL Classical Japanese OCR Lite with Azure Container Apps, we were able to build a classical text OCR system that achieves both cost efficiency and scalability. The serverless architecture enables pay-as-you-go billing and auto-scaling, making it practical as a digital humanities tool.
Key Points
- Cost Optimization: Charged only during use
- Auto-scaling: Resource adjustment based on demand
- TEI P5 Compliant: Standardized XML output
- Classical Text Specialized: OCR optimized for Japanese classical texts
- API Design: Simple and extensible design
This system was developed as a technical proof of concept. For production use, please give sufficient consideration to the load on target servers, apply appropriate rate limiting, and comply with the applicable terms of use.