Creating TEI/XML Files from IIIF Manifest Files Using NDL Kotenseki OCR-Lite

Overview This article introduces a Gradio app that creates TEI/XML files from IIIF manifest files using NDL Kotenseki OCR-Lite. It can be accessed at the following URL: https://nakamura196-ndlkotenocr-lite-iiif.hf.space/ Background This is a continuation of the following articles: Previously, two separate apps were needed, but with this update, the entire conversion process can be completed within a single Gradio app. Additionally, issues such as difficulty tracking progress when processing manifest files with many image pages, and the inability to copy processing results, have been fixed. ...
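As a rough illustration of the manifest-to-TEI conversion described above, here is a minimal sketch using only the Python standard library. It assumes an IIIF Presentation API 2 manifest that has already been loaded as a dict and OCR text already available as a list of lines per page. The function names (canvas_images, build_tei) and the chosen TEI layout (pb plus ab/lb) are hypothetical and are not taken from the app itself.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def tei(tag: str) -> str:
    """Qualify a tag name with the TEI namespace."""
    return f"{{{TEI_NS}}}{tag}"


def canvas_images(manifest: dict) -> list:
    """Return (canvas label, image URL) pairs from a Presentation 2 manifest."""
    pairs = []
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            images = canvas.get("images", [])
            if images:
                pairs.append((str(canvas.get("label", "")),
                              images[0]["resource"]["@id"]))
    return pairs


def build_tei(title: str, pages: list) -> str:
    """Build a minimal TEI document; pages are (label, image URL, OCR lines)."""
    ET.register_namespace("", TEI_NS)
    root = ET.Element(tei("TEI"))
    file_desc = ET.SubElement(ET.SubElement(root, tei("teiHeader")), tei("fileDesc"))
    title_stmt = ET.SubElement(file_desc, tei("titleStmt"))
    ET.SubElement(title_stmt, tei("title")).text = title
    publication = ET.SubElement(file_desc, tei("publicationStmt"))
    ET.SubElement(publication, tei("p")).text = "Unpublished sketch"
    source = ET.SubElement(file_desc, tei("sourceDesc"))
    ET.SubElement(source, tei("p")).text = "Generated from OCR output"
    body = ET.SubElement(ET.SubElement(root, tei("text")), tei("body"))
    for label, image_url, lines in pages:
        # One page break per canvas, pointing at the source image.
        ET.SubElement(body, tei("pb"), {"n": label, "facs": image_url})
        block = ET.SubElement(body, tei("ab"))
        for line in lines:
            # Each OCR line follows a line-break marker.
            ET.SubElement(block, tei("lb")).tail = line
    return ET.tostring(root, encoding="unicode")
```

In practice the OCR lines for each page would come from running the OCR engine on the image URLs returned by canvas_images; that step is left out here because it depends on the engine's own interface.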

June 12, 2025 · 1 min · Nakamura

Creating Annotated IIIF Manifest Files and TEI/XML Files Using NDL Kotenseki OCR-Lite

Notice I have written a more accessible article explaining the workflow introduced here. Please also refer to the following. Overview I would like to introduce a prototype tool for creating annotated IIIF manifest files and TEI/XML files using NDL Kotenseki OCR-Lite. Creating Annotated IIIF Manifest Files First, I created a Gradio app that takes an IIIF manifest file as input and outputs an annotated IIIF manifest file using NDL Kotenseki OCR-Lite. It is published using Hugging Face Spaces. ...
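One way to attach OCR results to a manifest can be sketched as follows. This assumes an IIIF Presentation API 2 manifest and uses that version's otherContent / sc:AnnotationList pattern. The function name annotate_manifest, the list_base URL, and the shape of ocr_results are hypothetical, and the structure the actual app emits may differ.

```python
import copy

IIIF_CONTEXT = "http://iiif.io/api/presentation/2/context.json"


def annotate_manifest(manifest: dict, ocr_results: dict, list_base: str):
    """Return (annotated manifest, annotation lists keyed by their @id).

    ocr_results maps a canvas @id to a list of (x, y, w, h, text) tuples.
    """
    annotated = copy.deepcopy(manifest)  # leave the input untouched
    annotation_lists = {}
    for sequence in annotated.get("sequences", []):
        for index, canvas in enumerate(sequence.get("canvases", [])):
            regions = ocr_results.get(canvas["@id"])
            if not regions:
                continue
            list_id = f"{list_base}/list/{index}.json"
            resources = [
                {
                    "@type": "oa:Annotation",
                    "motivation": "sc:painting",
                    "resource": {
                        "@type": "cnt:ContentAsText",
                        "format": "text/plain",
                        "chars": text,
                    },
                    # Target a rectangular region of the canvas.
                    "on": f"{canvas['@id']}#xywh={x},{y},{w},{h}",
                }
                for x, y, w, h, text in regions
            ]
            annotation_lists[list_id] = {
                "@context": IIIF_CONTEXT,
                "@id": list_id,
                "@type": "sc:AnnotationList",
                "resources": resources,
            }
            # Link the canvas to its annotation list.
            canvas.setdefault("otherContent", []).append(
                {"@id": list_id, "@type": "sc:AnnotationList"}
            )
    return annotated, annotation_lists
```

Each annotation list would then be saved as its own JSON file at the URL used for its @id, so that a viewer can fetch it when the canvas is displayed.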

May 27, 2025 · 45 min · Nakamura

Created a Similar Text Search App for the Koui Genji Monogatari

Overview I created a similar text search app for the Koui Genji Monogatari. You can try it from the following URL. https://huggingface.co/spaces/nakamura196/genji_predict This article introduces how to use the app. Data The text data published on the following Koui Genji Monogatari DB is used. https://kouigenjimonogatari.github.io/ How the App Works The mechanism is simple: text for each volume and page of the Koui Genji Monogatari is prepared in advance, the edit distance from the input string is calculated, and texts (along with volume and page numbers) with high similarity are returned. ...
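The mechanism described above can be sketched in a few lines of Python. This is a minimal illustration, assuming plain Levenshtein distance normalized by the longer string's length; the function names and the (volume, page, text) tuple layout are hypothetical, and the actual app may score similarity differently.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Map edit distance to a 0..1 score (1.0 means identical)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def search(query: str, pages: list, top_k: int = 3) -> list:
    """Rank (volume, page, text) entries by similarity to the query."""
    scored = [(similarity(query, text), volume, page, text)
              for volume, page, text in pages]
    return sorted(scored, key=lambda item: item[0], reverse=True)[:top_k]
```

Because every stored page is compared with the query, the cost grows with the size of the corpus, which is acceptable for a fixed text prepared in advance.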

January 29, 2025 · 11 min · Nakamura

Creating Apps with Azure OpenAI Assistants API Using Gradio and Next.js

Overview I created apps using the Azure OpenAI Assistants API with Gradio and Next.js, so here are my notes. Target Data I used articles published on Zenn as the target data. First, I bulk downloaded them with the following code. [code listing omitted] Registering to the Vector Store Upload data files with the following code. ...

January 6, 2025 · 13 min · Nakamura

Building an NDLOCR Gradio App Using Azure Virtual Machines

Overview In the following article, I introduced a Gradio app using Azure virtual machines and NDLOCR. This article provides notes on how to build this app. Building the Virtual Machine To use a GPU, it was necessary to request a quota. After the request, “NC8as_T4_v3” was used for this project. Building the Docker Environment The following article was used as a reference. https://zenn.dev/koki_algebra/scraps/32ba86a3f867a4 Disabling Secure Boot The following is stated: ...

December 23, 2024 · 16 min · Nakamura

Created a Gradio App to Try ndlocr_cli (NDLOCR ver.2.1) Application

Overview I created a Gradio app that allows you to try the ndlocr_cli (NDLOCR ver.2.1) application. Please try it at the following URL. https://ndlocr.aws.ldas.jp/ Notes Currently, only single image uploads are supported. I plan to add options such as PDF upload functionality in the future. It uses the “NVIDIA Tesla T4 GPU” installed in the “NC8as_T4_v3” VM available on Azure. Summary I’m not sure how long I can continue providing this in its current form, but I hope it will be useful for verifying the accuracy of the ndlocr_cli (NDLOCR ver.2.1) application. ...

December 22, 2024 · 1 min · Nakamura

Building a RAG-based Chat Using Azure OpenAI, LlamaIndex, and Gradio

Overview I tried building a RAG-based chat using Azure OpenAI, LlamaIndex, and Gradio, so here are my notes. Azure OpenAI Create an Azure OpenAI resource. Then, click “Endpoint: Click here to view endpoint” to note down the endpoint and key. Then, navigate to the Azure OpenAI Service. Go to “Model catalog” and deploy “gpt-4o” and “text-embedding-3-small”. The result is displayed as follows. Downloading the Text This time, we target “The Tale of Genji” published on Aozora Bunko (a free digital library of Japanese literature). ...

December 16, 2024 · 16 min · Nakamura

Building a Gradio App Using NDL Kotenseki OCR-Lite

Overview I built a Gradio App using NDL Kotenseki OCR-Lite. You can try it at the following URL. https://huggingface.co/spaces/nakamura196/ndlkotenocr-lite “NDL Kotenseki OCR-Lite” provides a desktop application, so an execution environment is available without the need for a web app like Gradio. Therefore, the intended use cases for this web app include usage from smartphones or tablets, and integration via web API. Development Notes and Bug Fixes Using Submodules The original ndlkotenocr-lite was introduced as a submodule. ...

December 4, 2024 · 20 min · Nakamura

Inference App Using a YOLOv5 Model (Character Region Detection)

Overview The character region detection app is published at the following link. https://huggingface.co/spaces/nakamura196/yolov5-char The above app had stopped working, so I fixed it following the same procedure as in the following article. The model used in this app was built using the “Japanese Classical Character Dataset” (held by NIJL and others / processed by CODH) doi:10.20676/00000340. I also made some minor improvements during this fix, which I will introduce here. ...

May 23, 2024 · 4 min · Nakamura

Fixing an Inference App Using Hugging Face Spaces and a YOLOv5 Model (Trained on NDL-DocL Dataset)

Overview In the following article, I introduced an inference app using Hugging Face Spaces and a YOLOv5 model trained on the NDL-DocL dataset. This app had stopped working, so I fixed it to make it operational again. https://huggingface.co/spaces/nakamura196/yolov5-ndl-layout Here are my notes on the changes made during this fix. Changes The modified app.py is shown below. [code listing omitted] First, due to Gradio version upgrades, I changed gr.inputs.Image to gr.Image and made similar updates. ...

May 20, 2024 · 6 min · Nakamura

LlamaIndex+GPT4+gradio

Overview I had the opportunity to use LlamaIndex, GPT4, and gradio together, so this is a memo of the process. Since the text used was small in size, the results are accordingly modest, but I prototyped a chatbot for Shibusawa Eiichi. Background I referred to the following article. https://qiita.com/DeepTama/items/1a44ddf6325c2b2cd030 Based on the above, I made modifications to work with libraries as of April 20, 2024. The notebook is published at the following location. ...

April 20, 2024 · 1 min · Nakamura

Using the API of the Curriculum Guidelines Code Recommendation App

Overview In the following article, I introduced a recommendation app for Japan’s Curriculum Guidelines (Gakushu Shido Yoryo) codes. This time, I introduce how to use the above recommendation app via Gradio’s API. Usage Install the library: pip install gradio_client. For example, let’s use the following data. ...

April 16, 2024 · 6 min · Nakamura

Prototype of a Course of Study Code Recommendation App

Overview I created a Course of Study code recommendation app, and this is an introduction to it. You can try it on the following Hugging Face Space. It utilizes the Course of Study LOD. https://huggingface.co/spaces/nakamura196/jp-cos Usage Enter any text in the text form. “School Type” is an optional field. Results are displayed on the right side of the screen. Sample inputs are also provided, so please try them out. Information from NHK for School is used. ...

April 16, 2024 · 3 min · Nakamura