How to Bulk Unpublish Hatena Blog Articles (AtomPub API)

This article is for when you want to bulk unpublish old articles after migrating your Hatena Blog articles to another site. Important Note: You Cannot Revert to Draft. With the Hatena Blog AtomPub API, you cannot revert published articles back to draft. Sending <app:draft>yes</app:draft> via a PUT request results in a 400 Cannot Change into Draft error. Therefore, there are two approaches. Method 1: Replace the Article Body with “This Article Has Moved”. You can rewrite the article’s <content> using the AtomPub API’s PUT method. ...
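As a rough illustration of Method 1, the sketch below builds a replacement Atom entry and sends it with PUT. It is not the article's own code: `build_moved_entry`, `put_entry`, and the example URLs are hypothetical names, Basic authentication with a Hatena ID and API key is assumed, and a real script would probably fetch each entry first so that fields other than the body are carried over unchanged.

```python
import base64
import urllib.request
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ET.register_namespace("", ATOM_NS)  # serialize Atom as the default namespace


def build_moved_entry(title: str, new_url: str) -> bytes:
    """Build an Atom entry whose body is only a 'moved' notice."""
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    content = ET.SubElement(entry, f"{{{ATOM_NS}}}content", {"type": "text/plain"})
    content.text = f"This article has moved: {new_url}"
    return ET.tostring(entry, encoding="utf-8", xml_declaration=True)


def put_entry(edit_url: str, hatena_id: str, api_key: str, body: bytes) -> int:
    """PUT the entry to its AtomPub edit URL (assumed to be known already)."""
    token = base64.b64encode(f"{hatena_id}:{api_key}".encode()).decode()
    request = urllib.request.Request(
        edit_url,
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/atom+xml; charset=utf-8",
        },
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

Looping `put_entry` over the edit URLs collected from the blog's entry feed would give the bulk version; adding a pause between requests is a sensible precaution.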

March 1, 2026 · 2 min · Nakamura

Trying "oitei" - An Automatic Conversion Tool from OpenITI mARkdown to TEI XML

Introduction In the OpenITI (Open Islamicate Texts Initiative) project, which handles historical texts from the Islamicate world, texts can be tagged using a lightweight notation called mARkdown instead of TEI/XML. While TEI/XML is a powerful international standard for structuring texts, it has problems with right-to-left (RTL) languages like Arabic, where mixing XML tags causes display issues in editors. mARkdown was designed to solve this problem. In this article, we will try running oitei, a Python tool that automatically converts mARkdown texts to TEI XML. ...

February 28, 2026 · 17 min · Nakamura

Complete Restoration of Deep Zoom Images: Converting Tile Images to BigTIFF

Introduction Deep Zoom technology is used to smoothly zoom and display high-resolution images on websites. There are cases where you need to restore the original high-resolution image from tiled image data generated by tools such as Microsoft Deep Zoom Composer. This article explains the technology for restoring original high-resolution TIFF images from image data published in Deep Zoom format. How Deep Zoom Images Work Tile Structure Deep Zoom images divide a single large image into multiple small tile images and store them in a pyramid structure: ...
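To make the pyramid structure concrete, here is a small sketch of the usual Deep Zoom arithmetic: the highest level holds the full-resolution image, and each lower level halves the dimensions until a single pixel remains. The function names are hypothetical and the tile size of 256 is an assumption; the real value comes from the TileSize attribute of the .dzi descriptor.

```python
import math


def max_level(width: int, height: int) -> int:
    """Index of the full-resolution level in the pyramid."""
    return math.ceil(math.log2(max(width, height)))


def level_size(width: int, height: int, level: int) -> tuple[int, int]:
    """Pixel dimensions of the image at a given pyramid level."""
    scale = 2 ** (max_level(width, height) - level)
    return max(1, math.ceil(width / scale)), max(1, math.ceil(height / scale))


def tile_grid(width: int, height: int, level: int, tile_size: int = 256) -> tuple[int, int]:
    """Number of tile columns and rows at a given level (overlap does not change the count)."""
    level_width, level_height = level_size(width, height, level)
    return math.ceil(level_width / tile_size), math.ceil(level_height / tile_size)
```

Restoring the original image amounts to downloading every tile of the `max_level` grid and pasting each one at its column and row offset.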

November 18, 2025 · 14 min · Nakamura

Introducing GitHub File History Analyzer: A Tool for Analyzing File Edit History with AI

This article was created by AI. Introduction Have you ever wanted to analyze the edit history of files managed in a GitHub repository? There are cases where you want to understand change patterns of files that have been updated over a long period, or the evolution process of a project. GitHub File History Analyzer is a command-line tool developed to meet such needs. Tool Overview This tool provides the following features: ...

July 24, 2025 · 6 min · Nakamura

Fixing the 'ref' Bug in DHConvalidator

This article was partially written by AI. Overview DHConvalidator is a tool for converting Digital Humanities (DH) conference abstracts into a consistent TEI (Text Encoding Initiative) text base. https://github.com/ADHO/dhconvalidator When using this tool, the following error occurred during the conversion process from Microsoft Word format (DOCX) to TEI XML format: ERROR: nu.xom.ParsingException: cvc-complex-type.2.4.a: Invalid content was found starting with element 'ref' This article shares the cause and solution for this issue. ...

June 27, 2025 · 23 min · Nakamura

Adding Normalization Rules in Archivematica's Preservation Planning

Overview This is a memo on how to add Normalization rules in Archivematica’s Preservation planning. Background When ingesting images with the .jpg extension into Archivematica, there were cases where tif files were not created for preservation, despite having a rule to create tif files for items with Format of JPEG as shown below. I checked the task details from the history screen shown below. The results were as follows. ...

April 24, 2025 · 2 min · Nakamura

Registering Objects Using the AtoM (Access to Memory) API

Overview This is a memo on how to register objects using the AtoM (Access to Memory) API. Enabling the API Access the following. /sfPluginAdminPlugin/plugins Enable arRestApiPlugin. Obtaining an API Key The following explains how to generate an API key. https://www.accesstomemory.org/en/docs/2.9/dev-manual/api/api-intro/#generating-an-api-key-for-a-user While it appears you can also connect to the API with a username and password, this time I issued a REST API Key. Endpoints AtoM provides multiple menus such as “Authority records” and “Functions,” but it appears that only the following are available via the API. ...

March 12, 2025 · 19 min · Nakamura

How to Convert Word Files to TEI XML: A Guide to Using the TEIgarage API

This article was created by AI with some human modifications. Introduction In the world of digital humanities, it has become common to store documents in TEI (Text Encoding Initiative) format. TEI is a standard for structuring scholarly texts. This article explains how to convert documents created in Microsoft Word to TEI XML format using Python. What is TEIgarage? TEIgarage is an online service for converting documents in various formats to TEI XML. The service provides an API that can be called directly from programs. In this article, we will call this API from Python to convert Word files. ...

March 3, 2025 · 8 min · Nakamura

Registering Data with Drupal's JSON:API Using Username and Password

Overview In the past, I wrote articles about registering data using Drupal’s JSON:API with Python. The following uses Basic authentication. And the following uses an API Key. In addition to these methods, I was able to register data using regular login authentication, so this is a memo of that process. Code The code is as follows. It logs in, obtains a CSRF token, and then registers content. [Python code listing: log in, obtain a CSRF token, create an article via JSON:API] With this, content can be registered as follows. ...
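The login, CSRF token, and registration flow can be sketched as follows. This is not the article's own listing: it uses only the Python standard library, the helper names are hypothetical, and it assumes a default `article` content type with a `body` field and a site where cookie-based login and the JSON:API module are enabled.

```python
import json
import urllib.request
from http.cookiejar import CookieJar

JSONAPI_MIME = "application/vnd.api+json"


def build_article_payload(title: str, body: str) -> dict:
    """JSON:API document for a new article node (assumed content type)."""
    return {
        "data": {
            "type": "node--article",
            "attributes": {
                "title": title,
                "body": {"value": body, "format": "plain_text"},
            },
        }
    }


def login(base_url: str, username: str, password: str):
    """Log in with username and password; return a cookie-aware opener and the CSRF token."""
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(CookieJar()))
    request = urllib.request.Request(
        f"{base_url}/user/login?_format=json",
        data=json.dumps({"name": username, "pass": password}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with opener.open(request) as response:
        csrf_token = json.load(response)["csrf_token"]
    return opener, csrf_token


def create_article(opener, base_url: str, csrf_token: str, title: str, body: str) -> dict:
    """POST the article using the logged-in session cookie plus the CSRF token."""
    request = urllib.request.Request(
        f"{base_url}/jsonapi/node/article",
        data=json.dumps(build_article_payload(title, body)).encode("utf-8"),
        headers={
            "Content-Type": JSONAPI_MIME,
            "Accept": JSONAPI_MIME,
            "X-CSRF-Token": csrf_token,
        },
        method="POST",
    )
    with opener.open(request) as response:
        return json.load(response)
```

The same opener must be reused for both calls, because the session cookie set at login is what authorizes the later POST.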

March 1, 2025 · 9 min · Nakamura

How to Get Coordinates of Sub-Images from a Larger Image

Overview I had an opportunity to obtain the coordinates within a larger image from multiple cropped sub-images. This article is a memo summarizing the method for doing this. I introduce a method using OpenCV’s SIFT (Scale-Invariant Feature Transform) to perform feature point matching between template images and the original image, estimate the affine transformation, and obtain the coordinates. Implementation Required Libraries pip install opencv-python numpy tqdm Python Code The following code matches template images (PNG images in templates_dir) against a specified large image (image_path) using SIFT, and obtains the coordinates within the original image. ...

February 23, 2025 · 14 min · Nakamura

Creating TEI/XML from VTT Files

Overview This is a memorandum on how to create TEI/XML files from VTT files. Additionally, I will make it possible to access VTT files and TEI/XML files from an IIIF manifest. As a result, as shown below, the TEI/XML file is associated via SeeAlso, and the contents of the VTT file can be accessed from the “Annotations” tab. https://clover-iiif-demo.vercel.app/?manifest=https://movie-tei-demo.vercel.app/data/sdcommons_npl-02FT0102974177/sdcommons_npl-02FT0102974177_vtt.json References I referenced the following efforts from “The Ethiopian Language Archive.” The TEI/XML structuring method was particularly helpful. ...

February 21, 2025 · 33 min · Nakamura

Created a Similar Text Search App for the Koui Genji Monogatari

Overview I created a similar text search app for the Koui Genji Monogatari. You can try it from the following URL. https://huggingface.co/spaces/nakamura196/genji_predict This article introduces how to use the app. Data The text data published on the following Koui Genji Monogatari DB is used. https://kouigenjimonogatari.github.io/ How the App Works The mechanism is simple: text for each volume and page of the Koui Genji Monogatari is prepared in advance, the edit distance from the input string is calculated, and texts (along with volume and page numbers) with high similarity are returned. ...
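The mechanism described above can be sketched in a few lines of pure Python. This is an illustration rather than the app's actual code: the function names, the normalization by the longer string's length, and the `(volume, page, text)` tuple layout are all assumptions.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling row of the DP table."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + (char_a != char_b),  # substitution
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Map the distance to a 0..1 score, where 1 means identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - edit_distance(a, b) / longest


def search(query: str, pages: list[tuple[int, int, str]], top_k: int = 3):
    """Return the top_k (score, volume, page, text) tuples, best match first."""
    scored = [(similarity(query, text), volume, page, text) for volume, page, text in pages]
    return sorted(scored, key=lambda row: row[0], reverse=True)[:top_k]
```

For example, `search("abcd", [(1, 1, "abcx"), (1, 2, "zzzz")], top_k=1)` returns the entry for volume 1, page 1, since only one character differs.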

January 29, 2025 · 11 min · Nakamura

Creating AIPs with Archivematica for Files in Alfresco

Overview This is an example of how to create AIPs using Archivematica for files in Alfresco. Below is a demo video of the deliverable. https://youtu.be/7WCO7JoMnWc System Configuration For this project, I used the following system configuration. There is no particular significance to using multiple cloud services. Alfresco was built on Azure, referencing the following article. Archivematica and object storage use mdx.jp, and the analysis environment uses GakuNin RDM. ...

January 26, 2025 · 27 min · Nakamura

How to Upload Media to Omeka S Using Python

Overview This is a personal note on how to upload media to Omeka S using Python. Preparation Prepare environment variables.
OMEKA_S_BASE_URL=https://dev.omeka.org/omeka-s-sandbox # Example
OMEKA_S_KEY_IDENTITY=
OMEKA_S_KEY_CREDENTIAL=
Initialize. ...
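A minimal sketch of how an upload could look, assuming the three environment variables OMEKA_S_BASE_URL, OMEKA_S_KEY_IDENTITY, and OMEKA_S_KEY_CREDENTIAL are set and the third-party `requests` package is installed. The helper names are hypothetical and this is not the article's own code; it relies on the Omeka S REST API's `upload` ingester, which takes a JSON `data` part plus a `file[0]` part in one multipart request.

```python
import json
import os


def load_config(env=os.environ) -> dict:
    """Read the Omeka S connection settings from environment variables."""
    return {
        "base_url": env["OMEKA_S_BASE_URL"].rstrip("/"),
        "key_identity": env["OMEKA_S_KEY_IDENTITY"],
        "key_credential": env["OMEKA_S_KEY_CREDENTIAL"],
    }


def build_media_payload(item_id: int) -> dict:
    """Metadata part of the request: attach file[0] to the given item."""
    return {"o:ingester": "upload", "file_index": "0", "o:item": {"o:id": item_id}}


def upload_media(config: dict, item_id: int, file_path: str) -> dict:
    """POST one file to /api/media as multipart form data."""
    import requests  # third-party; imported here so the helpers above work without it

    params = {
        "key_identity": config["key_identity"],
        "key_credential": config["key_credential"],
    }
    with open(file_path, "rb") as handle:
        files = {
            "data": (None, json.dumps(build_media_payload(item_id)), "application/json"),
            "file[0]": (os.path.basename(file_path), handle),
        }
        response = requests.post(f"{config['base_url']}/api/media", params=params, files=files)
    response.raise_for_status()
    return response.json()
```

Calling `upload_media(load_config(), item_id, "image.jpg")` would attach the file to an existing item; the item ID and file name here are placeholders.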

January 3, 2025 · 7 min · Nakamura

Trying Out Geocoding Libraries

Overview I had the opportunity to try out geocoding libraries, so this is a memo. Target This time, we will use the following text as an example. 岡山市旧御野郡金山寺村。現在の岡山市金山寺。市の中心部からは直線で北方約一〇キロを隔てた金山の中腹にある。 (Translation: Okayama City, former Mino-gun Kinzanji village. Currently Kinzanji, Okayama City. Located on the hillside of Kinzan, approximately 10 kilometers north in a straight line from the city center.) Tool 1: Jageocoder - A Python Japanese geocoder First, let’s try “Jageocoder.” ...

December 3, 2024 · 7 min · Nakamura

Notes on LLM-Related Tools

Overview This is a memo on tools related to LLMs. LangChain https://www.langchain.com/ It is described as follows. LangChain is a composable framework to build with LLMs. LangGraph is the orchestration framework for controllable agentic workflows. LlamaIndex https://docs.llamaindex.ai/en/stable/ It is described as follows. LlamaIndex is a framework for building context-augmented generative AI applications with LLMs including agents and workflows. LangChain and LlamaIndex The response from gpt-4o was as follows. ...

November 29, 2024 · 7 min · Nakamura

Uploading Files and More Using the GakuNin RDM API

Background These are notes on how to upload files and perform other operations using the GakuNin RDM API. References The following article explains how to obtain a PAT (Personal Access Token). The following article introduces a method using OAuth (Open Authorization). If you are using it from a web application, this may be helpful. Method I created the following repository using nbdev. https://github.com/nakamura196/grdm-tools The documentation can be found here. ...

November 16, 2024 · 2 min · Nakamura

Building a Character Detection Model Using YOLOv11x and the Japanese Classical Character Dataset

Overview I had the opportunity to build a character detection model using YOLOv11x and the Japanese Classical Character (Kuzushiji) Dataset, so this is a memo of the process. http://codh.rois.ac.jp/char-shape/ References Previously, I performed a similar task using YOLOv5. You can check the demo and pre-trained models at the following Spaces. https://huggingface.co/spaces/nakamura196/yolov5-char Below is an example of application to publicly available images from the “National Treasure Kanazawa Bunko Documents Database.” ...

November 6, 2024 · 6 min · Nakamura

Training YOLOv11 Classification (Kuzushiji Recognition) Using mdx.jp

Overview I had the opportunity to train YOLOv11 classification (kuzushiji recognition) using mdx.jp, so here are my notes. Dataset The following “Kuzushiji Dataset” is used as the target. http://codh.rois.ac.jp/char-shape/book/ Dataset Creation The dataset is formatted to match the YOLO format. First, data separated by book title is flattened and merged. [Python code listing: flatten and merge the per-book data] Next, the dataset is split using the following script. ...
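The flattening step might look roughly like the sketch below. It is not the article's script: it assumes the downloaded data is laid out as `<book>/characters/<label>/<image>.jpg`, and the function name and the choice to prefix file names with the book ID (to avoid collisions) are hypothetical.

```python
import shutil
from pathlib import Path


def flatten_by_class(input_dir, output_dir) -> int:
    """Copy per-book character images into one folder per class label.

    Returns the number of files copied.
    """
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    copied = 0
    for source in sorted(input_dir.glob("*/characters/*/*.jpg")):
        label = source.parent.name   # class folder, e.g. a Unicode code point
        book = source.parts[-4]      # top-level book folder
        target_dir = output_dir / label
        target_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target_dir / f"{book}_{source.name}")
        copied += 1
    return copied
```

The result is one directory per class, which is the layout that a subsequent train/validation split for image classification typically starts from.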

November 6, 2024 · 37 min · Nakamura

Getting a List of Properties for a Specific Vocabulary in Omeka S

Overview Here is how to get a list of properties for a specific vocabulary in Omeka S. Method We will target the following. https://uta.u-tokyo.ac.jp/uta/api/properties?vocabulary_id=5 The following program writes the property list to MS Excel. [Python code listing: fetch the properties page by page and write them to an Excel file] Result The following MS Excel file is obtained. ...
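One way to do this is sketched below, under stated assumptions: the API is paged with a `page` query parameter and returns an empty list past the last page, the selected columns are my own choice rather than the article's, and the Excel step needs the third-party `pandas` package with an Excel writer such as `openpyxl`. The function names are hypothetical.

```python
import json
import urllib.request

API_URL = "https://uta.u-tokyo.ac.jp/uta/api/properties?vocabulary_id=5"


def fetch_json(url: str):
    """GET a URL and decode the JSON response."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def fetch_all(url: str, fetch=fetch_json) -> list:
    """Walk the pages until the API returns an empty list."""
    items, page = [], 1
    while True:
        batch = fetch(f"{url}&page={page}")
        if not batch:
            return items
        items.extend(batch)
        page += 1


def to_rows(properties: list) -> list:
    """Keep a few descriptive fields of each property (assumed column set)."""
    return [
        {
            "term": prop.get("o:term"),
            "label": prop.get("o:label"),
            "local_name": prop.get("o:local_name"),
            "comment": prop.get("o:comment"),
        }
        for prop in properties
    ]


def save_excel(rows: list, path: str) -> None:
    """Write the rows to an .xlsx file."""
    import pandas as pd  # third-party; imported lazily so the helpers above work without it

    pd.DataFrame(rows).to_excel(path, index=False)
```

A typical run would be `save_excel(to_rows(fetch_all(API_URL)), "properties.xlsx")`, where the output file name is a placeholder.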

November 5, 2024 · 6 min · Nakamura