Overview
When you ingest files with Japanese filenames into Archivematica using the default settings, a filename such as “ユースケース公募提案書.docx” is converted as follows:
yu-suke-suGong_Mu_Ti_An_Shu_.docx
This article explains how to customize this filename conversion.
Default behavior
The filename conversion is performed in the following file:
Specifically, the following line:
An example of running this in Google Colab is available here:
https://colab.research.google.com/github/nakamura196/000_tools/blob/main/unidecodeを試す.ipynb
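As a rough illustration of why the default result looks the way it does, the sketch below assumes that the conversion amounts to Unidecode transliteration followed by replacing every character outside a small ASCII whitelist with an underscore. The function names and the whitelist are assumptions made for illustration, not code taken from Archivematica.

```python
import re
from typing import Callable


def unidecode_transliterate(text: str) -> str:
    """Transliterate text to ASCII with the Unidecode package."""
    # Imported here so the module still loads where Unidecode is not installed.
    from unidecode import unidecode

    return unidecode(text)


def ascii_filename(
    name: str, transliterate: Callable[[str], str] = unidecode_transliterate
) -> str:
    """Hypothetical helper: transliterate, then replace disallowed characters."""
    transliterated = transliterate(name)
    # Assumption: only these ASCII characters are kept; the rest become "_".
    return re.sub(r"[^A-Za-z0-9._()-]", "_", transliterated)
```

Unidecode works character by character and has no knowledge of Japanese readings, so each kanji is rendered as a capitalized syllable followed by a space. Under the assumptions above, those spaces then become underscores, which would account for the `Gong_Mu_Ti_An_Shu_` part of the converted name.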
Customization
Here we try pykakasi, a library that converts Japanese text to romaji.
https://codeberg.org/miurahr/pykakasi
We also assume that Archivematica is running on Docker. Please refer to the following article:
First, add pykakasi to the following file:
https://github.com/artefactual/archivematica/blob/qa/1.x/requirements-dev.txt
Then modify the following file as well:
After making the above modifications and rebuilding Archivematica, the same filename is converted as follows:
yuusukeesukouboteiansho.docx
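The kind of change described above might look roughly like the following sketch. It assumes the filename is romanized with pykakasi's Hepburn output before any remaining disallowed characters are replaced. The helper names and the whitelist are hypothetical, and this is not the modification actually applied to Archivematica.

```python
import re
from typing import Callable


def kakasi_hepburn(text: str) -> str:
    """Romanize Japanese text using pykakasi's Hepburn output."""
    # Imported here so the module still loads where pykakasi is not installed.
    import pykakasi

    converter = pykakasi.kakasi()
    # convert() returns one dict per segment; "hepburn" holds the romaji.
    return "".join(part["hepburn"] for part in converter.convert(text))


def romanized_filename(
    name: str, romanize: Callable[[str], str] = kakasi_hepburn
) -> str:
    """Hypothetical helper: romanize, then replace disallowed characters."""
    romanized = romanize(name)
    # Assumption: anything still outside this whitelist becomes "_".
    return re.sub(r"[^A-Za-z0-9._()-]", "_", romanized)
```

Unlike a character-by-character transliteration, pykakasi segments the text and applies Japanese readings, which is why the result reads as romaji.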

Summary
The METS file records each filename conversion as follows:
Because the conversion is recorded there, the exact conversion rules may not matter much in practice, but we hope this serves as a useful reference.
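To check how a particular conversion was recorded, one option is to scan the METS file for PREMIS events whose outcome note mentions the filename. The sketch below assumes the conversion appears in an `eventOutcomeDetailNote` element; the function names are hypothetical, and elements are matched by local name so that the PREMIS namespace version does not matter.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def events_mentioning(root: ET.Element, needle: str) -> List[Tuple[str, str]]:
    """Return (eventType, note) pairs for events whose note contains needle."""
    matches: List[Tuple[str, str]] = []
    for element in root.iter():
        if local_name(element.tag) != "event":
            continue
        event_type = ""
        notes: List[str] = []
        for child in element.iter():
            name = local_name(child.tag)
            if name == "eventType":
                event_type = (child.text or "").strip()
            elif name == "eventOutcomeDetailNote":
                notes.append((child.text or "").strip())
        matches.extend((event_type, note) for note in notes if needle in note)
    return matches
```

For example, `events_mentioning(ET.parse("METS.xml").getroot(), "yuusukeesukouboteiansho.docx")` would list the events that mention the converted name, where `METS.xml` is a placeholder path.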