Overview

When files with Japanese filenames are ingested into Archivematica with default settings, a filename such as “ユースケース公募提案書.docx” is converted as follows:

yu-suke-suGong_Mu_Ti_An_Shu_.docx

This article explains how to customize this filename conversion.

Current behavior

The filename conversion is performed in the following file:

https://github.com/artefactual/archivematica/blob/qa/1.x/src/MCPClient/lib/clientScripts/change_names.py

Specifically, the following line:

decoded_name = unidecode(basename)

An example of running this in Google Colab is available here:

https://colab.research.google.com/github/nakamura196/000_tools/blob/main/unidecodeを試す.ipynb
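The underscores in the default result come from a second step that runs after transliteration. The following is a minimal sketch of that step, not the actual Archivematica code path. It assumes that change_names.py replaces every character outside a small ASCII whitelist with "_", and that unidecode produces the intermediate string shown. That string is hard-coded here so the snippet runs without unidecode installed, and `replace_disallowed` is a hypothetical helper name.

```python
import re

# Assumed whitelist: letters, digits, and a few punctuation characters.
ALLOWED_CHARS = re.compile(r"[^a-zA-Z0-9\-_.\(\)]")
REPLACEMENT_CHAR = "_"


def replace_disallowed(name: str) -> str:
    """Replace every character outside the whitelist with REPLACEMENT_CHAR."""
    return ALLOWED_CHARS.sub(REPLACEMENT_CHAR, name)


# Assumed output of unidecode("ユースケース公募提案書.docx"): each kanji is
# rendered as a capitalized reading followed by a space.
transliterated = "yu-suke-suGong Mu Ti An Shu .docx"
print(replace_disallowed(transliterated))
```

Under these assumptions, the spaces that unidecode emits after each kanji reading are what turn into the underscores seen in the default filename.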

Customization

Here, we try replacing unidecode with pykakasi.

https://codeberg.org/miurahr/pykakasi

We also assume that Archivematica is running on Docker. Please refer to the following article:

First, add pykakasi to the following file:

https://github.com/artefactual/archivematica/blob/qa/1.x/requirements-dev.txt

Then modify the following file as well:

https://github.com/artefactual/archivematica/blob/qa/1.x/src/MCPClient/lib/clientScripts/change_names.py

import os
import re
import sys
from unidecode import unidecode
import pykakasi

# Set up the pykakasi converter.
kakasi = pykakasi.kakasi()
# Romanize every Japanese script.
kakasi.setMode("H", "a")  # Hiragana to ASCII
kakasi.setMode("K", "a")  # Katakana to ASCII
kakasi.setMode("J", "a")  # Kanji to ASCII
kakasi.setMode("r", "Hepburn")
# Converter used in change_name() below.
converter = kakasi.getConverter()

VERSION = "1.10." + "$Id$".split(" ")[1]

# Letters, digits and a few punctuation characters
ALLOWED_CHARS = re.compile(r"[^a-zA-Z0-9\-_.\(\)]")
REPLACEMENT_CHAR = "_"


def change_name(basename):
    if basename == "":
        raise ValueError("change_name received an empty filename.")
    # decoded_name = unidecode(basename)
    decoded_name = converter.do(basename)

After making the above modifications and rebuilding Archivematica, filenames are now converted as follows:

yuusukeesukouboteiansho.docx
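As far as I know, the setMode/getConverter style of the pykakasi API is deprecated in pykakasi 2.x in favor of `convert()`, which returns a list of segments with keys such as "orig" and "hepburn". The following is a rough sketch of the same idea with that newer call. `romanize` is a hypothetical helper, not part of Archivematica or pykakasi, and the `convert` parameter exists only so the joining logic can be tried without pykakasi installed.

```python
from typing import Callable, Dict, Iterable, Optional

Segments = Iterable[Dict[str, str]]


def romanize(text: str, convert: Optional[Callable[[str], Segments]] = None) -> str:
    """Join the Hepburn reading of every segment returned by convert()."""
    if convert is None:
        # Imported lazily so the helper can be exercised without pykakasi.
        import pykakasi

        convert = pykakasi.kakasi().convert
    return "".join(segment["hepburn"] for segment in convert(text))
```

With pykakasi installed, `romanize("ユースケース公募提案書.docx")` could stand in for the converter call in change_names.py. The exact romanization may differ between pykakasi versions, so it is worth checking the result against the filename above before relying on it.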

Summary

Regarding filename conversion, the METS file records the conversion as follows:

<premis:eventOutcomeDetailNote>Original name="%transferDirectory%objects/.docx"; new name="%transferDirectory%objects/yuusukeesukouboteiansho.docx"</premis:eventOutcomeDetailNote>
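Because the note keeps both names, the original filename can be recovered from the METS file later. The following is a minimal sketch that assumes the note text follows the original-name/new-name pattern shown above. `NOTE_PATTERN` and `parse_name_change` are hypothetical names, and the sample filenames are made up for illustration.

```python
import re
from typing import Optional, Tuple

# Whitespace is optional so the pattern matches with or without spaces
# inside "Original name" and "new name".
NOTE_PATTERN = re.compile(r'Original\s*name="([^"]*)";\s*new\s*name="([^"]*)"')


def parse_name_change(note: str) -> Optional[Tuple[str, str]]:
    """Return (original_path, new_path), or None if the note does not match."""
    match = NOTE_PATTERN.search(note)
    if match is None:
        return None
    return match.group(1), match.group(2)


# Hypothetical note text for a file named "a b.docx".
sample_note = (
    'Original name="%transferDirectory%objects/a b.docx"; '
    'new name="%transferDirectory%objects/a_b.docx"'
)
print(parse_name_change(sample_note))
```

In practice the note text would first be read out of the premis:eventOutcomeDetailNote element with an XML parser; this sketch covers only the string-parsing step.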

Because the name change is recorded in the METS file, the filename conversion rules may not matter much in practice. Still, we hope this serves as a useful reference.