Overview

I had the opportunity to try annotating text with Transkribus, so these are my notes on the experience.

Transkribus is available at the following link.

https://www.transkribus.org/

It is described as follows:

Transkribus enables you to automatically recognise text easily, edit seamlessly, collaborate effortlessly, and even train your custom AI for digitizing and interpreting historical documents of any form.

References

The following article was a very helpful Japanese-language introduction to Transkribus.

https://connectivity.aa-ken.jp/ja/newsletter/588/index.html

However, the desktop version “Transkribus eXpert” introduced on that page has been deprecated.

https://help.transkribus.org/downloading-and-installing-transkribus-expert-deprecated

Please note that Transkribus eXpert (desktop software) is no longer being updated, and all new features will be exclusively available on the Transkribus web app.

Sample Data

I have also written an article on using Recogito.

As in that article, I use material published by the National Diet Library as the example.

How to Use

Access the top page.

After logging in, you are taken to the home screen.

Navigate to Collections.

Clicking a specific collection takes you to the document list.

You can add documents with the upload button in the upper right. Registering documents from IIIF manifest files was also possible.
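Before registering a manifest, it can help to check which page images it points to. The following is a minimal sketch, assuming the manifest follows the IIIF Presentation API 2.x layout (sequences, canvases, images). The sample manifest and its example.org URLs are hypothetical, and which manifest versions Transkribus accepts is not confirmed here.

```python
# Hypothetical, hand-written manifest in IIIF Presentation API 2.x layout.
SAMPLE_MANIFEST = {
    "@id": "https://example.org/iiif/book1/manifest.json",
    "@type": "sc:Manifest",
    "label": "Sample book",
    "sequences": [
        {
            "canvases": [
                {
                    "label": "p. 1",
                    "images": [
                        {"resource": {"@id": "https://example.org/iiif/img1/full/full/0/default.jpg"}}
                    ],
                },
                {
                    "label": "p. 2",
                    "images": [
                        {"resource": {"@id": "https://example.org/iiif/img2/full/full/0/default.jpg"}}
                    ],
                },
            ]
        }
    ],
}


def list_canvas_images(manifest):
    """Return (canvas label, image URL) pairs in reading order."""
    pairs = []
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            for image in canvas.get("images", []):
                resource = image.get("resource", {})
                pairs.append((canvas.get("label"), resource.get("@id")))
    return pairs


for label, url in list_canvas_images(SAMPLE_MANIFEST):
    print(label, url)
```

For a real manifest, the dictionary would come from `json.load` on the downloaded file instead of the inline sample.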

Opening an imported document takes you to a page-by-page view.

Clicking an image takes you to the editing screen.

I created Regions, drew a rectangle for each line inside them, and entered the text.
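The region and line rectangles can be pictured as nested boxes. The sketch below is only an illustration: the `Rect` class and its methods are hypothetical names, not part of Transkribus. The `to_points` output follows the `points` attribute style of the PAGE 2013 schema, which writes a polygon as space-separated `x,y` pairs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in image pixel coordinates (hypothetical helper)."""
    x: int
    y: int
    width: int
    height: int

    def contains(self, other: "Rect") -> bool:
        """True if `other` lies entirely inside this rectangle."""
        return (
            self.x <= other.x
            and self.y <= other.y
            and other.x + other.width <= self.x + self.width
            and other.y + other.height <= self.y + self.height
        )

    def to_points(self) -> str:
        """Corners clockwise from the top-left, as 'x,y x,y x,y x,y'."""
        x2, y2 = self.x + self.width, self.y + self.height
        return f"{self.x},{self.y} {x2},{self.y} {x2},{y2} {self.x},{y2}"


# One region holding two line rectangles (made-up coordinates).
region = Rect(100, 100, 800, 300)
lines = [Rect(100, 100, 800, 100), Rect(100, 220, 800, 100)]

for line in lines:
    print(region.contains(line), line.to_points())
```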

Export

Page-level data can be exported as TXT or downloaded in the PRImA Page Content XML (PAGE XML) format.

The result downloaded in PRImA Page Content XML format is as follows:

[XML listing: a PAGE document in the http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15 namespace, in which each TextLine element carries Coords, a Baseline, and the entered text in TextEquiv/Unicode.]
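A downloaded PAGE XML file can be read back with Python's standard library. The following is a minimal sketch: the sample document is hand-written and hypothetical (it is not the export from this article), and the function names are my own. It assumes the PAGE 2013-07-15 namespace and the usual TextLine / Coords / TextEquiv / Unicode nesting.

```python
import xml.etree.ElementTree as ET

# Namespace of the PAGE 2013-07-15 schema.
NS = {"pc": "http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15"}

# Hypothetical, hand-written sample; a real export contains more metadata.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="sample.jpg" imageWidth="1000" imageHeight="1500">
    <TextRegion id="r1">
      <Coords points="100,100 900,100 900,400 100,400"/>
      <TextLine id="r1l1">
        <Coords points="100,100 900,100 900,200 100,200"/>
        <TextEquiv><Unicode>first line</Unicode></TextEquiv>
      </TextLine>
      <TextLine id="r1l2">
        <Coords points="100,220 900,220 900,320 100,320"/>
        <TextEquiv><Unicode>second line</Unicode></TextEquiv>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>
"""


def parse_points(points):
    """Turn 'x,y x,y ...' into a list of (x, y) integer tuples."""
    return [tuple(int(v) for v in pair.split(",")) for pair in points.split()]


def extract_lines(xml_text):
    """Return id, polygon, and text for every TextLine in document order."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    lines = []
    for line in root.iterfind(".//pc:TextLine", NS):
        coords = line.find("pc:Coords", NS)
        unicode_el = line.find("pc:TextEquiv/pc:Unicode", NS)
        lines.append({
            "id": line.get("id"),
            "points": parse_points(coords.get("points")) if coords is not None else [],
            "text": (unicode_el.text or "") if unicode_el is not None else "",
        })
    return lines


for entry in extract_lines(SAMPLE):
    print(entry["id"], entry["text"])
```

Joining the `text` values with newlines gives roughly what a plain-text export would contain, although the exact TXT output of Transkribus may differ.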

Summary

I walked through the workflow as far as creating rectangles and entering text.

I did not get to try the distinctive features of Transkribus, such as HTR model training and inference, so I plan to explore those in a separate article.