Overview

I created a library that runs Google Cloud Vision on image files and generates an IIIF manifest and a TEI/XML file from the results.

https://github.com/nakamura196/iiif_tei_py

This article explains how to use the library.

Usage

Usage details and other documentation are available on the following page.

https://nakamura196.github.io/iiif_tei_py/

Installing the Library

Install the library from the GitHub repository.

pip install https://github.com/nakamura196/iiif_tei_py

Creating a GC Service Account

Download a Google Cloud (GC) service account key as a JSON file, following a guide such as the one below.

https://book.st-hakky.com/data-science/data-science-gcp-vision-api-setting/

Then create a .env file as follows.

GOOGLE_APPLICATION_CREDENTIALS=your-google-credentials.json
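
Before running the library, it can help to confirm that the .env file points at a key file that actually exists. The following is a minimal sketch using only the standard library; load_env_file and credentials_ready are hypothetical helper names and are not part of iiif_tei_py, which may read the .env file in its own way.

```python
import os
from pathlib import Path


def load_env_file(env_path=".env"):
    """Parse simple KEY=VALUE lines and copy them into os.environ.

    Hypothetical helper for illustration; it ignores blank lines,
    comment lines starting with '#', and lines without '='.
    """
    values = {}
    for raw in Path(env_path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip optional surrounding quotes from the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    os.environ.update(values)
    return values


def credentials_ready(values):
    """Return True if GOOGLE_APPLICATION_CREDENTIALS names an existing file."""
    key_path = values.get("GOOGLE_APPLICATION_CREDENTIALS", "")
    return bool(key_path) and Path(key_path).is_file()
```

Calling credentials_ready(load_env_file(".env")) in the project directory returns False when the path in the .env file is misspelled or the JSON key has not been downloaded yet.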

Execution

As a sample input image, we use the following image that is also used in the IIIF Cookbook.

https://iiif.io/api/presentation/2.1/example/fixtures/resources/page1-full.png

Create a Python file like the following and run it.

from iiif_tei_py.core import CoreClient
cred_path = CoreClient.load_env()
url = "https://iiif.io/api/presentation/2.1/example/fixtures/resources/page1-full.png"
output_tei_xml_file_path = "./tmp/01/output.xml"
CoreClient.create_tei_xml_with_gocr(url, output_tei_xml_file_path, cred_path, title="Sample")

In the above example, the IIIF manifest file is created at ./tmp/01/output.json and the TEI/XML file is created at ./tmp/01/output.xml.
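
A quick way to confirm that both files were written is to parse them. The sketch below assumes, based only on this example, that the manifest sits next to the TEI file with the same name and a .json extension; check_outputs is a hypothetical helper, not a function of the library.

```python
import json
import xml.etree.ElementTree as ET
from pathlib import Path


def check_outputs(tei_path):
    """Parse the TEI/XML file and the sibling manifest, raising on failure.

    Assumption: the manifest path is the TEI path with a .json suffix,
    as in ./tmp/01/output.xml and ./tmp/01/output.json.
    """
    tei_file = Path(tei_path)
    manifest_file = tei_file.with_suffix(".json")

    # ET.parse raises ParseError if the XML is not well-formed.
    ET.parse(tei_file)

    # json.load raises JSONDecodeError if the manifest is not valid JSON.
    with manifest_file.open(encoding="utf-8") as f:
        manifest = json.load(f)
    return manifest_file, manifest
```

For the run above, check_outputs("./tmp/01/output.xml") would return the manifest path and the parsed manifest as a dictionary, or raise an exception if either file is missing or malformed.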

Verifying the Results

IIIF

Below is an example of displaying the IIIF manifest file in Mirador.

The JSON file is an IIIF manifest for the input image. The recognized text is stored as annotations, and each annotation targets a region of the canvas through an #xywh= fragment (x, y, width, height).
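
The regions in the manifest are written as #xywh= fragments on the canvas URI. As a rough illustration of how such a target can be read back, the sketch below splits a target string into the canvas URI and four integers. parse_xywh is a hypothetical helper, and it handles only plain pixel values such as xywh=10,20,30,40, not the percent: or pixel: prefixed forms.

```python
from urllib.parse import urldefrag


def parse_xywh(target):
    """Split 'canvas_uri#xywh=x,y,w,h' into (canvas_uri, (x, y, w, h)).

    Hypothetical helper for illustration; raises ValueError when the
    fragment is missing or is not a plain integer xywh fragment.
    """
    canvas_uri, fragment = urldefrag(target)
    prefix = "xywh="
    if not fragment.startswith(prefix):
        raise ValueError(f"no xywh fragment in {target!r}")
    parts = fragment[len(prefix):].split(",")
    if len(parts) != 4:
        raise ValueError(f"expected four values in {fragment!r}")
    x, y, w, h = (int(p) for p in parts)
    return canvas_uri, (x, y, w, h)
```

The canvas URI in a real manifest is whatever the library generated; any example URI passed to this function is only a stand-in.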

TEI

Below is an example of displaying the TEI/XML file in Oxygen XML Editor.

The XML file is a TEI document. It begins with an xml-model processing instruction whose schematypens is http://relaxng.org/ns/structure/1.0, and the region of each piece of recognized text is recorded as a zone element with xml:id, ulx, uly, lrx, and lry attributes.

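The relationship between an xywh region and the ulx, uly, lrx, and lry attributes of a TEI zone is simple arithmetic: the upper-left corner is (x, y) and the lower-right corner is (x + w, y + h). The sketch below shows that conversion with the standard library. make_zone and the id value used with it are hypothetical, and the library's own output may carry additional attributes.

```python
import xml.etree.ElementTree as ET

# Namespace that ElementTree serializes with the reserved "xml" prefix.
XML_NS = "http://www.w3.org/XML/1998/namespace"


def make_zone(zone_id, x, y, w, h):
    """Build a TEI-style <zone> element from an x, y, width, height box.

    Hypothetical helper for illustration, not the library's implementation.
    """
    zone = ET.Element("zone")
    zone.set(f"{{{XML_NS}}}id", zone_id)
    zone.set("ulx", str(x))      # upper-left x
    zone.set("uly", str(y))      # upper-left y
    zone.set("lrx", str(x + w))  # lower-right x
    zone.set("lry", str(y + h))  # lower-right y
    return zone
```

Serializing the result with ET.tostring(zone, encoding="unicode") gives a single zone element that can be compared against the zones in the generated file.
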
Summary

I hope this is a useful reference for use cases such as using Google Cloud Vision to create draft text for later proofreading.