Overview

When editing TEI/XML files, you can limit the available tags and attributes by changing the RNG file used for validation. This keeps editors from being overwhelmed by tag choices and reduces inconsistencies in the resulting TEI/XML.

RNG files are commonly customized with Roma.

Roma takes a top-down approach to limiting the available tags and attributes. This article instead tries a bottom-up approach: generating an RNG file from existing TEI/XML with generative AI.

Target Data

The target is the following XML file, published in the Koui Genji Monogatari Text DB.

https://kouigenjimonogatari.github.io/tei/01.xml

This file is validated against the following tei_all.rng.

http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng

As a result, the editor suggests a large number of insertable tags.

Creating an RNG File with Generative AI

A prompt like the following asks the model to create an RNG file based on the tags actually used in the target XML file.

[Prompt text not recoverable from the source. The legible fragments show a Markdown-formatted request, with headings and bulleted and numbered lists, whose stated aims include work efficiency and less confusion in tag selection.]
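Before or alongside prompting, the tag usage itself can be tallied directly from the XML, which gives a concrete list to paste into the prompt or to check the generated RNG against. The following is a minimal sketch using only the Python standard library. The function names `local` and `tag_inventory` and the inline sample document are illustrative assumptions, not part of the original workflow.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XML_NS = "{http://www.w3.org/XML/1998/namespace}"


def local(name):
    """Drop ElementTree's '{namespace}' prefix, but keep 'xml:' readable."""
    if name.startswith(XML_NS):
        return "xml:" + name[len(XML_NS):]
    return name.rsplit("}", 1)[-1]


def tag_inventory(xml_text):
    """Map each element name to the sorted attribute names seen on it."""
    usage = defaultdict(set)
    for el in ET.fromstring(xml_text).iter():
        usage[local(el.tag)].update(local(a) for a in el.attrib)
    return {tag: sorted(attrs) for tag, attrs in sorted(usage.items())}


# Tiny stand-in document (an assumption); in practice, read 01.xml instead.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <p><seg corresp="#a">text</seg><lb/><seg corresp="#b" type="x">more</seg></p>
  </body></text>
</TEI>"""

inventory = tag_inventory(SAMPLE)
for tag, attrs in inventory.items():
    print(tag, attrs)
```

The inventory only records which attributes occur on which elements. It says nothing about nesting or ordering, so it is a starting point for a schema rather than a schema in itself.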

As a result, the following RNG file was created.

[RNG file listing not recoverable from the source. The legible fragments show a RELAX NG grammar with documentation annotations and `param` pattern constraints on attribute values.]
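For comparison with the AI-generated file, a very loose grammar can also be produced deterministically from an element-to-attributes inventory. This is a swapped-in technique, not the method used in this article, and the result is much weaker than a hand-tuned or AI-written schema: every observed element may contain text and any other observed element, and every observed attribute is optional. The names `build_rng` and `anyObserved` and the sample inventory are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

RNG_NS = "http://relaxng.org/ns/structure/1.0"
TEI_NS = "http://www.tei-c.org/ns/1.0"


def build_rng(inventory, root="TEI"):
    """Build a loose RELAX NG grammar from {element: [attribute, ...]}."""
    ET.register_namespace("", RNG_NS)  # serialize RELAX NG as the default namespace

    def q(name):
        return "{%s}%s" % (RNG_NS, name)

    grammar = ET.Element(q("grammar"), {"ns": TEI_NS})
    start = ET.SubElement(grammar, q("start"))
    ET.SubElement(start, q("ref"), {"name": root})

    # One shared content model: text mixed with any observed element.
    content = ET.SubElement(grammar, q("define"), {"name": "anyObserved"})
    choice = ET.SubElement(ET.SubElement(content, q("zeroOrMore")), q("choice"))
    ET.SubElement(choice, q("text"))
    for tag in inventory:
        ET.SubElement(choice, q("ref"), {"name": tag})

    # One define per element, listing only the attributes actually seen on it.
    for tag, attrs in inventory.items():
        define = ET.SubElement(grammar, q("define"), {"name": tag})
        element = ET.SubElement(define, q("element"), {"name": tag})
        for attr in attrs:
            optional = ET.SubElement(element, q("optional"))
            ET.SubElement(optional, q("attribute"), {"name": attr})
        ET.SubElement(element, q("ref"), {"name": "anyObserved"})
    return ET.tostring(grammar, encoding="unicode")


# Hypothetical inventory; a real one would come from the target XML file.
rng_text = build_rng({"p": [], "seg": ["corresp", "xml:id"]}, root="p")
print(rng_text)
```

Because the content model is shared, this sketch restricts which tags and attributes are offered but not where they may appear. Tightening the nesting rules would require recording parent-child relationships as well.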

Applying to XML

Reference the generated RNG file from the XML file as follows.

<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="01_custom.rng" type="application/xml" schematypens="http://relaxng.org/ns/structure/1.0"?>
<?xml-model href="01_custom.rng" type="application/xml"
    schematypens="http://purl.oclc.org/dsdl/schematron"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
...
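When many files need to be switched to a new schema, the `xml-model` processing instruction can be rewritten with a small script instead of by hand. The sketch below is one possible approach using regular expressions from the standard library. The function name `point_to_schema` and the file name `custom.rng` are placeholders chosen for this example.

```python
import re

RELAXNG_NS = "http://relaxng.org/ns/structure/1.0"

# An existing xml-model instruction that declares a RELAX NG schema.
# [^?] keeps the match inside a single processing instruction.
PI_RE = re.compile(
    r'<\?xml-model\b[^?]*?schematypens="' + re.escape(RELAXNG_NS) + r'"[^?]*\?>'
)


def point_to_schema(xml_text, href):
    """Return xml_text with its RELAX NG xml-model instruction pointing at href."""
    pi = '<?xml-model href="%s" type="application/xml" schematypens="%s"?>' % (
        href,
        RELAXNG_NS,
    )
    if PI_RE.search(xml_text):
        # Replace the first RELAX NG instruction; others are left untouched.
        return PI_RE.sub(lambda m: pi, xml_text, count=1)
    # No RELAX NG instruction yet: insert one after the XML declaration, if any.
    decl = re.match(r"<\?xml\s[^?]*\?>\s*", xml_text)
    pos = decl.end() if decl else 0
    return xml_text[:pos] + pi + "\n" + xml_text[pos:]


source = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<?xml-model href="http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng"'
    ' type="application/xml" schematypens="http://relaxng.org/ns/structure/1.0"?>\n'
    '<TEI xmlns="http://www.tei-c.org/ns/1.0"/>'
)
updated = point_to_schema(source, "custom.rng")
print(updated)
```

Text-level rewriting is used here because the standard-library XML parser does not preserve processing instructions that precede the root element. A Schematron `xml-model` instruction, if present, is not modified by this sketch.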

As a result, the editor now suggests only the limited set of tags.

The available attributes are limited in the same way.

Attribute values are also checked against format constraints.
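In RELAX NG, format constraints of this kind are typically written as a pattern parameter on an XSD datatype, and such a pattern has to match the whole attribute value rather than a substring. The sketch below mimics that behavior with `re.fullmatch`. The pattern itself is a hypothetical example, not one taken from the generated RNG file, and XSD regular expression syntax differs from Python's, so only simple patterns carry over unchanged.

```python
import re

# Hypothetical constraint: a pointer value is "#" followed by an identifier.
POINTER = r"#[A-Za-z_][A-Za-z0-9_.-]*"


def satisfies(value, pattern):
    """Pattern facets apply to the entire value, hence fullmatch, not search."""
    return re.fullmatch(pattern, value) is not None


print(satisfies("#page-12", POINTER))  # True
print(satisfies("page-12", POINTER))   # False: the leading "#" is missing
print(satisfies("#page 12", POINTER))  # False: a space is not allowed
```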

The same restrictions apply in Scholarly XML for VSCode.

https://marketplace.visualstudio.com/items?itemName=raffazizzi.sxml

Setting up an editing environment like this should reduce inconsistencies in the output produced by different editors.

Summary

I hope this serves as a useful example of creating RNG files bottom-up.