Overview

I tried out the workflow for processing Dataverse data with Archivematica, so here are my notes.

Background

Archivematica provides a feature for importing data from Dataverse.

https://www.archivematica.org/en/docs/archivematica-1.17/user-manual/transfer/dataverse/

I learned about this feature at the following lecture, so I decided to try it out.

https://www.kulib.kyoto-u.ac.jp/bulletin/1402322

Dataverse

I used the Demo Dataverse that was also used in the following article.

I uploaded the following data.

https://demo.dataverse.org/dataset.xhtml?persistentId=doi:10.70122/FK2/IHQZL3

From the dataset page, download both the image file itself and the metadata as JSON. To get the JSON, open the Metadata tab and select JSON from Export Metadata.
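The same export can probably be fetched without the browser. The following is a minimal sketch, assuming that the JSON option in the UI corresponds to the `dataverse_json` exporter of the Dataverse export API; the function names `build_export_url` and `download_metadata` are made up for illustration, and I only used the web interface myself.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_export_url(server: str, persistent_id: str,
                     exporter: str = "dataverse_json") -> str:
    """Build the metadata export URL for one dataset.

    Assumption: the UI's "JSON" export matches the dataverse_json exporter.
    """
    query = urlencode({"exporter": exporter, "persistentId": persistent_id})
    return f"{server.rstrip('/')}/api/datasets/export?{query}"


def download_metadata(server: str, persistent_id: str,
                      dest: str = "dataset.json") -> None:
    """Fetch the exported JSON and save it to dest (needs network access)."""
    with urlopen(build_export_url(server, persistent_id)) as response:
        data = response.read()
    with open(dest, "wb") as out:
        out.write(data)


if __name__ == "__main__":
    # Only prints the URL; call download_metadata() to actually fetch it.
    print(build_export_url("https://demo.dataverse.org",
                           "doi:10.70122/FK2/IHQZL3"))
```

Saving the result directly as dataset.json matches the file name used in the data preparation step.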

Below is a part of the JSON file. metadataBlocks contains the metadata and files contains the image file information.

[JSON excerpt: the exported dataset record, with the descriptive fields under metadataBlocks and the image file entry under files]
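To check what an exported file contains, a short script can list the field names under metadataBlocks and the file labels under files. This is only a sketch: the `sample` record is a hypothetical, heavily trimmed stand-in rather than my actual export, and because the wrapper key around metadataBlocks can differ between export routes, the helper `find_key` simply searches for the key at any depth.

```python
import json
from typing import Any, Optional


def find_key(node: Any, key: str) -> Optional[Any]:
    """Return the first value stored under key anywhere in nested dicts/lists."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def summarize(dataset: dict) -> dict:
    """Collect metadata field names and file labels from an exported record."""
    blocks = find_key(dataset, "metadataBlocks") or {}
    files = find_key(dataset, "files") or []
    field_names = [
        field["typeName"]
        for block in blocks.values()
        for field in block.get("fields", [])
    ]
    file_labels = [entry.get("label") for entry in files]
    return {"fields": field_names, "files": file_labels}


# Hypothetical stand-in for dataset.json (values are placeholders).
sample = json.loads("""
{
  "datasetVersion": {
    "metadataBlocks": {
      "citation": {
        "fields": [
          {"typeName": "title", "multiple": false,
           "typeClass": "primitive", "value": "Example title"}
        ]
      }
    },
    "files": [{"label": "example.jpg"}]
  }
}
""")

if __name__ == "__main__":
    print(summarize(sample))
```

For a real file, replace `sample` with `json.load(open("dataset.json", encoding="utf-8"))`.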

Data Preparation

Dataverse sample data is stored at the following location.

https://github.com/artefactual/archivematica-sampledata/tree/master/SampleTransfers/Dataverse

Following that layout, save the JSON file downloaded from Dataverse as dataset.json in the metadata folder of the transfer.
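The folder can also be assembled with a few lines of code. This is a minimal sketch, assuming the data files sit at the top level of the transfer folder next to the metadata folder, as I understand the sample transfers to be laid out; `prepare_transfer` and the paths in the usage note are hypothetical names.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Iterable


def prepare_transfer(transfer_dir: Path, data_files: Iterable[Path],
                     exported_json: Path) -> Path:
    """Lay out a transfer folder: data files on top, metadata/dataset.json below.

    Assumption: data files belong at the top level of the transfer folder.
    """
    metadata_dir = transfer_dir / "metadata"
    metadata_dir.mkdir(parents=True, exist_ok=True)
    for src in data_files:
        shutil.copy2(src, transfer_dir / src.name)
    target = metadata_dir / "dataset.json"
    shutil.copy2(exported_json, target)  # rename the export to dataset.json
    return target


if __name__ == "__main__":
    # Self-contained demo using throwaway files in a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        image = base / "image.jpg"
        image.write_bytes(b"placeholder image bytes")
        export = base / "export.json"
        export.write_text("{}", encoding="utf-8")
        prepare_transfer(base / "my-transfer", [image], export)
        for path in sorted((base / "my-transfer").rglob("*")):
            print(path.relative_to(base))
```

With real data, the call would look like `prepare_transfer(Path("my-transfer"), [Path("image.jpg")], Path("export.json"))`.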

Here, following the article below, I placed the data in the mdx.jp object storage connected to GakuNin RDM, and processed it from an Archivematica instance connected to the same object storage.

Processing in Archivematica

Set the Transfer type to “Dataverse”, select the folder created earlier, and start processing.

As a result, a METS file was created as follows. dmdSec_1 appears twice; it is unclear whether this comes from how I registered the data or from a bug. Either way, the contents of dataset.json were written out in DDI format.

[METS excerpt: the duplicated dmdSec_1 sections, with the contents of dataset.json expressed in DDI format]
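A duplicated ID like this is easy to overlook by eye, so it can help to check for it mechanically. The sketch below counts the ID attributes of mets:dmdSec elements with the Python standard library; `sample` is a hypothetical, stripped-down METS document rather than the file Archivematica produced, and `duplicated_dmdsec_ids` is a name chosen for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from typing import List

METS_NS = "http://www.loc.gov/METS/"  # standard METS namespace


def duplicated_dmdsec_ids(root: ET.Element) -> List[str]:
    """Return the dmdSec ID values that occur more than once."""
    ids = [el.get("ID") for el in root.iter(f"{{{METS_NS}}}dmdSec")]
    counts = Counter(i for i in ids if i is not None)
    return sorted(i for i, n in counts.items() if n > 1)


# Hypothetical minimal METS with the same symptom as described above.
sample = """<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:dmdSec ID="dmdSec_1"/>
  <mets:dmdSec ID="dmdSec_1"/>
  <mets:dmdSec ID="dmdSec_2"/>
</mets:mets>"""

if __name__ == "__main__":
    print(duplicated_dmdsec_ids(ET.fromstring(sample)))
```

For a METS file on disk, pass `ET.parse(path).getroot()` instead of the parsed sample string.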

Summary

I found this to be a very useful feature when considering long-term preservation of research data. I hope this serves as a helpful reference for connecting Dataverse and Archivematica.