Introduction

This article explains how to upload XML files in TEI (Text Encoding Initiative) format to GitHub and obtain URLs that anyone can access.

TEI/XML is an internationally used de facto standard format for structurally describing texts such as historical documents and literary works. By using GitHub, you can share your research data with researchers around the world.

What You Need

  • A computer (Windows, Mac, or Linux)
  • Internet connection
  • TEI/XML files (that you already have)
  • Email address (for creating a GitHub account)

About Sample Files

If you don’t have TEI/XML files, you can use the following TEI/XML file from the Koui Genji Monogatari for practice:

Sample File URL:

https://raw.githubusercontent.com/kouigenjimonogatari/kouigenjimonogatari.github.io/master/tei/01.xml

How to download this file:

  1. Open the above URL in your browser
  2. Right-click and select “Save as”
  3. Set the filename to something like “koukin_genji_01.xml” and save
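
For readers who prefer a script to the browser, the same download can be sketched in Python using only the standard library. The function name `download_file` is an illustrative choice and not part of the original procedure; the output filename simply follows the one suggested above.

```python
import shutil
import urllib.request
from pathlib import Path

# The sample file URL given in this article.
SAMPLE_URL = (
    "https://raw.githubusercontent.com/kouigenjimonogatari/"
    "kouigenjimonogatari.github.io/master/tei/01.xml"
)


def download_file(url: str, dest: str) -> Path:
    """Save the resource at `url` to the local path `dest` and return that path."""
    dest_path = Path(dest)
    # Stream the response body straight into the destination file.
    with urllib.request.urlopen(url) as response, dest_path.open("wb") as out:
        shutil.copyfileobj(response, out)
    return dest_path


# Example call (requires an internet connection):
# download_file(SAMPLE_URL, "koukin_genji_01.xml")
```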

Step 1: Creating a GitHub Account

1-1. Access the GitHub Website

  1. Open a browser (Chrome, Firefox, Safari, etc.)
  2. Type https://github.com in the address bar and press Enter

1-2. Account Creation

  1. Click the “Sign up” button in the upper right

  2. Enter the following information:

    • Email address: Your regular email address
    • Password: A secure password (GitHub requires at least 15 characters, or at least 8 characters including a number and a lowercase letter)
    • Username: A name that doesn’t conflict with others (e.g., yamada-taro2024)
  3. Click “Create account”

  4. Enter the verification code sent to your email

Hint: You can change your username later, but doing so can break links that point to your profile and repositories, so choose carefully. If you will use the account as a researcher, a name based on your real name is recommended.


Step 2: Creating a New Repository

A repository is like a “project folder” for storing files.

2-1. Navigate to Repository Creation

  1. While logged into GitHub, click the “+” icon in the upper right
  2. Select “New repository”

2-2. Repository Settings

Enter/select the following items:

Repository name

  • Alphanumeric characters and hyphens (-) can be used
  • Examples: tei-xml-collection, medieval-texts-tei

Description (optional)

  • A brief description of the project
  • Example: “Collection of TEI/XML files for medieval texts”

Public/Private

  • Public: Viewable by anyone (select this; it is required for URLs that anyone can access)
  • Private: Viewable only by invited people

Initialize this repository with:

  • Check Add a README file
  • Other options can be left as default

When you have finished, click the green “Create repository” button.

Step 3: Uploading TEI/XML Files

3-1. Navigate to Upload Screen

  1. After the repository is created, the file list screen is displayed
  2. Click the “Add file” button
  3. Select “Upload files”

3-2. File Upload

Method A: Drag & Drop

  1. Open Explorer (Windows) or Finder (Mac) to the location of your TEI/XML files
  2. Select the files you want to upload (multiple selection is possible)
  3. Drag and drop them into the dotted border area in the browser

Method B: File Selection

  1. Click “choose your files”
  2. Select the TEI/XML files in the file selection dialog
  3. Click “Open”

Example: Uploading a TEI/XML File of the Koui Genji Monogatari

A concrete example using the sample file:

  1. Select the previously downloaded “koukin_genji_01.xml”
  2. The file size is small (about 100KB), so it uploads quickly
  3. If the filename is in Japanese, it is recommended to change it to alphanumeric characters (e.g., genji_chapter01.xml)
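
The filename recommendation in step 3 can be checked before uploading. The following is a minimal sketch; the set of allowed characters (ASCII letters, digits, dot, underscore, hyphen) is an assumption chosen for illustration, not a rule imposed by GitHub, and `is_safe_filename` is a hypothetical helper name.

```python
import re

# Letters, digits, dot, underscore, and hyphen only (an assumed "safe" set).
SAFE_NAME = re.compile(r"[A-Za-z0-9._-]+")


def is_safe_filename(name: str) -> bool:
    """Return True if `name` uses only ASCII letters, digits, '.', '_' or '-'."""
    return SAFE_NAME.fullmatch(name) is not None
```

For example, `is_safe_filename("genji_chapter01.xml")` returns True, while a filename containing Japanese characters or spaces returns False.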

3-3. Confirming the Upload

  1. Confirm that the file has been uploaded
  2. In the Commit changes section:
    • Enter a brief description in the first input field (e.g., “Add TEI/XML files”)
    • Detailed description is optional
  3. Click the green “Commit changes” button

Hint: You can also upload entire folders. The folder structure is preserved during upload.


Step 4: Getting the File Reference URL

4-1. Opening the File

  1. Click the TEI/XML file you want to share from the repository file list
  2. The file contents will be displayed

4-2. How to Get the URL

Method 1: Raw URL

Convenient for processing with programs:

  1. Click the “Raw” button in the upper right of the file display screen
  2. Copy the URL shown in the browser’s address bar once the raw file is displayed
  3. Example: https://raw.githubusercontent.com/[username]/[repository-name]/main/[filename].xml
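
Because the Raw URL returns the XML itself, a program can fetch it and parse it directly. The sketch below reads the title from the TEI header using Python's standard library. It assumes the file uses the usual TEI namespace and keeps its title inside `teiHeader`; `tei_title` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# The TEI namespace; elements in TEI files are normally declared in it.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def tei_title(xml_bytes: bytes) -> Optional[str]:
    """Return the text of the first <title> inside <teiHeader>, or None if absent."""
    root = ET.fromstring(xml_bytes)
    header = root.find(".//tei:teiHeader", TEI_NS)
    if header is None:
        return None
    title = header.find(".//tei:title", TEI_NS)
    if title is None:
        return None
    # Join nested text nodes and trim surrounding whitespace.
    return "".join(title.itertext()).strip()


# Example use with a Raw URL (requires an internet connection):
# import urllib.request
# with urllib.request.urlopen("https://raw.githubusercontent.com/...") as r:
#     print(tei_title(r.read()))
```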

Method 2: Browser Display URL

You can share this URL directly:

  • Copy the URL displayed in the browser’s address bar
  • Example: https://github.com/[username]/[repository-name]/blob/main/[filename].xml

Method 3: Download URL

  1. Right-click the “Download” button in the upper right of the file display screen
  2. Select “Copy link address”
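
The Raw URL and the browser display URL follow fixed patterns, so they can be built from the username, repository name, branch, and file path. The following sketch encodes the two patterns shown in the examples above; the function names are hypothetical, and the branch must match your repository (the examples use `main`, while the sample repository uses `master`).

```python
def raw_url(user: str, repo: str, branch: str, path: str) -> str:
    """Build the Raw URL for a file in a public repository."""
    return f"https://raw.githubusercontent.com/{user}/{repo}/{branch}/{path}"


def blob_url(user: str, repo: str, branch: str, path: str) -> str:
    """Build the browser display URL for the same file."""
    return f"https://github.com/{user}/{repo}/blob/{branch}/{path}"


def blob_to_raw(url: str) -> str:
    """Convert a browser display URL into the corresponding Raw URL."""
    prefix = "https://github.com/"
    if not url.startswith(prefix):
        raise ValueError("not a github.com URL")
    # Split into user, repo, the literal "blob", and "branch/path".
    user, repo, marker, rest = url[len(prefix):].split("/", 3)
    if marker != "blob":
        raise ValueError("not a file display URL")
    return f"https://raw.githubusercontent.com/{user}/{repo}/{rest}"
```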

4-3. Display Examples in TEI/XML Viewers

TEI/XML files published on GitHub can also be displayed in a user-friendly way using specialized viewers.

Viewer 1: TEI Classical Text Viewer

https://candra.dhii.jp/nagasaki/tei/tei_viewer/?file=[yourGitHubRawURL]

Example:

https://candra.dhii.jp/nagasaki/tei/tei_viewer/?file=https://raw.githubusercontent.com/kouigenjimonogatari/kouigenjimonogatari.github.io/master/tei/01.xml
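
If you publish many files, the viewer links can be generated rather than typed by hand. This small sketch follows the URL pattern shown above. It appends the Raw URL without percent-encoding because that is how the example in this article is written; whether the viewer also accepts encoded URLs is not confirmed here. `viewer_url` is a hypothetical helper name.

```python
VIEWER_BASE = "https://candra.dhii.jp/nagasaki/tei/tei_viewer/"


def viewer_url(raw_file_url: str) -> str:
    """Return a viewer link for the given GitHub Raw URL."""
    # Appended as-is (no percent-encoding), matching the article's example.
    return f"{VIEWER_BASE}?file={raw_file_url}"
```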

By using the Raw URL of files uploaded to GitHub, you can integrate with various external tools. Choose the optimal viewer according to your research purposes and audience needs.


Step 5: Organizing Multiple Files (Advanced)

5-1. Creating Folders

GitHub’s web interface has no button for creating an empty folder, but you can create a folder structure by including slashes (/) in a filename.

  1. Select “Add file” -> “Create new file”
  2. Enter “folder-name/filename” in the filename input field
    • Example: manuscripts/text001.xml
  3. Enter or paste the content and click “Commit changes” (shown as “Commit new file” in older layouts)

5-2. Moving Existing Files

  1. Click the file you want to move to open it
  2. Click the pencil icon (Edit)
  3. Add the folder name before the filename
    • Before: text001.xml
    • After: manuscripts/text001.xml
  4. Click “Commit changes”

Frequently Asked Questions (FAQ)

Q1. Is there a file size limit for uploads?

A: Uploads via the browser are limited to 25MB per file. Files of up to 100MB can be added with Git from the command line, and anything larger requires a feature called Git LFS.
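
Before uploading a large collection, you can check locally which files would exceed the browser limit. The sketch below is one way to do this; it interprets the limit as 25 × 1024 × 1024 bytes, which is an assumption about how “25MB” is counted, and `files_over_limit` is a hypothetical helper name.

```python
from pathlib import Path

# Assumed interpretation of the 25MB browser upload limit, in bytes.
BROWSER_LIMIT_BYTES = 25 * 1024 * 1024


def files_over_limit(folder: str, limit: int = BROWSER_LIMIT_BYTES) -> list[Path]:
    """Return the .xml files under `folder` (recursively) that are larger than `limit`."""
    return sorted(
        p for p in Path(folder).rglob("*.xml")
        if p.is_file() and p.stat().st_size > limit
    )
```

Calling `files_over_limit("my-tei-files")` on a local folder returns an empty list when every XML file can be uploaded through the browser.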

Q2. How do I delete a file that was uploaded by mistake?

A:

  1. Open the file you want to delete
  2. Click the trash can icon (in some layouts, open the “...” menu in the upper right and select “Delete file”)
  3. Enter a reason for deletion and click “Commit changes”

Q3. Can I make a private repository public later?

A:

  1. Open the “Settings” tab on the repository page
  2. Go to the “Danger Zone” section at the bottom
  3. Click “Change visibility”
  4. Select “Make public”

Q4. Can other researchers co-edit?

A:

  1. Go to “Settings” -> “Collaborators” in the repository (labeled “Manage access” in older layouts)
  2. Click “Add people” (formerly “Invite a collaborator”)
  3. Enter the collaborator’s GitHub username or email address

Q5. How can I display TEI/XML files nicely?

A: GitHub displays XML files as plain source text, but you can make them more readable with the following methods:

  • Install browser extensions (such as XML Tree)
  • Publish as a website using GitHub Pages (requires separate configuration)

Security and Privacy Considerations

Checklist Before Publishing

  • Does it contain personal information (email addresses, phone numbers, etc.)?
  • Are there any copyright issues?
  • Is there any research ethics concern with publishing?
  • Have co-researchers given their consent?
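
Part of the first checklist item can be pre-screened with a script. The following sketch lists strings that look like email addresses in a file's text. The pattern is deliberately simple and is an assumption made for illustration: it can miss addresses or flag false positives, so it supplements a manual review rather than replacing it. `find_email_like` is a hypothetical helper name.

```python
import re

# A deliberately simple pattern for strings that look like email addresses.
EMAIL_LIKE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def find_email_like(text: str) -> list[str]:
    """Return the distinct email-like strings found in `text`, sorted alphabetically."""
    return sorted(set(EMAIL_LIKE.findall(text)))


# Example use on a local file:
# from pathlib import Path
# print(find_email_like(Path("genji_chapter01.xml").read_text(encoding="utf-8")))
```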

Recommendations

  1. Specify a license: State the usage conditions in the README.md file or in a LICENSE file
  2. Regular backups: Keep copies locally as well
  3. Use change history: GitHub records all changes

For Those Who Want to Learn More

Next Steps

  1. GitHub Pages: Publish TEI/XML as a website
  2. XSLT Transformation: Convert XML to HTML for readable display
  3. DOI Acquisition: Link with Zenodo to enable academic citation

Summary

This guide explained the basic procedure for publishing TEI/XML files on GitHub.

Main steps:

  1. Create a GitHub account
  2. Create a repository
  3. Upload files
  4. Get sharing URLs
  5. Organize files

By using GitHub, you can easily store, share, and version-control your research data. It may feel unfamiliar at first, but you will appreciate its convenience as you use it.

If you have any questions, please refer to GitHub’s help documentation or ask in the Digital Humanities community.