Introduction
Omeka-S is a powerful digital archive system, but by default, Japanese full-text search barely functions. This article explains how to achieve Japanese full-text search by introducing the MroongaSearch module.
Why the MroongaSearch Module is Essential
Problems with Omeka-S Standard Search
Omeka-S’s standard full-text search (FullTextSearch module) uses the InnoDB engine, which has the following critical issues:
Example of Japanese word search:
InnoDB’s full-text search assumes space-delimited languages such as English. For Japanese text, which has no spaces between words, this means:
- Word search is impossible: The entire string is treated as a single word
- Partial matching also fails: FULLTEXT indexes cannot process Japanese correctly
- Zero search results: Users cannot find anything
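The effect can be reproduced outside the database. The following is a minimal Python sketch, not InnoDB's actual parser: it assumes a tokenizer that splits only on whitespace, and the sample sentences are made up for illustration.

```python
def whitespace_tokens(text: str) -> list[str]:
    """Split on whitespace only, roughly how a space-delimited parser sees text."""
    return text.split()


english = "digital archive of Tokyo University"
japanese = "東京大学のデジタルアーカイブ"

print(whitespace_tokens(english))   # five separate words
print(whitespace_tokens(japanese))  # one token: the whole sentence

# A word-level lookup for 大学 ("university") finds nothing,
# because the only indexed "word" is the entire sentence.
print("大学" in whitespace_tokens(japanese))  # False
```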
The MroongaSearch Module Solution
The MroongaSearch module solves this problem in two stages:
1. Fallback Feature (Active Immediately After Module Installation)
Important: Simply installing the MroongaSearch module enables Japanese search to work without any special configuration.
The MroongaSearch module’s fallback feature:
- Automatically detects CJK (Japanese, Chinese, Korean) single-word searches
- Automatically falls back to LIKE '%term%' search
- Works even without Mroonga configured
- Without this, Japanese full-text search simply does not work properly
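The idea behind the fallback can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the module's actual code: the helper names `contains_cjk` and `like_fallback`, the table `fulltext_demo`, and the character ranges are all hypothetical, and an in-memory SQLite database stands in for MariaDB.

```python
import re
import sqlite3

# Hiragana, katakana, CJK unified ideographs, and Hangul syllables (assumed ranges).
CJK_PATTERN = re.compile(r"[\u3040-\u30ff\u4e00-\u9fff\uac00-\ud7af]")


def contains_cjk(term: str) -> bool:
    """Return True if the term contains at least one CJK character."""
    return CJK_PATTERN.search(term) is not None


def like_fallback(conn: sqlite3.Connection, term: str) -> list[str]:
    """Run a LIKE '%term%' query, escaping LIKE wildcards in the term."""
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    rows = conn.execute(
        "SELECT title FROM fulltext_demo WHERE title LIKE ? ESCAPE '\\'",
        (f"%{escaped}%",),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fulltext_demo (title TEXT)")
conn.executemany(
    "INSERT INTO fulltext_demo VALUES (?)",
    [("東京大学のデジタルアーカイブ",), ("Kyoto digital archive",)],
)

term = "大学"
# Fall back to LIKE only for a single CJK word.
if contains_cjk(term) and len(term.split()) == 1:
    print(like_fallback(conn, term))  # ['東京大学のデジタルアーカイブ']
```

The parameterized query and the wildcard escaping are there so that a user-supplied `%` or `_` is matched literally rather than as a pattern.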
2. Fast and Precise Search with Mroonga + TokenMecab (Recommended)
Furthermore, by configuring the Mroonga plugin for MariaDB:
- Precise word search through morphological analysis
- Fast full-text search (hundreds of times faster than LIKE)
- Strict control of AND/OR search
What is the MroongaSearch Module?
MroongaSearch is a full-text search enhancement module for Omeka-S.
Main Features
Automatic fallback feature
- Enables CJK search even without Mroonga configured
- Automatic switching to LIKE search
- Ready to use immediately without configuration
Mroonga integration
- Precise search through morphological analysis
- TokenMecab support
- Fast index-based search
Diagnostic page
- Plugin status verification
- Table engine display
- Tokenizer information
- Manual engine switching
Strict AND/OR search
- More precise search logic than standard FullTextSearch
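As a rough illustration of AND/OR control: in MariaDB boolean-mode full-text queries, a `+` prefix marks a term as required, while unprefixed terms are optional. The sketch below builds such a search string in Python. The function name `boolean_query` is hypothetical, and whether MroongaSearch builds its queries this way is an assumption.

```python
def boolean_query(terms: list[str], mode: str = "AND") -> str:
    """Build a boolean-mode search string; '+' marks a term as required."""
    if mode not in ("AND", "OR"):
        raise ValueError("mode must be 'AND' or 'OR'")
    prefix = "+" if mode == "AND" else ""
    # Empty terms are skipped; operator characters inside terms are not sanitized here.
    return " ".join(f"{prefix}{term}" for term in terms if term)


print(boolean_query(["東京", "大学"], "AND"))  # +東京 +大学
print(boolean_query(["東京", "大学"], "OR"))   # 東京 大学
```

The resulting string would be passed as the search argument of a boolean-mode full-text query. A production version would also need to escape operator characters that appear in user input.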
Developers
- Kentaro Fukuchi (initial version)
- Kazufumi Fukuda (feature extensions)
- Toshihito Waki (current maintainer)
Setup Procedure
Step 1: Install the MroongaSearch Module
Activate the module from the Omeka-S admin panel.
Japanese search will work with just this! (via LIKE search fallback)
Step 2: Build the Mroonga Environment (Recommended)
For faster and more precise search, configure the Mroonga plugin for MariaDB.
For Docker Environments
Directory structure:
mariadb/Dockerfile:
mariadb/init.sql:
docker-compose.yml (mariadb section):
Rebuilding containers:
Step 3: Verify the Setup
1. Verify the Mroonga Plugin
Expected output:
2. Verify TokenMecab
Expected output (excerpt):
If TokenMecab is included, you are all set.
3. Check the MroongaSearch Diagnostic Page
In the Omeka-S admin panel:
Displayed information:
- Plugin status: ACTIVE / NOT ACTIVE
- Table engine: InnoDB / Mroonga
- Tokenizer: TokenMecab / None
- Mroonga effective: YES / NO
If “Mroonga effective: NO”:
- Plugin is ACTIVE but the table engine remains InnoDB
- Fallback search (LIKE) is used
- It works but is slow
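The relationship between these fields can be summarized in a small Python sketch. It assumes that "Mroonga effective" is YES only when the plugin is ACTIVE and the table engine is Mroonga, as the description above implies. The function name `search_mode` is hypothetical and is not part of the module.

```python
def search_mode(plugin_active: bool, table_engine: str) -> str:
    """Return which search path is used for a given diagnostic state."""
    effective = plugin_active and table_engine.lower() == "mroonga"
    return "Mroonga full-text search" if effective else "LIKE fallback"


print(search_mode(True, "InnoDB"))   # LIKE fallback
print(search_mode(True, "Mroonga"))  # Mroonga full-text search
```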

To set “Mroonga effective: YES”:
- Manually switch the engine to Mroonga from the diagnostic page
- Or change directly via SQL:
4. Re-index
Run re-indexing from the diagnostic page or the Omeka-S admin panel.

How Search Works
Without Mroonga Configured (Fallback)
With Mroonga + TokenMecab Configured
Substring Search Also Works
Mroonga supports not only morphological analysis but also substring search:
This allows users to get results even when they do not know the exact words.
Morphological Analysis with TokenMecab
What is Morphological Analysis?
Since Japanese does not have space-delimited words like English, sentences need to be segmented into words.
Example:
This enables word-level searching for terms like “Tokyo” and “university.”
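To make the idea of segmentation concrete, here is a toy Python sketch. It uses greedy longest-match against a tiny hand-written word list, which is an assumption made for illustration: MeCab itself chooses a segmentation using a full dictionary with costs rather than greedy matching. The `segment` function and `toy_dictionary` are hypothetical.

```python
def segment(text: str, dictionary: set[str]) -> list[str]:
    """Greedy longest-match segmentation against a word list."""
    tokens = []
    max_len = max(map(len, dictionary), default=1)
    i = 0
    while i < len(text):
        # Try the longest candidate first; an unknown character becomes its own token.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


toy_dictionary = {"東京", "大学", "図書館", "の"}
print(segment("東京大学の図書館", toy_dictionary))
# ['東京', '大学', 'の', '図書館']
```

Once the text is split this way, each word can be indexed and searched on its own.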
Limitations of Morphological Analysis
TokenMecab is powerful, but may not work as expected in the following cases:
1. Proper Nouns (New Words Not in the Dictionary)
2. Compound Words and Technical Terms
3. Coined Words and Neologisms
4. Multiple Segmentation Patterns
Solutions
- User dictionary: Add custom words to the MeCab dictionary
- TokenBigram: Supplement partial matches with 2-character N-grams
- Fallback: MroongaSearch automatically uses LIKE search as well
Available Tokenizers
| Tokenizer | Description | Use Case |
|---|---|---|
| TokenMecab | Morphological analysis | Japanese search (recommended) |
| TokenBigram | Split into 2-character units | Emphasis on partial matching |
| TokenUnigram | Split into single characters | Exact matching only |
| TokenDelimit | Split by delimiters | English, etc. |
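For comparison with morphological analysis, the 2-character splitting used by a bigram tokenizer can be sketched in Python as follows. This is a simplified illustration: Groonga's real TokenBigram applies additional rules, for example for ASCII text, that the hypothetical `bigrams` helper ignores.

```python
def bigrams(text: str) -> list[str]:
    """Return overlapping 2-character tokens; shorter input is returned as is."""
    if len(text) < 2:
        return [text] if text else []
    return [text[i:i + 2] for i in range(len(text) - 1)]


print(bigrams("東京大学"))  # ['東京', '京大', '大学']
```

Note that 京大 becomes a searchable token even though it spans the boundary between two words. This is the trade-off of N-gram tokenizers: partial matches always succeed, at the cost of some noise in the results.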
Performance Comparison
LIKE Search (Fallback)
- Full table scan
- Delay proportional to data volume
- Still returns results (without the module, Japanese searches return nothing)
Mroonga Full-Text Search
- Uses indexes
- Fast search (hundreds of times faster than LIKE)
- Scalable
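The difference between the two approaches can be illustrated with a small in-memory Python sketch. It is a conceptual model only, assuming a bigram inverted index over three made-up documents. It says nothing about Groonga's actual data structures or real-world timings.

```python
from collections import defaultdict

documents = {
    1: "東京大学のデジタルアーカイブ",
    2: "京都の古文書",
    3: "東京の地図",
}


def bigrams(text: str) -> list[str]:
    """Overlapping 2-character tokens."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


def scan_search(term: str) -> set[int]:
    """LIKE-style search: every document is inspected on every query."""
    return {doc_id for doc_id, text in documents.items() if term in text}


# Index-style search: the index is built once, ahead of time.
index: dict[str, set[int]] = defaultdict(set)
for doc_id, text in documents.items():
    for token in bigrams(text):
        index[token].add(doc_id)


def index_search(term: str) -> set[int]:
    """Look up the term's bigrams, intersect the postings, verify candidates."""
    if len(term) < 2:
        return scan_search(term)  # single characters are not indexed here
    postings = [index.get(token, set()) for token in bigrams(term)]
    candidates = set.intersection(*postings)
    # Verification removes candidates where the bigrams are not adjacent.
    return {doc_id for doc_id in candidates if term in documents[doc_id]}


print(scan_search("東京"))   # {1, 3}
print(index_search("東京"))  # {1, 3}
```

Both functions return the same results, but `scan_search` touches every document while `index_search` only touches documents listed under the term's tokens, which is why index-based search scales better as the collection grows.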
Summary
Importance of the MroongaSearch Module
- Essential: The MroongaSearch module is essential for Japanese full-text search in Omeka-S
- Immediate effect: Search is possible immediately after installation via the fallback feature
- Incremental improvement: Further speed improvement with Mroonga configuration
Recommended Setup
| Level | Configuration | Search Behavior | Performance |
|---|---|---|---|
| Minimum | MroongaSearch module only | LIKE search fallback | Slow (but works) |
| Recommended | MroongaSearch module + Mroonga + TokenMecab | Morphological analysis search | Fast |
Benefits of Implementation
- Japanese search becomes possible: Immediately functional via fallback
- Improved accuracy: Word-level search with TokenMecab
- Speed improvement: Optimization with the Groonga engine
- Flexibility: Both morphological search and partial matching
Conclusion: When handling Japanese content in Omeka-S, the MroongaSearch module is essential.
Reference Links
- MroongaSearch GitHub
- Mroonga Official Site
- MeCab Official Site
- Omeka-S Official Documentation
- MariaDB Mroonga Plugin
Testing Environment
- Omeka-S: 4.1.1
- MroongaSearch: latest
- MariaDB: latest (11.x)
- Docker Compose
- macOS (Darwin 24.6.0)
If you found this article helpful, please star the GitHub repository!