- Investigation date: 2026-02-24
- Target: GakuNin RDM (GRDM) Search API
- Source code: RCOSDP/RDM-osf.io (website/search/ directory)
- Developer guide: RCOSDP/RDM-developer-guide

Note: Official documentation for the Search API could not be found. This article is an investigation record based on both the actual API behavior and the source code.
Overview
GakuNin RDM is a fork of OSF (Open Science Framework), and its source code is available on GitHub (RCOSDP/RDM-osf.io). The search functionality implementation is in the website/search/ directory and consists mainly of the following files.
| File | Role |
|---|---|
| elastic_search.py | Index mapping definitions, document registration/update |
| views.py | API endpoint handlers |
| util.py | Query construction including build_private_search_query() |
| search.py | High-level interface |
In Japanese environments, Elasticsearch’s kuromoji_analyzer is used (confirmed in the source code).
Request Format
| Parameter | Description | Notes |
|---|---|---|
| api_version | vendor: "grdm", version: 2 | version supports 1 and 2 (confirmed in source code) |
| elasticsearch_dsl.query | Elasticsearch Query DSL | Uses filtered format (ES 2.x syntax) |
| from / size | Pagination | Confirmed to work up to size=100. match_all + size>50 results in a 500 error |
| highlight | field_name:character_count format | GRDM-specific format. Wildcards (comments.*) can also be used |
| sort | Sort order | Described below |
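The parameters in the table above can be assembled into a request body roughly as follows. This is a minimal sketch rather than an official client: the helper name `build_search_body`, the placement of `from` / `size` next to `query` inside `elasticsearch_dsl`, and the comma used to join several highlight entries are assumptions. Only the parameter names, the `filtered` syntax, and the `field_name:character_count` format come from the investigation.

```python
import json


def build_search_body(query_text, sort="modified_desc", start=0, size=10, highlight=None):
    """Assemble a GRDM search request body as a plain dict (hypothetical helper)."""
    if size > 100:
        # Only sizes up to 100 were confirmed to work in the investigation.
        raise ValueError("size above 100 is unverified")
    body = {
        "api_version": {"vendor": "grdm", "version": 2},
        "sort": sort,
        "elasticsearch_dsl": {
            # ES 2.x "filtered" syntax, as noted in the table above.
            "query": {
                "filtered": {
                    "query": {"query_string": {"query": query_text}},
                }
            },
            # Assumption: the pagination keys sit next to "query".
            "from": start,
            "size": size,
        },
    }
    if highlight:
        # field_name:character_count; joining entries with "," is an assumption.
        body["highlight"] = ",".join(
            f"{field}:{chars}" for field, chars in highlight.items()
        )
    return body


body = build_search_body("blockchain", highlight={"name": 30, "comments.*": 60})
print(json.dumps(body, indent=2))
```

Building the body as a dict keeps the size guard and the highlight formatting in one place, so the caveats noted in the table are enforced before a request is sent.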
Sort Options
According to the source code (build_private_search_query in util.py), the following sort targets are defined.
| Sort value | Verified | Description |
|---|---|---|
| modified_desc / modified_asc | Verified | By modification date |
| created_desc / created_asc | Verified | By creation date |
| project_desc / project_asc | Unverified | By project name (defined in source) |
| file_desc / file_asc | Unverified | By file name (defined in source) |
| wiki_desc / wiki_asc | Unverified | By wiki name (defined in source) |
| user_desc / user_asc | Unverified | By user name (defined in source) |
| institution_desc / institution_asc | Unverified | By institution name (defined in source) |
Values such as relevance and title_asc are not defined in the source and were confirmed to result in 400 errors.
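Because undefined sort values are rejected with a 400 error, a client can check the value before sending a request. The sketch below simply copies the sort targets from the table above into a set. The names `SORT_TARGETS`, `DEFINED_SORTS`, and `check_sort` are hypothetical and are not taken from util.py.

```python
from itertools import product

# Sort targets as listed in the table above (copied by hand, not imported from GRDM).
SORT_TARGETS = ("modified", "created", "project", "file", "wiki", "user", "institution")
DEFINED_SORTS = frozenset(
    f"{target}_{direction}"
    for target, direction in product(SORT_TARGETS, ("asc", "desc"))
)


def check_sort(value):
    """Return value if it is a defined sort key, otherwise raise ValueError."""
    if value not in DEFINED_SORTS:
        raise ValueError(f"undefined sort value: {value!r}")
    return value


print(check_sort("modified_desc"))
```

Note that passing this check only means the value is defined in the source. Only the modified and created variants were verified against the live API.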
Fields Indexed (From Source Code)
The fields indexed for each category can be confirmed from update_file(), update_node(), and update_user() in elastic_search.py.
file (update_file())
| Field | Source | _all search | Notes |
|---|---|---|---|
| name | file_.name | Hits | Highlight confirmed |
| normalized_name | unicode_normalize(file_.name) | – | For kuromoji |
| node_title | target.title | Hits | Parent project name |
| creator_name | Obtained from user info | Hits | |
| modifier_name | Obtained from user info | Unverified | |
| tags | File tags | Hits | Highlight confirmed |
| normalized_tags | Normalized tags | Hits | Highlight confirmed |
| extra_search_terms | clean_splitters(file_.name) | Unverified | File name tokenized |
| comments | comments_to_doc() | Unverified | No test data available |
| node_public | target.is_public | – | For filtering |
| node_contributors | Contributor ID list | – | For permission filtering |
| deep_url | File URL | – | For display |
| date_created / date_modified | Datetime | – | For sorting |
| category | Fixed as "file" | – | |
folder_name and parent_title / parent_url are not included in the index. These are added dynamically at response time by format_results() (confirmed in source code).
project (update_node())
| Field | Source | _all search | Notes |
|---|---|---|---|
| title | node.title | Hits | |
| normalized_title | Normalized title | – | For kuromoji |
| description | node.description | Hits | Highlight confirmed |
| normalized_description | Normalized description | – | For kuromoji |
| tags | Project tags | Hits | |
| normalized_tags | Normalized tags | – | |
| contributors | Contributor info | – | |
| creator_name | Creator name | Hits | |
| comments | Comments | Unverified | |
| wikis | Wiki content | Unverified | Mapped via dynamic template |
| license | License info | – | For display |
| affiliated_institutions | Affiliated institutions | – | |
| boost | Boost value | – | |
user (update_user())
| Field | Source | Notes |
|---|---|---|
| user | user.fullname | |
| normalized_user / normalized_names | Normalized names | |
| job / job_title | Job info | |
| ongoing_job / ongoing_job_department / ongoing_job_title | Current workplace | |
| school / ongoing_school* | Education history | |
| emails | Email addresses | |
| social | SNS links | |
| boost | Fixed at 2 | Set to boost user search scores |
Other Categories (Defined in Source Code)
According to the source code, the following categories exist in addition to file, project, and user.
- component – Project sub-components
- registration – Registrations (snapshots)
- preprint – Preprints
- wiki – Wiki pages (text field contains the body)
- comment – Comments
- institution – Institutions
- collectionsubmission – Collection submissions

The text highlight field is most likely for Wiki page content (not file body text).
_all Field Search Targets
Elasticsearch Mapping (From Source Code)
The _all field is analyzed with kuromoji_analyzer. In create_index() of the source code, analyzers are configured for each field, and fields with analyzers configured are included in _all.
Search Targets Confirmed by Experiment
| Search term | Matched field | Category | Confirmed via highlight |
|---|---|---|---|
| "2507" | name (file name) | file | highlight[name] |
| "dmp-project-aaa" | node_title (project name) | file | – |
| "Nakamura" | creator_name (creator name) | file | – |
| "blockchain" | tags (tags) | file | highlight[tags] |
| "arxiv" | tags / normalized_tags | file | highlight[tags], highlight[normalized_tags] |
| "アーカイブズ学" | description (project description) | project | highlight[description] |
| "digital-preservation" | tags (project tags) | project | – |
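The "Confirmed via highlight" column was read off the highlight keys of each hit. The sketch below shows that check under the assumption that each result is a dict with a `highlight` mapping keyed by field name. The function name `matched_fields` and the sample hit are hypothetical, made up for illustration, and are not actual API output.

```python
def matched_fields(result):
    """Return the sorted field names present in a hit's highlight mapping."""
    return sorted(result.get("highlight", {}))


# Hypothetical hit, shaped after the highlight[...] notation used in the table above.
sample = {
    "category": "file",
    "name": "example.txt",
    "highlight": {
        "tags": ["<b>arxiv</b>"],
        "normalized_tags": ["<b>arxiv</b>"],
    },
}
print(matched_fields(sample))
```

A hit without any highlight entry yields an empty list, which corresponds to the "–" rows in the table: the term matched, but the matching field could not be confirmed through highlights.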
Confirmed Not Searchable
| Search term | Target field | Category | Reason |
|---|---|---|---|
| "NII Storage" | folder_name | file | Not included in index (dynamically added by format_results()) |
| "digital preservation framework" | file_description | file | DataCite metadata is not searchable |
| "Clio-X" | .txt file body | file | File body text is not indexed |
| "Victoria Lemieux" | .txt file body | file | Same as above |
Filters
| Filter format | Result |
|---|---|
| {"term": {"category": "file"}} | Works |
| {"and": [{"term": ...}, {"term": ...}]} | Works |
| {"bool": {"should": [...]}} | Works |
| {"bool": {"must": [...]}} | Results in 500 error |
The reason bool+must fails is unknown. Since bool+must is used within build_private_search_query() in the source code, this is presumed to be a limitation on the API wrapper side. As a workaround, and filters or query_string AND syntax can be used.
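The two workarounds mentioned above can be sketched as follows. The helper names `and_filter_query` and `query_string_and` are hypothetical, and the filtered fields (category, tags) are illustrative choices. Only the `and` filter shape and the `filtered` query format are taken from the experiments.

```python
def and_filter_query(text, term_filters):
    """Build an ES 2.x filtered query that combines term filters with "and"."""
    return {
        "filtered": {
            "query": {"query_string": {"query": text}},
            # "and" takes a list of filters, one term filter per field.
            "filter": {
                "and": [{"term": {field: value}} for field, value in term_filters.items()]
            },
        }
    }


def query_string_and(*clauses):
    """Join query_string clauses with AND, parenthesizing each clause."""
    return " AND ".join(f"({clause})" for clause in clauses)


print(and_filter_query("arxiv", {"category": "file", "tags": "arxiv"}))
print(query_string_and("category:file", "tags:arxiv"))
```

The first form keeps the conditions in the filter context, while the second pushes them into the query string itself. Either one avoids sending a bool+must filter.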
Highlights
Confirmed
| Field | Category | Search term used for confirmation |
|---|---|---|
| name | file | "ip2" |
| tags | file | "blockchain", "arxiv" |
| normalized_tags | file | "arxiv" |
| description | project | "アーカイブズ学" |
Unverified
- text – According to the source code, Wiki page body text goes into this field. It should be testable with a project that has a Wiki.
- comments.* – Dynamic field for comments. Unverified because there were no comments in the test data.
- title – Project title. Can be tested with search terms that match the title.
- user – User name.
DataCite Metadata
Editable Fields
Managed via /v2/files/{id}/metadata_records/. However, the fields that OSF allows editing are limited to the following four (confirmed in the validation schema).
| Field | Type | Description |
|---|---|---|
| resource_type | enum | Audio/Video, Dataset, Image, Model, Software, Book, Funding Submission, Journal Article, Lesson, Poster, Preprint, Presentation, Research Tool, Thesis, Other |
| file_description | string | File description text |
| related_publication_doi | string | DOI of related publication (10.xxxx/yyyy format) |
| funders | array | [{"funding_agency": "...", "grant_number": "..."}] |
All fields of the DataCite v4.0 schema (titles, creators, subjects, descriptions, etc.) are defined, but due to OSF's input validation, fields other than the above four result in "Additional properties are not allowed" errors.
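The restriction to four editable fields can be mirrored on the client side before sending an update. The sketch below is not OSF's validation schema: the function name `check_metadata_update`, the DOI regular expression, and the wording of the second and third messages are assumptions, and the resource_type enum is not checked. The server-side schema remains authoritative.

```python
import re

# The four editable fields listed in the table above.
ALLOWED_FIELDS = {"resource_type", "file_description", "related_publication_doi", "funders"}
# Assumption: a loose pattern for the 10.xxxx/yyyy format, not the server's rule.
DOI_PATTERN = re.compile(r"^10\.\d{4,}/\S+$")
FUNDER_KEYS = {"funding_agency", "grant_number"}


def check_metadata_update(payload):
    """Return a list of problems found in a metadata update payload."""
    problems = []
    extra = sorted(set(payload) - ALLOWED_FIELDS)
    if extra:
        problems.append(f"Additional properties are not allowed: {', '.join(extra)}")
    doi = payload.get("related_publication_doi")
    if doi is not None and not DOI_PATTERN.match(doi):
        problems.append("related_publication_doi must look like 10.xxxx/yyyy")
    for funder in payload.get("funders", []):
        if not isinstance(funder, dict) or not set(funder) <= FUNDER_KEYS:
            problems.append("each funder must be an object with funding_agency / grant_number")
    return problems


print(check_metadata_update({"file_description": "Survey data", "titles": []}))
```

An empty list means the payload only touches editable fields. Any other DataCite field, such as titles, is reported before the request is made.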
Relationship to Search
I set a value in file_description and searched for it, but it did not hit. update_file() in the source code also contains no processing that reads DataCite metadata, so DataCite metadata is not included in the search index.
Summary
What Was Confirmed
API behavior (experiments):
- /api/v1/search/ is an Elasticsearch Query DSL-based search API
- It can perform cross-category searches across file, project, and user
- _all full-text search hits on name, node_title, creator_name, tags, and description
- Field-specific queries (name:, tags:, category:), wildcards, and AND/OR operators can be used
- term / and / bool+should filters work
- sort was confirmed to work with 4 types: modified_desc/asc and created_desc/asc
- size was confirmed to work up to 100
- Highlights were confirmed to work for name, tags, normalized_tags, and description
Confirmed from source code:
- kuromoji_analyzer is used in Japanese environments
- folder_name, parent_title, and parent_url are not included in the index and are dynamically added during response
- The text highlight field is for Wiki page content
- sort has definitions for each direction of project, file, wiki, user, institution, created, and modified
- api_version supports version 1 and 2, and vendor supports "grdm"
- In addition to file, project, and user, categories component, registration, preprint, wiki, comment, and institution exist
Confirmed Not Searchable
- Text file body (text within .txt files) – Confirmed by both experiment and source code
- DataCite metadata (file_description, etc.) – Confirmed by both experiment and source code
- folder_name (storage provider name) – Confirmed by both experiment and source code
Unverified
- Whether comments hit in _all search
- Wiki content (text field) search behavior
- Behavior of sort values like project_desc
- Exact upper limit of size
- Cause of 500 error with bool+must filter
Reference Links
- RCOSDP/RDM-osf.io – GakuNin RDM source code
- RCOSDP/RDM-developer-guide – Developer guide
- GakuNin RDM Support Portal Search – User search manual
- website/search/elastic_search.py – Index definitions and document registration
- website/search/views.py – API endpoints
- website/search/util.py – Query construction and sort definitions