Japanese Reranker Cross-Encoder Small v1

hotchpotch/japanese-reranker-cross-encoder-small-v1

published Mar 2024 · updated Jul 2025

Japanese Reranker Cross-Encoder Small v1 is a reranking model that scores how relevant each passage is to a Japanese query, using a cross-encoder architecture.

est. price

~$0.008 / 1k docs (estimated, set at launch)

downloads / mo

334.2K

license

mit

specs

Task	Reranking (CrossEncoder) for Japanese
Architecture	Cross-Encoder (12 layers, hidden size 384)
License	MIT
Pre-trained Base	Microsoft mMiniLMv2-L12-H384

about this model

hotchpotch/japanese-reranker-cross-encoder-small-v1 is a Japanese cross-encoder reranker. It scores the relevance of a query-passage pair, so the results of an initial retrieval step can be reordered to improve search accuracy. It is part of a family of Japanese rerankers and is built on Microsoft mMiniLMv2-L12-H384 (12 layers, hidden size 384).

The model was trained with knowledge distillation from two larger teacher models (japanese-reranker-cross-encoder-large-v1 and japanese-bge-reranker-v2-m3-v1) on a mix of training datasets: JQaRA, JSQuAD, MIRACL, mMARCO, Mr. TyDi, and Wikipedia lead paragraphs. Training was limited to one epoch to prevent overfitting to Wikipedia-based data. The learning rate was 5e-4 with a cosine scheduler, and the batch size of 512 was reached through gradient accumulation.
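The training settings above can be made concrete with a small sketch of a cosine learning-rate schedule and the gradient-accumulation arithmetic. Only the peak learning rate (5e-4), the batch size of 512, and the cosine scheduler come from the description. The absence of warmup, the decay to zero, the total step count, and the per-device batch size are assumptions chosen for illustration, and `cosine_lr` and `accumulation_steps` are hypothetical helper names.

```python
import math


def cosine_lr(step: int, total_steps: int, peak_lr: float = 5e-4) -> float:
    """Cosine decay from peak_lr down to 0 over total_steps.

    Assumption: no warmup phase and a final learning rate of 0; the
    description only states "cosine scheduler" and the peak value.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp the step so values outside [0, total_steps] stay well defined.
    progress = min(max(step, 0), total_steps) / total_steps
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


def accumulation_steps(effective_batch: int, per_device_batch: int,
                       num_devices: int = 1) -> int:
    """Number of micro-batches to accumulate before one optimizer step."""
    per_step = per_device_batch * num_devices
    if per_step <= 0 or effective_batch % per_step != 0:
        raise ValueError("effective_batch must be a multiple of per-step batch")
    return effective_batch // per_step


# Hypothetical numbers: 1000 optimizer steps, micro-batches of 32 on one GPU.
schedule = [cosine_lr(s, 1000) for s in (0, 500, 1000)]
steps_needed = accumulation_steps(512, 32)
```

With these illustrative values the schedule starts at the peak rate, passes through half of it at the midpoint, and ends at zero, while 16 micro-batches of 32 make up one batch of 512.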

Benchmark Results

Evaluation uses NDCG@10 (higher is better) on four Japanese retrieval benchmarks. The table compares the small model with the other sizes in the series and with the Japanese bge-reranker-v2-m3 model:

Model	JQaRA	JaCWIR	MIRACL	JSQuAD
japanese-reranker-cross-encoder-small-v1	0.6247	0.9390	0.7776	0.9604
xsmall-v1	0.6136	0.9376	0.7411	0.9602
base-v1	0.6711	0.9337	0.8180	0.9708
large-v1	0.7099	0.9364	0.8406	0.9773
bge-reranker-v2-m3 (Japanese)	0.6918	0.9372	0.8423	0.9624
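
For readers unfamiliar with the metric in the table, the following is a minimal sketch of NDCG@10 for a single query. It uses the linear-gain form of DCG and computes the ideal ordering from the candidate list it is given. Both choices are assumptions: the exact evaluation code behind the reported numbers is not described here, and `dcg_at_k` and `ndcg_at_k` are illustrative names.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """Discounted cumulative gain over the top-k items (linear gain)."""
    # Rank 0 is discounted by log2(2) = 1, rank 1 by log2(3), and so on.
    return sum(rel / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_relevances: Sequence[float], k: int = 10) -> float:
    """NDCG@k for one query.

    ranked_relevances holds the gold relevance label of each candidate,
    listed in the order the reranker returned them.
    """
    ideal = sorted(ranked_relevances, reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    if ideal_dcg == 0:
        return 0.0  # no relevant candidate at all
    return dcg_at_k(ranked_relevances, k) / ideal_dcg


# One relevant passage ranked second out of three candidates.
example = ndcg_at_k([0, 1, 0])
```

A benchmark score such as those in the table would then be the mean of this per-query value over all queries in the dataset.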

The small variant offers a balanced trade-off between accuracy and inference speed. On the JaCWIR benchmark, run on an RTX 3090, inference took 265 seconds for the small model, compared with 196 seconds for xsmall, 481 seconds for base, and 1,253 seconds for large. This makes it a reasonable choice for latency-sensitive applications that still need strong Japanese reranking performance.
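
To put the JaCWIR timings in relative terms, the short sketch below divides each reported time by the small model's time. The four numbers are taken from the paragraph above; the dictionary and function names are illustrative only.

```python
# JaCWIR inference times in seconds on an RTX 3090, as reported above.
JACWIR_SECONDS = {"xsmall": 196, "small": 265, "base": 481, "large": 1253}


def relative_time(seconds: dict, baseline: str = "small") -> dict:
    """Express each model's time as a multiple of the baseline model's time."""
    base = seconds[baseline]
    return {name: round(t / base, 2) for name, t in seconds.items()}


ratios = relative_time(JACWIR_SECONDS)
# ratios == {"xsmall": 0.74, "small": 1.0, "base": 1.82, "large": 4.73}
```

Read this way, base takes roughly 1.8 times as long as small and large roughly 4.7 times as long, while xsmall finishes in about three quarters of the time.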

best for

· Reranking Japanese search engine results
· Improving accuracy of Japanese question-answering retrieval

FAQ

What input format does this model expect?

Pairs of text strings, each made up of a query and a passage. For every pair the model outputs a single relevance score between 0 and 1, produced by a sigmoid activation.
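
A minimal sketch of that input and output shape, assuming the model is loaded locally through the sentence-transformers `CrossEncoder` class. The `max_length=512` setting and the helper names `load_score_fn` and `rerank` are assumptions made for illustration, not part of this page; the scoring function is passed in as a parameter so the ranking logic can be tried without downloading the model.

```python
from typing import Callable, List, Optional, Sequence, Tuple

MODEL_NAME = "hotchpotch/japanese-reranker-cross-encoder-small-v1"

# A scoring function maps (query, passage) pairs to one score per pair.
ScoreFn = Callable[[List[Tuple[str, str]]], Sequence[float]]


def load_score_fn(model_name: str = MODEL_NAME, max_length: int = 512) -> ScoreFn:
    """Build a scoring function backed by sentence-transformers.

    The import is done lazily so the rest of this sketch runs without the
    package installed. max_length=512 is an assumed setting.
    """
    from sentence_transformers import CrossEncoder

    model = CrossEncoder(model_name, max_length=max_length)
    return lambda pairs: model.predict(pairs).tolist()


def rerank(query: str, passages: Sequence[str], score_fn: ScoreFn,
           top_k: Optional[int] = None) -> List[Tuple[str, float]]:
    """Score every (query, passage) pair and sort by descending score."""
    pairs = [(query, passage) for passage in passages]
    scores = score_fn(pairs)
    ranked = sorted(zip(passages, scores), key=lambda item: item[1], reverse=True)
    return ranked if top_k is None else ranked[:top_k]


# Usage (downloads the model on first call):
#   score_fn = load_score_fn()
#   rerank("富士山の高さは?", ["富士山の標高は3776メートルです。",
#                            "東京は日本の首都です。"], score_fn)
```

Keeping `rerank` independent of the model also makes it easy to swap in a hosted scoring endpoint later without changing the sorting code.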

How does this model compare to the larger versions in the series?

It has 12 layers (vs 24 for large) and a hidden size of 384 (vs 1024), which gives faster inference while staying competitive in accuracy on Japanese benchmarks.

What is the license for this model?

MIT License, as stated in the model card.

How can I call this model via the gigarouter API?

Once the hosted model is live, use the OpenAI-compatible endpoint with your gigarouter API key; refer to the gigarouter documentation for endpoint details.

Was this model trained with knowledge distillation?

Yes. The small variant was trained with the large and BGE reranker models as teachers, using an MSE loss against their scores as an additional training signal.
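
The MSE distillation term can be sketched as follows. This is only an illustration of the general idea: how the two teachers' scores were actually combined, whether the loss was applied to logits or to sigmoid scores, and how it was weighted against other losses are not stated here. Averaging the two teachers and comparing against the student's sigmoid output are assumptions, and `distillation_mse` is a hypothetical name.

```python
import math
from typing import Sequence


def sigmoid(x: float) -> float:
    """Map a raw logit to a score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def distillation_mse(student_logits: Sequence[float],
                     teacher_a_scores: Sequence[float],
                     teacher_b_scores: Sequence[float]) -> float:
    """Mean squared error between student scores and averaged teacher scores.

    Assumptions: teacher scores are already in [0, 1], and the target for
    each pair is the plain average of the two teachers.
    """
    n = len(student_logits)
    if n == 0 or n != len(teacher_a_scores) or n != len(teacher_b_scores):
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for logit, a, b in zip(student_logits, teacher_a_scores, teacher_b_scores):
        target = (a + b) / 2.0
        total += (sigmoid(logit) - target) ** 2
    return total / n


# Student is undecided (logit 0 -> 0.5) while both teachers are confident.
loss = distillation_mse([0.0, 0.0], [1.0, 0.0], [1.0, 0.0])
```

In a real training loop the same quantity would be computed on tensors and added to whatever label-based loss is in use.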

not yet live

We're benchmarking and onboarding Japanese Reranker Cross-Encoder Small v1 as a hosted, OpenAI-compatible API.

related reranker models

ms-marco-MiniLM-L6-v2	81.5M dl/mo · live
ms-marco-MiniLM-L4-v2	4.8M dl/mo
gte-reranker-modernbert-base	2.7M dl/mo
ms-marco-MiniLM-L12-v2	2.3M dl/mo
jina-reranker-v2-base-multilingual	1.8M dl/mo · live
Qwen3-Reranker-4B	1.8M dl/mo