Japanese Reranker Cross-Encoder Small v1
hotchpotch/japanese-reranker-cross-encoder-small-v1
published Mar 2024 · updated Jul 2025
Japanese Reranker Cross-Encoder Small v1 is a rerank model that scores the relevance between a Japanese query and passages using a Cross-Encoder architecture.
specs
| Field | Value |
|---|---|
| Task | Reranking (CrossEncoder) for Japanese |
| Architecture | Cross-Encoder (12 layers, hidden size 384) |
| License | MIT |
| Pre-trained Base | Microsoft mMiniLMv2-L12-H384 |
about this model
hotchpotch/japanese-reranker-cross-encoder-small-v1 is a Japanese cross-encoder reranker: it scores the relevance of each query-passage pair so that the results of an initial retrieval step can be reordered to improve search accuracy. It belongs to a family of Japanese rerankers, and this variant is built on Microsoft mMiniLMv2-L12-H384 (12 layers, hidden size 384). The model was trained with knowledge distillation from two larger teacher models (japanese-reranker-cross-encoder-large-v1 and japanese-bge-reranker-v2-m3-v1) on a mix of training datasets: JQaRA, JSQuAD, MIRACL, mMARCO, Mr. TyDi, and Wikipedia lead paragraphs. Training was limited to one epoch to avoid overfitting to Wikipedia-based data. It used a learning rate of 5e-04, a batch size of 512 (reached through gradient accumulation), and a cosine learning-rate scheduler.
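The reordering step described above can be sketched roughly as follows. This is a minimal illustration, not an official usage recipe: it assumes the model loads through the `CrossEncoder` class of the `sentence-transformers` library, and the helper names (`rerank`, `load_cross_encoder_scorer`, `demo`), the `max_length=512` setting, and the sample query and passages are all hypothetical choices made for the example.

```python
from typing import Callable, List, Sequence, Tuple

MODEL_NAME = "hotchpotch/japanese-reranker-cross-encoder-small-v1"

# A scorer takes (query, passage) pairs and returns one relevance score per pair.
ScoreFn = Callable[[List[Tuple[str, str]]], Sequence[float]]


def rerank(query: str, passages: List[str], score_fn: ScoreFn) -> List[Tuple[str, float]]:
    """Score every (query, passage) pair and return passages sorted best-first."""
    pairs = [(query, passage) for passage in passages]
    scores = [float(s) for s in score_fn(pairs)]
    return sorted(zip(passages, scores), key=lambda item: item[1], reverse=True)


def load_cross_encoder_scorer(model_name: str = MODEL_NAME, max_length: int = 512) -> ScoreFn:
    """Build a scorer backed by sentence-transformers (assumed to be installed)."""
    # Imported lazily so rerank() itself has no heavy dependency.
    from sentence_transformers import CrossEncoder

    model = CrossEncoder(model_name, max_length=max_length)  # max_length is an assumption
    return lambda pairs: model.predict(pairs)


def demo() -> None:
    """Illustrative run; downloads the model weights on first call."""
    scorer = load_cross_encoder_scorer()
    query = "東京の人口はどのくらいですか"  # hypothetical sample query
    passages = [
        "富士山は日本で最も高い山です。",
        "東京都の人口は約1400万人です。",
    ]
    for passage, score in rerank(query, passages, scorer):
        print(f"{score:.4f}\t{passage}")
```

Calling `demo()` runs the full pipeline. Because `rerank` accepts any scoring callable, the same helper can be exercised with a stub scorer when the model is not available locally.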
Benchmark Results
Evaluation uses NDCG@10 on four Japanese retrieval benchmarks. Scores are compared across several models:
| Model | JQaRA | JaCWIR | MIRACL | JSQuAD |
|---|---|---|---|---|
| japanese-reranker-cross-encoder-small-v1 | 0.6247 | 0.9390 | 0.7776 | 0.9604 |
| xsmall-v1 | 0.6136 | 0.9376 | 0.7411 | 0.9602 |
| base-v1 | 0.6711 | 0.9337 | 0.8180 | 0.9708 |
| large-v1 | 0.7099 | 0.9364 | 0.8406 | 0.9773 |
| bge-reranker-v2-m3 (Japanese) | 0.6918 | 0.9372 | 0.8423 | 0.9624 |
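For readers unfamiliar with the metric, NDCG@10 can be computed along these lines. This is a generic sketch of the linear-gain variant, not the evaluation script behind the table; it also assumes the ideal ordering can be derived from the same list of relevance labels, which holds only when every relevant passage appears among the candidates.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """Discounted cumulative gain over the top-k items (linear gain)."""
    # enumerate() is 0-based, so the item at position 1 gets discount log2(2) = 1.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_relevances: Sequence[float], k: int = 10) -> float:
    """NDCG@k for relevance labels listed in the order the reranker returned them."""
    ideal = sorted(ranked_relevances, reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    if ideal_dcg == 0:
        return 0.0  # no relevant item at all
    return dcg_at_k(ranked_relevances, k) / ideal_dcg
```

For instance, `ndcg_at_k([1, 0, 0])` is 1.0 because the only relevant passage is already ranked first, while moving it to second place lowers the score.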
The small variant trades some accuracy for speed. On the JaCWIR benchmark using an RTX 3090, inference took 265 seconds, compared to 196 seconds for the xsmall, 481 seconds for the base, and 1,253 seconds for the large model. That is about 1.8 times faster than base and about 4.7 times faster than large, at the cost of lower JQaRA, MIRACL, and JSQuAD scores than either. This makes it a candidate for latency-sensitive applications that still need strong Japanese reranking performance.
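The runtime comparison can be checked with a few lines of arithmetic. The timings below are the JaCWIR figures quoted above; the dictionary and function names are illustrative only.

```python
from typing import Dict

# JaCWIR inference time in seconds on an RTX 3090, as quoted in the text.
JACWIR_SECONDS: Dict[str, int] = {"xsmall": 196, "small": 265, "base": 481, "large": 1253}


def relative_runtime(reference: str = "small") -> Dict[str, float]:
    """Return each model's runtime as a multiple of the reference model's runtime."""
    ref_seconds = JACWIR_SECONDS[reference]
    return {name: seconds / ref_seconds for name, seconds in JACWIR_SECONDS.items()}


for name, ratio in relative_runtime().items():
    print(f"{name}: {ratio:.2f}x the runtime of small")
```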
best for
- Re-ranking Japanese search engine results
- Improving accuracy of Japanese question-answering retrieval
FAQ
The model takes query-passage pairs as text strings and outputs a relevance score between 0 and 1 using a sigmoid activation.
Compared with the large variant, the small model has 12 layers (vs 24) and a hidden size of 384 (vs 1024), offering faster inference with competitive accuracy on Japanese benchmarks.
The model is released under the MIT License, as stated in the model card.
Use the OpenAI-compatible endpoint with your gigarouter API key; refer to the gigarouter documentation for endpoint details.
The small variant was trained with knowledge distillation: the large and BGE reranker models served as teachers, with MSE loss providing additional training signal.
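The sigmoid scoring and MSE distillation mentioned in the answers above can be illustrated with a small sketch. It is not the actual training code: whether the loss was applied to sigmoid outputs or raw logits, and how the two teachers' scores were combined, is not stated here, so this version simply assumes one teacher score in [0, 1] per pair and compares it with the student's sigmoid output. Both function names are hypothetical.

```python
import math
from typing import Sequence


def sigmoid(logit: float) -> float:
    """Map a raw model output to a score in [0, 1], stable for large |logit|."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    exp_logit = math.exp(logit)
    return exp_logit / (1.0 + exp_logit)


def mse_distillation_loss(student_logits: Sequence[float], teacher_scores: Sequence[float]) -> float:
    """Mean squared error between student sigmoid scores and teacher scores.

    Assumption: teacher_scores are already in [0, 1], one per query-passage pair.
    """
    if len(student_logits) != len(teacher_scores):
        raise ValueError("student and teacher must score the same pairs")
    if not student_logits:
        raise ValueError("at least one pair is required")
    student_scores = [sigmoid(z) for z in student_logits]
    squared_errors = [(s - t) ** 2 for s, t in zip(student_scores, teacher_scores)]
    return sum(squared_errors) / len(squared_errors)
```

A student logit of 0 maps to a score of 0.5, so a teacher score of 0.5 for that pair yields zero loss, and the loss grows as the student's score drifts from the teacher's.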
Japanese Reranker Cross-Encoder Small v1 is currently being benchmarked and onboarded as a hosted, OpenAI-compatible API.