Bulbasaur

Mihaiii/Bulbasaur

published Apr 2024 · updated Apr 2024

Bulbasaur is a distilled embedding model for semantic-autocomplete, based on gte-tiny and fine-tuned on the qa-assistant dataset.

est. price

~$0.008 / 1M tokens · estimated, set at launch

API providers

downloads / mo

200

license

mit

specs

Task	Embeddings
Architecture	Distilled BERT (gte-tiny base)
Parameters	22.7M
Embedding Dimension	384
Max Tokens	512
License	MIT

about this model

Mihaiii/Bulbasaur is an embedding model that converts text into dense vector representations, optimized for semantic-autocomplete and related similarity search tasks. It is a distilled version of gte-tiny, fine-tuned on the qa-assistant dataset, a curated collection of 7,174 question-answer pairs (5,768 training, 1,406 test) with associated relevance scores.

The underlying gte-tiny architecture uses a BERT model with 22.7 million parameters, producing 384-dimensional embeddings via mean pooling. According to the base model’s documentation, gte-tiny achieves performance comparable to thenlper/gte-small at roughly half the model size, making Bulbasaur a compact and efficient choice for retrieval and ranking pipelines.
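Mean pooling, mentioned above, can be illustrated with a minimal sketch. It assumes the encoder has already produced one vector per token plus an attention mask marking padding; the function name and the 4-dimensional toy vectors are illustrative only (the real model emits 384 dimensions), and this is not the model's actual inference code.

```python
from typing import List, Sequence


def mean_pool(token_vectors: Sequence[Sequence[float]],
              attention_mask: Sequence[int]) -> List[float]:
    """Average the vectors of real (mask == 1) tokens, ignoring padding."""
    if len(token_vectors) != len(attention_mask):
        raise ValueError("one mask entry is required per token vector")
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention mask selects no tokens")
    dim = len(kept[0])
    # Sum each dimension across the kept tokens, then divide by their count.
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy example: 3 tokens with 4 dimensions each; the last token is padding.
tokens = [
    [1.0, 2.0, 3.0, 4.0],
    [3.0, 2.0, 1.0, 0.0],
    [9.0, 9.0, 9.0, 9.0],  # padding, excluded by the mask
]
sentence_vector = mean_pool(tokens, [1, 1, 0])
print(sentence_vector)  # [2.0, 2.0, 2.0, 2.0]
```

The result is a single fixed-size vector per input, regardless of how many tokens the text contained.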

The model accepts English text only and truncates inputs longer than 512 tokens. It is being onboarded to gigarouter as a managed, OpenAI-compatible API, so once it is live no local installation or inference code will be required.

best for

· Semantic autocomplete for search bars
· Lightweight sentence embedding for English text

FAQ

What is Bulbasaur best used for?

It is designed for semantic-autocomplete, that is, suggesting search-bar completions based on the meaning of what the user has typed.
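One way to picture this is ranking candidate suggestions by cosine similarity to the embedded query. The sketch below assumes the embeddings have already been computed; the 3-dimensional vectors, the candidate phrases, and the `suggest` helper are made up for illustration and do not come from the model.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def suggest(query_vec: Sequence[float],
            candidates: Dict[str, Sequence[float]],
            top_k: int = 3) -> List[Tuple[str, float]]:
    """Return the top_k candidate texts, most similar to the query first."""
    scored = [(text, cosine_similarity(query_vec, vec))
              for text, vec in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical, hand-written vectors standing in for real 384-d embeddings.
candidates = {
    "reset my password": [0.9, 0.1, 0.0],
    "change account email": [0.6, 0.4, 0.1],
    "track my order": [0.0, 0.2, 0.9],
}
query_vec = [1.0, 0.2, 0.0]  # stands in for the embedding of "forgot login"

for text, score in suggest(query_vec, candidates, top_k=2):
    print(f"{score:.3f}  {text}")
```

In a real search bar the candidate vectors would typically be embedded once and cached, so only the query needs embedding per keystroke.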

What is the embedding dimension and max token length?

It produces 384-dimensional embeddings and supports up to 512 tokens per input.

How does Bulbasaur compare to gte-tiny in size?

Bulbasaur is distilled from gte-tiny. The base model, gte-tiny, has 22.7M parameters and is about half the size of gte-small.

What languages does Bulbasaur support?

It supports English text only.

How can I call Bulbasaur via the gigarouter API?

Once the model is live, send text to the gigarouter OpenAI-compatible endpoint with your API key and receive embeddings in the response.
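As a rough sketch of what such a call might look like, the snippet below builds a request in the shape used by OpenAI-style embeddings endpoints (a JSON body with `model` and `input`, and a bearer token header). The base URL is a hypothetical placeholder, the model identifier is assumed to match the page title, and nothing is sent over the network; check the provider's documentation for the real values.

```python
import json
import urllib.request
from typing import Any, Dict, List

# Hypothetical placeholder; substitute the base URL from the provider's docs.
BASE_URL = "https://api.example.invalid/v1"
MODEL_ID = "Mihaiii/Bulbasaur"  # assumed identifier, not confirmed


def build_embeddings_request(texts: List[str],
                             api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for an embeddings endpoint."""
    payload = json.dumps({"model": MODEL_ID, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/embeddings",
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_embeddings(response_body: Dict[str, Any]) -> List[List[float]]:
    """Pull vectors out of an OpenAI-style response, ordered by input index."""
    items = sorted(response_body["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


request = build_embeddings_request(["how do I reset my pass"],
                                   api_key="YOUR_API_KEY")

# To send it once the model is live:
# with urllib.request.urlopen(request) as resp:
#     vectors = extract_embeddings(json.load(resp))
```

Each returned vector would be expected to have 384 dimensions, matching the specs listed for the model.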

not yet live

We're benchmarking and onboarding Bulbasaur as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.

related embedding models

nomic-embed-text-v1.5