Best free embedding model for multilingual RAG? Comparing the options in the catalog
I'm building a customer support RAG system that needs to handle Polish, German, and English queries against a mixed-language knowledge base. I've been testing free embedding models from the catalog and wanted to share what I've found — and hear what others have experienced.
What I've tested
all-minilm-l6-v2-embeddings: this is my current default for English. Extremely fast (384 dims, CPU-friendly), great quality for in-domain search. But on Polish and German queries against Polish/German documents, retrieval is noticeably worse. It wasn't designed for multilingual use.
For multilingual work I've been looking at paraphrase-multilingual-MiniLM-L12-v2 and intfloat/multilingual-e5-small. Both are available as open-source models, though neither is listed in the catalog as a separate entry yet (hint hint to anyone who wants to publish them!).
My preliminary benchmarks (200-question eval per language)
| Model | EN top-3 recall | PL top-3 recall | DE top-3 recall | Embed speed (CPU) |
|---|---|---|---|---|
| all-MiniLM-L6-v2 | 78% | 51% | 58% | 2,100 docs/min |
| paraphrase-multilingual-MiniLM-L12-v2 | 72% | 71% | 73% | 980 docs/min |
| multilingual-e5-small | 74% | 74% | 76% | 870 docs/min |
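
For anyone who wants to produce comparable numbers, here is a minimal sketch of a top-k recall computation. It assumes "top-3 recall" means the fraction of questions whose single gold document shows up in the first three results; an eval with several relevant documents per question would need a different definition. The function name and the toy IDs are made up for illustration.

```python
from typing import Dict, List


def top_k_recall(ranked: Dict[str, List[str]], gold: Dict[str, str], k: int = 3) -> float:
    """Fraction of questions whose gold document ID is among the top-k results.

    ranked: question ID -> document IDs ordered best-first (retriever output)
    gold:   question ID -> the one document ID that answers the question
    """
    if not gold:
        raise ValueError("gold must contain at least one question")
    hits = sum(
        1 for qid, doc_id in gold.items()
        if doc_id in ranked.get(qid, [])[:k]
    )
    return hits / len(gold)


# Toy data with hypothetical IDs, not the actual eval set.
ranked = {
    "q1": ["d7", "d2", "d9", "d4"],
    "q2": ["d1", "d3", "d5", "d8"],
    "q3": ["d6", "d4", "d2", "d0"],
    "q4": ["d5", "d9", "d1", "d3"],
}
gold = {"q1": "d2", "q2": "d8", "q3": "d6", "q4": "d3"}

recall_at_3 = top_k_recall(ranked, gold, k=3)
print(f"top-3 recall: {recall_at_3:.0%}")
```

Running this once per language, with the same question set translated or written natively, gives one column of a table like the one above.
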
Takeaway: for purely English RAG, all-MiniLM is hard to beat. For multilingual, multilingual-e5-small is slightly ahead (2 to 3 points over paraphrase-multilingual-MiniLM-L12-v2 in every language) but also the slowest. The speed difference matters for large corpora: I'm indexing ~150k documents.
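
To put the speed gap in concrete terms, a quick back-of-the-envelope sketch of full-index time for the ~150k corpus. It assumes the docs/min figures from the table hold constant over the whole run on a single process, which is a simplification (batching, document length, and I/O will all move the real number).

```python
CORPUS_SIZE = 150_000  # documents, as stated in the post

# Throughput in docs/min, copied from the benchmark table.
throughput = {
    "all-MiniLM-L6-v2": 2100,
    "paraphrase-multilingual-MiniLM-L12-v2": 980,
    "multilingual-e5-small": 870,
}


def indexing_minutes(num_docs: int, docs_per_min: float) -> float:
    """Estimated wall-clock minutes to embed num_docs at a constant rate."""
    if docs_per_min <= 0:
        raise ValueError("docs_per_min must be positive")
    return num_docs / docs_per_min


estimates = {
    model: indexing_minutes(CORPUS_SIZE, rate)
    for model, rate in throughput.items()
}

for model, minutes in estimates.items():
    print(f"{model}: {minutes:.0f} min ({minutes / 60:.1f} h)")
```

Under those assumptions that is roughly 71, 153, and 172 minutes respectively, so a full re-index with multilingual-e5-small costs about 100 minutes more than with all-MiniLM.
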
Questions for the community
- Has anyone used LaBSE (Language-agnostic BERT Sentence Embedding) for multilingual RAG? It handles 100+ languages but is much heavier than the MiniLM-class models above.
- For low-resource languages (Polish isn't tiny but it's not English), does fine-tuning a base multilingual model on domain data actually move the needle meaningfully?
- Is anyone combining a multilingual embedding layer with a cross-encoder reranker for the second stage? That's my next experiment.
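
For the reranker question, the two-stage shape would look roughly like the sketch below. The scorers here are toy word-overlap stand-ins so the snippet runs on its own; in a real setup the first stage would be cosine similarity over the multilingual embeddings and the second a cross-encoder scoring each (query, document) pair. All function names are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Scorer = Callable[[str, str], float]


def retrieve_then_rerank(
    query: str,
    docs: Sequence[str],
    first_stage_score: Scorer,
    rerank_score: Scorer,
    k_first: int = 20,
    k_final: int = 3,
) -> List[Tuple[str, float]]:
    """Two-stage retrieval: cheap scorer over all docs, costly scorer over the shortlist."""
    # Stage 1: score every document with the cheap scorer, keep the top k_first.
    candidates = sorted(
        docs, key=lambda d: first_stage_score(query, d), reverse=True
    )[:k_first]
    # Stage 2: re-score only the shortlist with the expensive scorer.
    reranked = sorted(
        ((d, rerank_score(query, d)) for d in candidates),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return reranked[:k_final]


# Toy stand-ins (assumptions, not real models).
def overlap(query: str, doc: str) -> float:
    """Number of distinct words shared by query and document."""
    return float(len(set(query.split()) & set(doc.split())))


def jaccard(query: str, doc: str) -> float:
    """Shared words divided by all distinct words in query and document."""
    q, d = set(query.split()), set(doc.split())
    return len(q & d) / len(q | d) if q | d else 0.0


docs = [
    "reset your password from the login page",
    "password rules and password length",
    "shipping times for Germany and Poland",
    "how to reset a router",
]
results = retrieve_then_rerank(
    "reset password", docs, overlap, jaccard, k_first=3, k_final=2
)
for doc, score in results:
    print(f"{score:.3f}  {doc}")
```

The point of the split is cost: the expensive scorer only ever sees `k_first` candidates per query, so its latency is independent of corpus size.
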
All of the models I'm comparing are free — the question is just which free option is best for the job. Would love to hear from anyone doing non-English RAG in production.