AI on Demand, powered by BAAI

bge-m3 and bge-reranker-v2-m3 from BAAI: two specialised models for embedding and reranking, supporting over 100 languages and up to 8,192 tokens. stepping stone runs both on Swiss infrastructure — as the foundation for search and RAG architectures that are not dependent on the US.

stepping stone provides two models from the Beijing Academy of Artificial Intelligence (BAAI): bge-m3 for embedding and bge-reranker-v2-m3 for reranking.

The embedding model converts text into vectors, enabling cross-lingual search in more than 100 languages, from short queries to documents of up to 8,192 tokens. The reranking model then re-scores the retrieved results by relevance, which improves the quality of the final result list.
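As a rough illustration of how vector search works, the following minimal Python sketch ranks documents by cosine similarity to a query vector. The three-dimensional vectors and the document names are invented for the example. Real embeddings have many more dimensions and would be returned by the embedding model, whose API is not shown here.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 = same direction, 0.0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query_vector, index, top_k=2):
    """Return the ids of the top_k documents most similar to the query."""
    scored = [(cosine_similarity(query_vector, vec), doc_id)
              for doc_id, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:top_k]]

# Toy vectors (assumption): in practice each one would be an embedding
# of the document text, produced by the embedding model.
INDEX = {
    "holiday-policy-de": [0.9, 0.1, 0.0],
    "holiday-policy-en": [0.8, 0.2, 0.1],
    "server-manual": [0.0, 0.1, 0.9],
}
QUERY = [1.0, 0.0, 0.0]  # stands in for an embedded query about holidays

print(search(QUERY, INDEX))
```

Because documents in different languages are mapped into the same vector space, the German and the English holiday policy both end up close to the query, while the unrelated server manual does not.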

Combined with one of stepping stone’s LLM offerings, this creates a Retrieval-Augmented Generation (RAG) pipeline: your internal data becomes available within the context of every query — accurately, efficiently and without leaving the country.

The service is aimed at development teams and companies that want to build intelligent search over internal data or enrich existing AI applications with their own knowledge. It is particularly suitable for organisations with multilingual document collections or compliance requirements.

Typical use cases: semantic search across knowledge databases and archives, RAG pipelines for chatbots and assistant systems, multilingual document search, and improving the quality of existing search solutions through reranking.

Open source (MIT / Apache 2.0). Swiss data centres. No data stored with US providers.

Embedding and reranking require significantly less computing power than having an LLM perform the same task, which saves tokens and costs. Modern API with comprehensive documentation and clear examples. Personalised advice and operation provided by stepping stone in Bern.

Scope of services

Embedding and reranking on demand

Access to bge-m3 for multilingual vector search and bge-reranker-v2-m3 for precise reranking. Over 100 languages, up to 8,192 tokens per document.

RAG-compatible infrastructure

Can be combined with stepping stone’s LLM offerings to create a complete RAG pipeline. Your internal data will be available in the context of each query, without ever leaving the country.

Managed service

Deployment, monitoring, maintenance and support on Swiss infrastructure, with personalised advice. stepping stone takes care of the day-to-day running so that you can focus on the benefits.

Areas of application

Semantic search

The BAAI models enable search that understands meaning rather than simply matching keywords.

Companies use bge-m3 to make knowledge bases, document archives and internal repositories semantically searchable. The reranking model then improves the quality of search results through precise relevance scoring. Both models are multilingual, efficient and hosted on Swiss infrastructure.

RAG pipelines

Embedding and reranking are the core components of any retrieval-augmented generation pipeline.

When combined with one of stepping stone’s LLM solutions, the result is a RAG system that brings the relevant internal data into the context of every query. The models are compatible with LangChain and LlamaIndex, and the data never leaves the country.
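The flow of such a pipeline can be sketched in a few lines of Python: retrieve a shortlist of candidates, rerank them, and place the best passages into the prompt for the LLM. This is a sketch under stated assumptions, not stepping stone's API. The function build_rag_prompt and the word-overlap scorers toy_retrieve and toy_rerank are hypothetical stand-ins; in a real setup they would be replaced by calls to the embedding model and the reranking model.

```python
from typing import Callable, List, Sequence

def build_rag_prompt(
    question: str,
    documents: Sequence[str],
    retrieve: Callable[[str, Sequence[str], int], List[str]],
    rerank: Callable[[str, List[str]], List[str]],
    candidates: int = 3,
    keep: int = 2,
) -> str:
    """Two-stage retrieval: cheap shortlist first, precise reranking second."""
    shortlist = retrieve(question, documents, candidates)
    best = rerank(question, shortlist)[:keep]
    context = "\n".join(f"- {passage}" for passage in best)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Hypothetical stand-ins: simple word overlap instead of real model calls.
def _overlap(a: str, b: str) -> int:
    return len(set(a.lower().split()) & set(b.lower().split()))

def toy_retrieve(question: str, documents: Sequence[str], k: int) -> List[str]:
    return sorted(documents, key=lambda d: _overlap(question, d), reverse=True)[:k]

def toy_rerank(question: str, passages: List[str]) -> List[str]:
    # Normalise by passage length so short, focused passages rank higher.
    return sorted(passages,
                  key=lambda p: _overlap(question, p) / len(p.split()),
                  reverse=True)

DOCS = [
    "Employees receive 25 days of holiday per year.",
    "The server room is locked after 18:00.",
    "Holiday requests must be approved by the team lead.",
]
QUESTION = "How many days of holiday do employees receive?"

prompt = build_rag_prompt(QUESTION, DOCS, toy_retrieve, toy_rerank)
print(prompt)
```

Passing the retriever and reranker in as functions keeps the pipeline logic independent of any particular client library, so the same structure could be used with LangChain, LlamaIndex or plain HTTP calls.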

Price

Model               Context length  Price (CHF/MTok)
bge-m3              8k              0.1000
bge-reranker-v2-m3  8k              0.1000
All prices are in CHF/MTok, excluding VAT.
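To get a feel for what this price means in practice, the following small Python calculation estimates the cost of embedding a document collection. The collection size and the average document length are assumptions chosen for illustration. How tokens are counted for billing, for example whether the reranker bills query and document tokens together, is not stated here and would need to be confirmed.

```python
PRICE_CHF_PER_MTOK = 0.10  # from the price table above, excluding VAT

def cost_chf(total_tokens: int, price_per_mtok: float = PRICE_CHF_PER_MTOK) -> float:
    """Cost in CHF for a given number of tokens at a per-million-token price."""
    return total_tokens / 1_000_000 * price_per_mtok

# Assumed example collection: 50,000 documents of about 800 tokens each.
documents = 50_000
avg_tokens_per_document = 800
total_tokens = documents * avg_tokens_per_document  # 40,000,000 tokens = 40 MTok

print(f"{total_tokens / 1_000_000:.0f} MTok -> {cost_chf(total_tokens):.2f} CHF excl. VAT")
```

Under these assumptions, embedding the whole collection once comes to 40 MTok, or about CHF 4.00 excluding VAT.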