AI on Demandpowered by Ai2

olmOCR by Ai2 (Allen Institute for AI): a vision-language model that outperforms traditional OCR systems in areas where tables, equations and poor-quality scans cause them to fail. Powered by stepping stone on Swiss infrastructure.

The olmOCR vision-language model from Ai2 (Allen Institute for AI) specialises in optical character recognition. It recognises text in documents that pose challenges for traditional OCR systems: complex tables, mathematical formulae, multi-column layouts and poorly scanned documents.

stepping stone runs olmOCR entirely on Swiss infrastructure. Access is via an OpenAI-compatible API, which can be integrated directly into existing document workflows. Your documents remain in Switzerland.

Companies, public authorities and educational institutions that need to digitise large volumes of documents or make them searchable — without transferring data to US providers. Particularly suitable for regulated sectors handling sensitive documents.

Typical applications: digitisation of archives and legacy collections, automated invoice and contract processing, extraction of tables and financial data, processing of academic documents containing formulae.

Open Source (Apache 2.0). Schweizer Rechenzentren. Keine Daten bei US-Anbietern.

Wo klassische OCR an Tabellen, Formeln oder schlechten Scans scheitert, liefert olmOCR zuverlässige Ergebnisse. Persönliche Beratung von stepping stone — von der Integration bis zur Skalierung. Betrieb aus Bern.

Scope of services

On-demand document recognition

Access to olmOCR for accurate text recognition in documents. Particularly effective with tables, mathematical formulae, multi-column layouts and poorly scanned documents.

GPU performance on demand

Scalable computing power for processing individual documents or entire archives. You pay as you go.

Managed service

Deployment, monitoring, maintenance and support on Swiss infrastructure with personalised advice. stepping stone takes care of operations so you can focus on the benefits.

Areas of application

Archives & Compliance

olmOCR makes document archives searchable — even if the source documents are of poor quality.

Public authorities and businesses use it to digitise legacy records, files and regulatory documents. As all data remains on Swiss infrastructure, it is particularly suitable for sensitive documents in regulated sectors.

Data extraction

Tables, financial data and scientific formulas can be extracted from documents in a structured format using olmOCR.

Invoices, contracts and academic publications are processed automatically and converted into machine-readable formats. The results can be fed directly into downstream workflows, databases or RAG pipelines.Schweizer Infrastruktur.

Benchmark

The benchmark processes 50 CVs (100 pages total). Step-by-step instructions and the required Python script can be downloaded from GitHub.

If necessary, higher concurrency and page limits can be set.

Call

# Set your personal key:
STONEY_KEY=sk-...

# Make key visible for bench script:
export OPENAI_API_KEY=$STONEY_KEY

# Start the benchmark
python cv_bench_endpoint.py \
 --endpoint llm.stoney-cloud.com/v1/chat/completions \
 --data cv_bench_data \
 --model "allenai/olmOCR-2-7B" \
 --api-key $STONEY_KEY \
 --concurrency 1 \
 --limit 100

Result

concurrency   : 1
requested     : 50
ok            : 50
failed        : 0
duration_s    : 193.5
pages_s       : 0.26
pages_min     : 15.5
out_tok_s     : 140
latency_p50_s : 3.73
latency_p99_s : 8.67

 

Legend

  • concurrency: How many requests the model processes simultaneously.
  • requested: How many requests were sent.
  • ok: Number of accepted requests (in this case, CVs).
  • failed: Number of rejected requests.
  • duration_s: The duration of the benchmark run.
  • pages_s: The average number of pages that can be processed per second.
  • pages_min: The average number of pages that can be processed per minute.
  • out_tok_s: The number of tokens generated per second.
  • latency_p50_s: The average response time in seconds.
  • latency_p99_s: The response time required in the “worst case” scenario, in seconds.

Price

ModelContext lengthInput/MTokOutput/MTok
olmOCR-2-7B8k0.06000.2900
All prices are in CHF/MTok, excluding VAT.