About route-switch

route-switch is a single-binary Go service from Skelf Research. It plays two roles at once: it's an OpenAI-compatible inference endpoint your application calls, and it's a closed-loop optimizer that rewrites the prompt templates it serves based on the traffic it has already seen.

Why this exists

Most teams running LLMs in production end up with three overlapping problems and no single tool that addresses them:

Lock-in to one provider. Switching from OpenAI to Anthropic mid-flight is a code change, not a config change, because the request shape and the prompt that works are both provider-specific.
Prompts drift, silently. The prompt that scored 0.91 on the launch eval is now scoring 0.74 on real users, and the only signal is a slow uptick in support tickets.
No grounded picture of cost / quality / latency. Per-call cost lives in one dashboard, latency in another, and quality lives in nobody's dashboard at all.

route-switch picks one opinionated answer: store the prompt template, the provider config, the dataset, and the analytics in the same process, and give the optimizer permission to rewrite the template when the data says it should.

What's in the box

An OpenAI-compatible HTTP gateway (/v1/chat/completions, streaming + non-streaming, plus /status, /health, /v1/system/analytics).
A prompt registry backed by YAML manifests with variable schemas. Templates are addressable by ID via request metadata.
A load balancer supporting round-robin, weighted, and performance-based strategies, with automatic fallback when a combination's success rate drops below a configured threshold.
A MIPROv2 optimizer (instruction + few-shot search powered by goptuna's Bayesian optimizer) that runs on demand or in the background.
Per-prompt SQLite datasets (one file per template) and a DuckDB analytics store for cross-prompt aggregates.
Portable packages: bundle a template + dataset snapshot + recent logs as a tarball and move it between environments.

What it deliberately isn't

route-switch isn't a semantic router that classifies a free-form user message and silently picks "the best model" from a universe of 200 endpoints. Routing happens across the prompt+model+provider combinations you registered; selection is strategy-driven, not LLM-driven.
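To make "strategy-driven, not LLM-driven" concrete, the sketch below shows one way a round-robin strategy could cycle through registered combinations while skipping any whose observed success rate is under a threshold. It is an illustration under stated assumptions, not route-switch's actual code: the Combination and RoundRobin types, their fields, and the choice to treat untried combinations as healthy are all hypothetical.

```go
package main

import "fmt"

// Combination is a hypothetical stand-in for one registered
// prompt+model+provider triple, with simple success counters.
type Combination struct {
	Prompt, Model, Provider string
	Successes, Failures     int
}

// SuccessRate returns the observed success fraction.
// ASSUMPTION: a combination with no traffic yet counts as healthy.
func (c Combination) SuccessRate() float64 {
	total := c.Successes + c.Failures
	if total == 0 {
		return 1.0
	}
	return float64(c.Successes) / float64(total)
}

// RoundRobin cycles through the registered combinations in order,
// falling through to the next one when a candidate is below MinSuccess.
type RoundRobin struct {
	Combos     []Combination
	MinSuccess float64
	next       int
}

// Pick returns the next combination at or above the threshold.
// ok is false when nothing is registered or nothing qualifies.
func (r *RoundRobin) Pick() (c Combination, ok bool) {
	for i := 0; i < len(r.Combos); i++ {
		cand := r.Combos[(r.next+i)%len(r.Combos)]
		if cand.SuccessRate() >= r.MinSuccess {
			r.next = (r.next + i + 1) % len(r.Combos)
			return cand, true
		}
	}
	return Combination{}, false
}

func main() {
	rr := &RoundRobin{
		MinSuccess: 0.9,
		Combos: []Combination{
			{Prompt: "triage-v1", Model: "model-a", Provider: "provider-x", Successes: 95, Failures: 5},
			{Prompt: "triage-v1", Model: "model-b", Provider: "provider-y", Successes: 60, Failures: 40},
			{Prompt: "triage-v1", Model: "model-c", Provider: "provider-z"},
		},
	}
	for i := 0; i < 4; i++ {
		c, ok := rr.Pick()
		fmt.Println(c.Provider, ok)
	}
}
```

The selection here depends only on registration order and recorded outcomes, so the same traffic history always yields the same choice. No model is consulted to decide where a request goes.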

It isn't an evaluation harness either. The optimizer scores candidates by replaying rows from your captured dataset under one of three built-in strategies (Similarity, ExactMatch, KeywordMatch). If you need LLM-as-judge, BLEU, ROUGE, or a classifier, implement the EvaluationStrategy Go interface and plug it in.
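A custom scorer might look something like the sketch below, which computes a unigram-overlap F1 between the expected and actual outputs (in the spirit of ROUGE-1, but not a full ROUGE implementation). The EvaluationStrategy interface declared here is a hypothetical shape written for the example; the method set and signatures of route-switch's real interface are not documented on this page and may differ, so treat this as a starting point to adapt.

```go
package main

import (
	"fmt"
	"strings"
)

// EvaluationStrategy is a HYPOTHETICAL shape for illustration only.
// The real route-switch interface may use different methods or signatures.
type EvaluationStrategy interface {
	Name() string
	// Score returns a value in [0, 1]; higher means closer to expected.
	Score(expected, actual string) float64
}

// UnigramF1 scores by the F1 of overlapping lowercase whitespace tokens.
type UnigramF1 struct{}

func (UnigramF1) Name() string { return "UnigramF1" }

func (UnigramF1) Score(expected, actual string) float64 {
	exp := strings.Fields(strings.ToLower(expected))
	act := strings.Fields(strings.ToLower(actual))
	if len(exp) == 0 || len(act) == 0 {
		return 0
	}
	// Count expected tokens so repeated words are matched at most as
	// often as they appear in the reference.
	counts := make(map[string]int, len(exp))
	for _, w := range exp {
		counts[w]++
	}
	overlap := 0
	for _, w := range act {
		if counts[w] > 0 {
			counts[w]--
			overlap++
		}
	}
	if overlap == 0 {
		return 0
	}
	precision := float64(overlap) / float64(len(act))
	recall := float64(overlap) / float64(len(exp))
	return 2 * precision * recall / (precision + recall)
}

func main() {
	var s EvaluationStrategy = UnigramF1{}
	fmt.Printf("%s %.2f\n", s.Name(), s.Score("refund issued to card", "refund issued"))
}
```

An LLM-as-judge or classifier-based scorer would follow the same pattern: wrap the external call behind the scoring method and return a normalized value the optimizer can compare across candidates.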

And it isn't a managed service. You build the binary, you run it, you keep the SQLite and DuckDB files.

Who built it

Skelf Research is the lab that ships this project. The code is MIT-licensed, and contributions are welcome via the GitHub repo. For commercial questions or longer collaborations, reach Skelf Research via the parent site.

Status: early. The Go interfaces (ModelProvider, EvaluationStrategy) are stable enough to extend; the CLI surface and config schema may still move. Pin a commit or a tagged release.
