token-plumber

Minimal, modular, axum-based LLM router with OpenRouter-style routing controls, streaming proxying, tracing, async request logging, and rclone-backed log sync.

Install

cargo install --path crates/token-plumber

This installs the tplumb executable.

Tagged releases are built by Forgejo Actions through cargo-dist on Namespace runners. The release workflow is at .forgejo/workflows/release.yml and uses these runner labels:

  • namespace-profile-linux-medium
  • namespace-profile-macos-large
  • namespace-profile-windows-large

Push a tag like v0.1.0 to build a release, or run the workflow manually with an explicit tag input. If your Forgejo instance does not grant enough rights to the built-in workflow token for release creation and asset upload, add a repository secret named FORGEJO_RELEASE_TOKEN.
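
For example, cutting a release from the command line looks like this (the tag name is illustrative):

git tag v0.1.0
git push origin v0.1.0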

Run

mkdir -p "${XDG_CONFIG_HOME:-$HOME/.config}/token-plumber"
cp token-plumber.example.toml "${XDG_CONFIG_HOME:-$HOME/.config}/token-plumber/config.toml"
tplumb

tplumb resolves config in this order:

  • --config /path/to/config.toml
  • TOKEN_PLUMBER_CONFIG
  • legacy LLM_ROUTER_CONFIG
  • ${TOKEN_PLUMBER_HOME}/config.toml
  • ${XDG_CONFIG_HOME:-$HOME/.config}/token-plumber/config.toml
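
For example, the higher-precedence sources can be selected explicitly when scripting (paths are illustrative):

tplumb --config /etc/token-plumber/config.toml
TOKEN_PLUMBER_CONFIG=/etc/token-plumber/config.toml tplumb
TOKEN_PLUMBER_HOME="$HOME/alt-plumber" tplumb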

tplumb serve is an explicit alias for the default server mode, so you can keep tplumb for the common path and still treat serving as a normal subcommand when composing scripts.

Built-in Auth Helpers

tplumb includes minimal Codex-compatible auth helpers so provider credentials can stay dense and command-backed without a second utility binary:

  • tplumb auth login
  • tplumb auth logout
  • tplumb auth list
  • tplumb auth inspect
  • tplumb auth chatgpt-access-token
  • tplumb auth chatgpt-account-id
  • tplumb auth openai-api-key
  • tplumb auth anthropic-login
  • tplumb auth anthropic-logout
  • tplumb auth anthropic-inspect
  • tplumb auth anthropic-access-token
  • tplumb auth anthropic-api-key

By default these commands use token-plumber-owned auth files under:

  • ${TOKEN_PLUMBER_HOME:-${XDG_CONFIG_HOME:-$HOME/.config}/token-plumber}/auth/openai.json
  • ${TOKEN_PLUMBER_HOME:-${XDG_CONFIG_HOME:-$HOME/.config}/token-plumber}/auth/openai.<binding>.json
  • ${TOKEN_PLUMBER_HOME:-${XDG_CONFIG_HOME:-$HOME/.config}/token-plumber}/auth/anthropic.json
  • ${TOKEN_PLUMBER_HOME:-${XDG_CONFIG_HOME:-$HOME/.config}/token-plumber}/auth/anthropic.<binding>.json

Use --binding <name> (for example, --binding work) only when you actually need a second stored credential set. --auth-file remains the escape hatch for explicit file-backed credentials.

The browser login flow is fully native Rust: tplumb auth login starts a local callback server, prints the authorize URL, exchanges the returned code for tokens, and stores the resulting auth in the selected binding file.
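
A minimal session sketch using the default binding (commands come from the list above; no flags beyond those documented here are assumed):

tplumb auth login                  # native browser login, stores auth/openai.json
tplumb auth inspect                # inspect the stored auth
tplumb auth chatgpt-access-token   # print the bearer token Codex itself would use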

tplumb auth anthropic-login is also fully native Rust. It drives the Anthropic OAuth authorize/token/profile/api-key flow and can later emit either an Anthropic OAuth access token or a stored/minted Anthropic API key. The reader also accepts older field aliases when importing existing credential payloads.

If you want to embed one Anthropic credential file into the binary at build time, set:

  • TOKEN_PLUMBER_EMBED_ANTHROPIC_CREDENTIALS_JSON=/abs/path/to/anthropic.json
  • optional TOKEN_PLUMBER_EMBED_ANTHROPIC_CREDENTIALS_NAME=work

Then tplumb auth anthropic-inspect, anthropic-access-token, and anthropic-api-key read from the embedded payload instead of the filesystem when invoked with --credential-source embedded.
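
A sketch of that build-and-read flow (the credential path and binding name are placeholders):

TOKEN_PLUMBER_EMBED_ANTHROPIC_CREDENTIALS_JSON=/abs/path/to/anthropic.json \
TOKEN_PLUMBER_EMBED_ANTHROPIC_CREDENTIALS_NAME=work \
  cargo install --path crates/token-plumber

tplumb auth anthropic-inspect --credential-source embedded
tplumb auth anthropic-access-token --credential-source embedded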

The token-printing commands still support selecting a specific stored ChatGPT identity with:

  • --email
  • --account-id
  • --org-id
  • --project-id

chatgpt-access-token and chatgpt-account-id expose the bearer/account pair that Codex itself uses. openai-api-key uses the same id_token exchange flow Codex login uses for API-key minting. If you pass --refresh, tplumb refreshes the stored ChatGPT tokens before reading or exchanging them.

OpenAI API-key exchange requires the stored ChatGPT auth to include organization_id. Accounts without that field can still produce direct ChatGPT bearer tokens, but they cannot mint an OpenAI API key.
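
For example, selecting a stored identity and refreshing before the exchange (identifier values are placeholders):

tplumb auth chatgpt-access-token --email you@example.com --refresh
tplumb auth openai-api-key --account-id acct_123 --refresh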

Data Layout

tplumb keeps four storage classes separate:

  • provider auth under ${TOKEN_PLUMBER_HOME}/auth/
  • harness-native state under the resolved config directory's harness/ subtree, which is normally ${TOKEN_PLUMBER_HOME}/harness/
  • local-node rendezvous state under ${TOKEN_PLUMBER_HOME}/node/
  • request logs wherever [logging] points, for example sqlite_path, spool_dir, and audio.directory

With the standard ${TOKEN_PLUMBER_HOME}/config.toml layout, the harness-native roots are:

  • ${TOKEN_PLUMBER_HOME}/harness/codex as CODEX_HOME
  • ${TOKEN_PLUMBER_HOME}/harness/claude as CLAUDE_CONFIG_DIR
  • ${TOKEN_PLUMBER_HOME}/harness/opencode/{config,data,state} as OpenCode's XDG config, data, and state roots

The local-node paths are:

  • ${TOKEN_PLUMBER_HOME}/node/registry.sqlite for alias-to-node rendezvous
  • ${TOKEN_PLUMBER_HOME}/node/run/local-control.sock as the preferred shared local control socket
  • ${TOKEN_PLUMBER_HOME}/node/<alias>.lock for spawn locking

The local node model is documented in docs/token-plumber-node-architecture.md and follows the shared-local-node pattern used in ../rdq: one durable alias like local, compatibility-scoped concrete node ids, reuse before spawn, and future room for non-local transports like iroh.
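
Put together, a typical ${TOKEN_PLUMBER_HOME} therefore looks roughly like this (request-log paths depend entirely on [logging]):

${TOKEN_PLUMBER_HOME}/
  config.toml
  auth/openai.json
  auth/anthropic.json
  harness/codex/
  harness/claude/
  harness/opencode/{config,data,state}/
  harness/tls/
  node/registry.sqlite
  node/run/local-control.sock
  node/<alias>.lock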

Provider Config

tplumb can now model more than simple bearer-auth /v1/* providers. In provider config you can use:

  • base_url with ${ENV_VAR} interpolation
  • auth for bearer, anthropic, chatgpt, header, query, or none
  • credentials for dense multi-credential pools with token, token_env, or token_command
  • credentials[].kind to distinguish Anthropic api_key credentials from OAuth auth_token credentials
  • optional credential-side account_id, account_id_env, or account_id_command for auth modes that need a second identity header
  • default_headers and default_query
  • endpoints overrides per router surface when a provider's OpenAI-compatible path differs

Credential pools are selected round-robin per request. If one configured credential source fails, tplumb will try the next credential before failing the request. token_command is an argv array, not shell text; it must print exactly one upstream token on stdout. cache_ttl_secs controls command result caching so account-backed token exchange helpers do not run for every request. Legacy api_key and api_key_env still work and are treated as shorthand credentials.
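
A hedged sketch of one provider entry using those fields; the table layout here (a providers map keyed by slug with a credentials array) is assumed, so treat token-plumber.example.toml as the authoritative shape:

# Illustrative only: field names come from the list above.
[providers.openai]
base_url = "https://api.openai.com/v1"
auth = { type = "bearer" }

# Two pooled credentials, tried round-robin per request.
[[providers.openai.credentials]]
token_env = "OPENAI_API_KEY"      # direct API key from the environment

[[providers.openai.credentials]]
token_command = ["tplumb", "auth", "openai-api-key", "--refresh"]   # argv array, prints one token on stdout
cache_ttl_secs = 300              # assumed per-credential placement for command result caching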

For OpenAI specifically, this means one provider can mix:

  • multiple direct OpenAI API keys
  • multiple Codex/ChatGPT-backed exchangers that mint OpenAI-compatible keys on demand

The router does not embed ChatGPT web login logic into the provider layer. Account flows stay command-backed so the transport layer remains data-driven and provider-agnostic.

For direct ChatGPT-backed OpenAI routing, use auth = { type = "chatgpt" }. In that mode tplumb injects both Authorization: Bearer ... and ChatGPT-Account-ID: ..., and it fails fast with a readable error if the credential resolves to a token but not an account id.
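
A sketch of that mode, reusing the helper commands above (the provider name and layout are illustrative; the upstream base_url is deliberately omitted, copy it from token-plumber.example.toml):

[providers.chatgpt-backed]
auth = { type = "chatgpt" }       # injects Authorization and ChatGPT-Account-ID

[[providers.chatgpt-backed.credentials]]
token_command = ["tplumb", "auth", "chatgpt-access-token"]
account_id_command = ["tplumb", "auth", "chatgpt-account-id"]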

For native Anthropic routing, use auth = { type = "anthropic" }. In that mode tplumb maps credentials like this:

  • kind = "api_key" sends x-api-key
  • kind = "auth_token" sends Authorization: Bearer ... and the Anthropic OAuth beta header
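
For example (a sketch under the same layout assumption; credential sources are placeholders):

[providers.anthropic]
base_url = "https://api.anthropic.com"
auth = { type = "anthropic" }

[[providers.anthropic.credentials]]
kind = "api_key"                  # sent as x-api-key
token_env = "ANTHROPIC_API_KEY"

[[providers.anthropic.credentials]]
kind = "auth_token"               # sent as Authorization: Bearer ... plus the OAuth beta header
token_command = ["tplumb", "auth", "anthropic-access-token"]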

The example config includes BYOK-ready entries for the OpenRouter screenshot providers that expose OpenAI-compatible endpoints directly or through a compatibility layer, including Bedrock, Anthropic, Cerebras, Cloudflare, Fireworks, Google AI Studio, Vertex, Groq, Mistral, Parasail, SambaNova, xAI, and Z.ai. Anthropic's own docs describe their OpenAI compatibility layer as useful for testing and comparison rather than as their preferred long-term production path.

Codex MCP Proxying

Codex streamable-HTTP MCP servers can point at tplumb directly. Model traffic still uses /v1/responses, /v1/chat/completions, or /v1/messages; remote MCP servers use the dedicated MCP proxy surface:

  • http://127.0.0.1:8080/v1/mcp/<provider>
  • http://127.0.0.1:8080/api/v1/mcp/<provider>

Define the upstream MCP server as a normal provider and set base_url to the exact MCP endpoint URL, for example https://example.com/mcp. Token-plumber keeps router auth local, injects upstream provider credentials from providers[].credentials, and preserves the streamable-HTTP request/response headers and body shape.
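
A sketch of such a provider entry, matching the example endpoint above (auth type and credential source are placeholders):

[providers.remote-files]
base_url = "https://example.com/mcp"      # the exact MCP endpoint URL
auth = { type = "bearer" }

[[providers.remote-files.credentials]]
token_env = "REMOTE_FILES_TOKEN"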

For Codex, that maps cleanly onto mcp_servers.<name>.url:

[mcp_servers.remote_files]
url = "http://127.0.0.1:8080/v1/mcp/remote-files"
http_headers = { authorization = "Bearer local-dev-token" }

If the upstream MCP server advertises OAuth metadata, tplumb also exposes Codex-compatible discovery aliases for:

  • /.well-known/oauth-authorization-server/v1/mcp/<provider>
  • /v1/mcp/<provider>/.well-known/oauth-authorization-server

Sandboxed Harnesses

tplumb codex, tplumb claude, and tplumb opencode start an embedded local proxy, reuse a token-plumber-managed local client token, then launch the child under sandbox-exec with network egress limited to that proxy port.

  • tplumb codex sets CODEX_HOME=${TOKEN_PLUMBER_HOME}/harness/codex, writes native config.toml and auth.json, and injects OPENAI_BASE_URL=https://localhost:<port>/v1
  • tplumb claude sets CLAUDE_CONFIG_DIR=${TOKEN_PLUMBER_HOME}/harness/claude, writes native settings.json plus stable MCP helper/config files, and injects ANTHROPIC_BASE_URL=https://localhost:<port>
  • tplumb opencode writes native OpenCode config under ${TOKEN_PLUMBER_HOME}/harness/opencode/config/opencode/opencode.json, keeps OpenCode data/state under the matching XDG roots, and constrains OpenCode to a single custom tplumb provider backed by the local proxy
  • all three launchers keep upstream provider auth inside tplumb; the child only gets a local proxy credential/config view
  • native client approval prompts are intentionally bypassed for harnessed Codex/Claude runs; allow/deny/ask decisions are owned by tplumb's proxy layer instead
  • Codex and Claude default to transport = "https", backed by a generated local CA under ${TOKEN_PLUMBER_HOME}/harness/tls/; Codex gets SSL_CERT_FILE, Claude gets both SSL_CERT_FILE and NODE_EXTRA_CA_CERTS
  • set transport = "http" under [harness.codex] or [harness.claude] to fall back to plain localhost transport
  • Codex and Claude also get an explicit local HTTP[S]_PROXY; configure per-domain proxy policy under [harness.proxy.domains."<domain>"]
  • mode = "mitm" decrypts and logs that domain's HTTPS traffic while mode = "connect" leaves it as a CONNECT tunnel
  • default_action, allow_paths, deny_paths, and ask_paths apply when tplumb sees plaintext for that request
  • GraphQL-aware policy lives under [harness.proxy.domains."<domain>".graphql] and supports mode = "allow_all" | "queries_only" | "deny_all" plus per-operation allow/deny/ask lists
  • MCP tool policy is centralized under [harness.permissions] using Claude-style names: mcp__server, mcp__server__*, mcp__server__tool
  • when [harness.proxy].approval_token_env_var is set, ask decisions can be auto-approved by passing that token through the local proxy header x-token-plumber-approval
  • plaintext-captured HTTPS/API traffic is written to ${TOKEN_PLUMBER_HOME}/harness/<program>/proxy-traffic.jsonl
  • local MCP request/response traffic is written to ${TOKEN_PLUMBER_HOME}/harness/<program>/mcp-traffic.jsonl

For a dedicated Codex websocket app-server under the same harness, use:

tplumb codex-app-server --listen ws://127.0.0.1:7777

That runs codex app-server behind the existing tplumb proxy/auth/policy layer instead of requiring a manual passthrough command. The broader desktop app direction lives in docs/desktop-architecture.md.

Named MCP aliases live under [harness.mcp_servers] and map cleanly onto provider slugs:

[harness.codex]
mcp_servers = ["files", "linear"]
transport = "https"

[harness.claude]
mcp_servers = ["files", "linear"]
transport = "https"

[harness.opencode]
mcp_servers = ["files", "linear"]
model = "anthropic/claude-sonnet-4-5"

[harness.proxy]
approval_token_env_var = "TOKEN_PLUMBER_APPROVAL_TOKEN"

[harness.proxy.domains."api.openai.com"]
mode = "mitm"
default_action = "allow"
ask_paths = ["/v1/files"]

[harness.proxy.domains."api.linear.app"]
mode = "mitm"

[harness.proxy.domains."api.linear.app".graphql]
mode = "queries_only"
ask_operations = ["UpdateIssue"]

[harness.permissions]
allow = ["mcp__files", "mcp__linear__read_issue"]
deny = ["mcp__files__delete"]
ask = ["mcp__linear__update_issue"]

[harness.mcp_servers.files]
provider = "remote-files"

[harness.mcp_servers.linear]
provider = "linear"

With that in place:

tplumb codex
tplumb claude
tplumb opencode

All three launchers also accept --command /path/to/binary plus trailing arguments to target a specific executable.

CI and Releases

Forgejo workflows live under .forgejo/workflows/.

  • ci.yml runs workspace verification and a macOS desktop bundle smoke build
  • release.yml keeps the cargo-dist CLI release path and also builds a notarized macOS Token Plumber.app bundle when Apple signing secrets are configured

The macOS packaging and notarization flow is documented in docs/macos-release-pipeline.md.

Log Sync

[logging.rclone] supports backend = "librclone" for in-process syncs or backend = "command" to shell out to the rclone binary. max_concurrency maps to rclone transfer parallelism, and command-backend failures preserve stderr in the returned error.
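
A minimal [logging.rclone] sketch limited to the fields documented above (destination/remote settings omitted):

[logging.rclone]
backend = "librclone"      # or "command" to shell out to the rclone binary
max_concurrency = 4        # maps to rclone transfer parallelism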

Current surface

  • GET /healthz
  • GET /v1/models and GET /api/v1/models
  • GET /v1/providers and GET /api/v1/providers
  • POST /v1/chat/completions and POST /api/v1/chat/completions
  • POST /v1/responses and POST /api/v1/responses
  • POST /v1/embeddings and POST /api/v1/embeddings
  • POST /v1/audio/transcriptions and POST /api/v1/audio/transcriptions
  • ANY /v1/mcp/{provider} and ANY /api/v1/mcp/{provider}
  • ANY /v1/proxy/{provider} and ANY /api/v1/proxy/{provider}
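
For example, listing models through the local router (address and token match the example below):

curl http://127.0.0.1:8080/v1/models \
  -H 'Authorization: Bearer local-dev-token'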

Provider key overrides

Per-request BYOK-style overrides are supported through headers:

  • X-Provider-Key-<provider-slug>: ...
  • OpenRouter app identity headers are forwarded upstream: HTTP-Referer, X-OpenRouter-Title, X-OpenRouter-Categories

Example:

curl http://127.0.0.1:8080/v1/chat/completions \
  -H 'Authorization: Bearer local-dev-token' \
  -H 'Content-Type: application/json' \
  -H 'X-Provider-Key-openai: sk-...' \
  -d '{"model":"gpt-4.1-mini","messages":[{"role":"user","content":"hello"}],"stream":true}'