OpenAI’s Codex CLI is a highly capable terminal-based coding agent, but it is tied to OpenAI models out of the box. Bifrost CLI removes that limitation. By routing Codex through the Bifrost AI gateway, you can use models from Anthropic, Google, Mistral, and 15+ other providers without modifying configuration files. A single command gives you access to 1000+ models along with full observability.
Codex CLI: A Quick Primer
Codex CLI is OpenAI’s open-source coding agent designed to run inside your terminal. It can analyze your codebase, edit files, execute commands, and iterate on changes with human approval in the loop. Built in Rust, it includes features such as sandboxed execution, MCP server integration, subagent workflows, and web search.
By default, Codex runs on GPT-5.4 and supports OpenAI’s full suite of coding-focused models, including GPT-5.3-Codex and GPT-5.4-mini. It is available across ChatGPT Plus, Pro, Business, Edu, and Enterprise plans.
However, there is a limitation. Codex CLI connects directly to OpenAI’s API by default. Switching to another provider, such as Claude for advanced reasoning or Gemini for larger context windows, requires manually editing model_providers in config.toml, managing multiple API keys, and dealing with provider-specific differences.
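For context, a manual provider entry in config.toml looks roughly like the sketch below. The field names follow Codex's model_providers format, but the provider name, URL, environment variable, and model are illustrative assumptions:

```toml
# Illustrative sketch of a hand-written provider entry in ~/.codex/config.toml.
# The values below are placeholders, not a tested configuration.
[model_providers.my-provider]
name = "My Provider"
base_url = "https://api.example.com/v1"
env_key = "MY_PROVIDER_API_KEY"

[profiles.my-provider-profile]
model_provider = "my-provider"
model = "some-model-name"
```

Each new provider means another block like this, another API key to manage, and another set of quirks to debug.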
This is the exact gap that Bifrost fills.
What Bifrost CLI Does Differently
Bifrost CLI is an interactive terminal tool that acts as a layer between your coding agent and AI providers. Instead of configuring environment variables, handling API keys, and wiring provider endpoints yourself, you run a single command:
npx -y @maximhq/bifrost-cli
The CLI guides you through setup: selecting your gateway URL, choosing a harness such as Codex CLI, Claude Code, Gemini CLI, or Opencode, picking a model from any provider, and launching the session. Bifrost automatically manages environment variables, base URLs, and API keys.
For Codex, Bifrost routes requests through its /openai provider path. This allows Codex to interact with a fully OpenAI-compatible API without requiring modifications or forks. It works as a seamless proxy that maps requests to the selected provider.
Setting Up Codex CLI with Bifrost
You can connect Codex CLI to Bifrost either through an automated setup or manual configuration.
Option 1: Bifrost CLI (Recommended)
Start your Bifrost gateway:
npx -y @maximhq/bifrost
Then launch Bifrost CLI in a separate terminal:
npx -y @maximhq/bifrost-cli
Choose Codex CLI as the harness, select a model, and continue. If Codex is not already installed, Bifrost CLI installs it, configures the necessary settings, and starts the agent with everything ready to go.
Option 2: Manual Configuration
If you prefer a manual approach, you can point Codex to your Bifrost gateway using environment variables:
export OPENAI_BASE_URL=http://localhost:8080/openai
export OPENAI_API_KEY=your-bifrost-virtual-key
codex
Both methods ensure that all Codex traffic is routed through Bifrost.
Running Non-OpenAI Models in Codex
This is where Bifrost becomes especially useful. It translates OpenAI API requests into provider-specific formats, allowing you to use different models directly within Codex:
Launch Codex with Claude
codex --model anthropic/claude-sonnet-4-5-20250929
Launch with Gemini
codex --model gemini/gemini-2.5-pro
Launch with Mistral
codex --model mistral/mistral-large-latest
You can also switch models during a session using the /model command:
/model anthropic/claude-sonnet-4-5-20250929
/model gemini/gemini-2.5-pro
Bifrost supports 1000+ models across 20+ providers, including OpenAI, Azure, Anthropic, Google (Gemini and Vertex), AWS Bedrock, Mistral, Groq, Cerebras, Cohere, xAI, Ollama, OpenRouter, and others. One key requirement is that non-OpenAI models must support tool use, since Codex depends on function calling for file edits, terminal execution, and code operations.
Why Route Codex Through a Gateway
Using Bifrost as a gateway is not only about model flexibility. It introduces several important capabilities for production environments:
Automatic failover. With fallback chains, Bifrost retries requests with alternate providers if the primary one fails or hits rate limits, keeping sessions uninterrupted.
Load balancing. Requests can be distributed across multiple API keys or accounts to improve throughput and manage rate limits effectively.
Observability. Every request is logged with latency, token usage, and provider details. Bifrost’s native observability provides clear insight into agent behavior.
Cost governance. With virtual keys, you can assign budgets to developers, teams, or projects and track usage centrally.
Semantic caching. Using semantic caching, Bifrost can return cached responses for similar queries, reducing both latency and cost.
Bifrost CLI’s Tabbed Session UI
A notable feature of Bifrost CLI is its persistent tabbed terminal interface. When a session ends, you are not returned to a blank shell. Instead, a tab bar remains available at the bottom of the terminal where you can:
- Press Ctrl+B then n to start a new session, possibly with a different model or agent such as Claude Code
- Switch between sessions using h and l or number keys
- View status indicators that show whether sessions are active, idle, or waiting for input
This setup makes it easy to run multiple Codex sessions in parallel, compare outputs across models, or keep different agents open for reference.
Wrapping Up
Codex CLI stands out as one of the most capable terminal-based coding agents today. With Bifrost, you are no longer restricted to a single provider. You can run Codex with any model while gaining access to failover, observability, caching, and cost management at the gateway level.
