Grok 4.1 Fast Non-Reasoning API via TokenMix

Use Grok 4.1 Fast Non-Reasoning from xAI as a chat model through the TokenMix AI API relay and multi-model gateway.

Low-latency, non-reasoning variant of Grok 4.1 Fast with 2M context window. Delivers fast responses without extended thinking while maintaining frontier-level tool-calling and agentic capabilities.

API access

Base URL: https://api.tokenmix.ai/v1
Model ID: grok-4.1-fast-non-reasoning
OpenAI SDK compatible. Change the base URL and use your TokenMix API key.

Pricing

Input $0.19/M tokens, output $0.475/M tokens

Capabilities

Vision, Function calling, JSON mode, Streaming

Model specs

Context: 2000K tokens
Max output: 30K tokens

Availability

1/1 available API endpoints are healthy right now.

Recent performance

TTFT 2026ms, latency 5127ms, throughput 273.9 tok/s.

Start using this model

Create an API key, top up from $1 when needed, and call this model through the TokenMix OpenAI-compatible endpoint.

Create API key · View pricing · Quickstart