Skip to main content

Groq

Groq delivers ultra-low-latency inference via its LPU (Language Processing Unit) hardware. It is OpenAI API-compatible, making integration straightforward.
Groq is notably faster than GPU-based cloud providers for tokens-per-second throughput — ideal for latency-sensitive agent tasks.

Prerequisites

Create an API key at console.groq.com and set it on the gateway host:
echo 'GROQ_API_KEY=your-key-here' >> ~/.openclaw/.env

Configuration

{
  models: {
    providers: {
      groq: {
        apiKey: "GROQ_API_KEY", // references the env var
        models: [
          "llama-3.3-70b-versatile",
          "llama-3.1-8b-instant",
          "mixtral-8x7b-32768",
          "gemma2-9b-it",
        ],
      },
    },
  },
}
apiKey is the name of the environment variable, not the key value itself. Keep secrets in .env, not in committed config files.

Selecting a model

{
  agents: {
    defaults: {
      model: { primary: "groq/llama-3.3-70b-versatile" },
    },
  },
}
ModelNotes
llama-3.3-70b-versatileBest general-purpose Llama model on Groq
llama-3.1-8b-instantFastest — lowest latency
mixtral-8x7b-32768Large context window (32k tokens)
gemma2-9b-itGoogle Gemma 2 9B instruction-tuned
Run openclaw models list --provider groq for the full current catalog.

Troubleshooting

The API key is invalid or revoked. Generate a new one at console.groq.com and update ~/.openclaw/.env.
Groq free-tier limits are generous but finite. Check your usage at the Groq console or add a paid plan.
Groq’s model catalog changes frequently. Run openclaw models list --provider groq for the current list.