Supported Models

This page lists all supported models for our async API, which uses a queue-and-poll flow. Use this reference to pick the model that fits your needs.

All models are available on all tiers (Free, Starter, Scale, Enterprise).

New models are added automatically as they launch.

Available Models

| Model | Model ID | Best For | Processing Speed |
| --- | --- | --- | --- |
| GPT-5.2 | gpt-5-2 | General purpose, fast responses | Fast |
| GPT-5.2 Thinking | gpt-5-2-thinking | Complex reasoning, deep analysis | Slow |
| Grok-4.1 | grok-4-1-fast-non-reasoning | General purpose, creative tasks | Fast |
| Grok-4.1 Thinking | grok-4-1-fast-reasoning | Extended reasoning, problem solving | Slow |
| Gemini 3 Pro | gemini-3-pro | Long context, comprehensive analysis | Medium |
| Gemini 2.5 Pro | gemini-2-5-pro | Balanced reasoning and speed | Medium |
| Gemini 2.5 Flash | gemini-2-5-flash | Quick responses, simple tasks | Very Fast |
| Gemini 3 Flash | gemini-3-flash | General purpose, high efficiency | Very Fast |
| DeepSeek v3.2 | deepseek-v3-2 | General purpose, versatile tasks | Fast |

Understanding 'Thinking' Models

Model IDs containing "-thinking" or "-reasoning" are convenience identifiers that configure the underlying model with provider-specific parameters to enable extended reasoning.

How it works:

  • OpenAI models (GPT-5.2): The base model is unified. When you use gpt-5-2-thinking, we set the reasoning.effort parameter to a higher value (e.g., medium or high). With gpt-5-2, we use reasoning.effort: none for faster responses.

  • Google models (Gemini): Similar approach with Google's native thinking mode parameters.

  • xAI models (Grok): Grok provides separate model variants: grok-4-1-fast-non-reasoning for standard tasks and grok-4-1-fast-reasoning for reasoning tasks.

For users familiar with official APIs: If you've been using OpenAI's or Google's official APIs directly, you may be accustomed to controlling reasoning through API parameters (like reasoning.effort in GPT-5.2). Our Model ID abstraction simplifies this—just choose the "-thinking" or "-reasoning" variant to enable extended reasoning, without needing to configure provider-specific parameters yourself.
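Conceptually, the abstraction above is a lookup from a convenience Model ID to a provider-specific payload. The sketch below is illustrative only, not our actual implementation: the exact `effort` values and payload shapes we send to each provider are assumptions.

```python
# Hypothetical mapping from convenience Model IDs (from the table above)
# to the provider payloads they expand into. Parameter values are assumed
# for illustration; only the Model IDs themselves come from this page.
REASONING_CONFIG = {
    # OpenAI: one unified base model; reasoning.effort toggles thinking
    "gpt-5-2":          {"model": "gpt-5-2", "reasoning": {"effort": "none"}},
    "gpt-5-2-thinking": {"model": "gpt-5-2", "reasoning": {"effort": "high"}},
    # xAI: separate model variants, so no extra parameter is needed
    "grok-4-1-fast-non-reasoning": {"model": "grok-4-1-fast-non-reasoning"},
    "grok-4-1-fast-reasoning":     {"model": "grok-4-1-fast-reasoning"},
}

def resolve(model_id: str) -> dict:
    """Return the provider payload a convenience Model ID would expand into."""
    return REASONING_CONFIG[model_id]
```

The practical upshot: you never set `reasoning.effort` yourself; picking the "-thinking" or "-reasoning" ID does it for you.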


How to Choose

For Balanced Speed and Quality: Use Standard Models

If you need responses quickly and your task is straightforward:

  • GPT-5.2 (gpt-5-2)
  • Grok-4.1 (grok-4-1-fast-non-reasoning)
  • Gemini 2.5 Flash (gemini-2-5-flash)
  • Gemini 3 Flash (gemini-3-flash)
  • DeepSeek v3.2 (deepseek-v3-2)

These models process quickly, making them suitable for most general-purpose applications.

For Complex Tasks: Use Thinking Models

If you need deep reasoning, complex problem solving, or analysis:

  • GPT-5.2 Thinking (gpt-5-2-thinking)
  • Grok-4.1 Thinking (grok-4-1-fast-reasoning)

Thinking models take significantly longer because they perform extended reasoning before responding. Use these when quality matters more than speed.

For Long Context: Use Pro Models

If you're working with large documents or need comprehensive analysis:

  • Gemini 3 Pro (gemini-3-pro)
  • Gemini 2.5 Pro (gemini-2-5-pro)

These handle larger context windows and longer tasks effectively.
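The decision rules above can be condensed into a small helper. This is an illustrative convenience (not part of the API); the defaults chosen here are one reasonable reading of the guidance, not the only one.

```python
# Illustrative model chooser based on the "How to Choose" guidance above.
# Model IDs come from the table on this page; the selection policy is assumed.
def choose_model(task: str = "general", long_context: bool = False) -> str:
    if long_context:
        return "gemini-3-pro"      # large documents, comprehensive analysis
    if task == "reasoning":
        return "gpt-5-2-thinking"  # quality over speed
    if task == "quick":
        return "gemini-3-flash"    # very fast, simple prompts
    return "gpt-5-2"               # general-purpose default
```

For example, `choose_model(task="reasoning")` returns `gpt-5-2-thinking`, while `choose_model(long_context=True)` returns `gemini-3-pro`.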


Using a Model

Specify the model ID in your request:

Python
import requests

headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

response = requests.post(
    "https://app.beginswithai.com/v1/ai",
    json={
        "model": "gpt-5-2",
        "prompt": "Explain quantum computing"
    },
    headers=headers
)

request_id = response.json()["request_id"]

JavaScript
const headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
};

const response = await fetch("https://app.beginswithai.com/v1/ai", {
    method: "POST",
    headers: headers,
    body: JSON.stringify({
        model: "gpt-5-2",
        prompt: "Explain quantum computing"
    })
});

const { request_id } = await response.json();
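Because this is an async API, the `request_id` returned above must then be polled via GET /v1/ai to retrieve the result. The loop below is a minimal sketch: the query-parameter name (`request_id`) and the response fields (`status`, value `"completed"`) are assumptions here, so check the polling reference for the exact shapes.

```python
import time

import requests

headers = {"Authorization": "Bearer YOUR_API_KEY"}

def wait_for_result(request_id: str, interval: float = 3.0,
                    timeout: float = 120.0) -> dict:
    """Poll GET /v1/ai until the request completes or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        resp = requests.get(
            "https://app.beginswithai.com/v1/ai",
            params={"request_id": request_id},  # assumed parameter name
            headers=headers,
        )
        resp.raise_for_status()
        data = resp.json()
        if data.get("status") == "completed":   # assumed status field/value
            return data
        time.sleep(interval)
    raise TimeoutError(f"Request {request_id} did not complete in {timeout}s")
```

Choose an `interval` that matches the model's expected processing time (see Processing Time Expectations below): a few seconds for fast models, longer for thinking models.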

Model Selection Examples

Example 1: Generate a Product Description (Fast Task)

Use a light, fast model:

{
    "model": "gemini-2-5-flash",
    "prompt": "Write a 50-word product description for a coffee maker"
}

Processing time: 2-5 seconds

Example 2: Analyze a Legal Contract (Complex Task)

Use a thinking model for thorough analysis:

{
    "model": "gpt-5-2-thinking",
    "prompt": "Identify and summarize all liability clauses in this contract: [contract text]"
}

Processing time: 30-90 seconds (thinking models require extended reasoning)

Example 3: Summarize Long Articles (Large Context)

Use a Pro model for comprehensive handling:

{
    "model": "gemini-3-pro",
    "prompt": "Summarize this 5000-word article: [article text]"
}

Processing time: 8-20 seconds

Example 4: Creative Writing (Standard Model)

Use a general-purpose model for creative tasks:

{
    "model": "grok-4-1-fast-non-reasoning",
    "prompt": "Write a short sci-fi story about an AI discovering consciousness"
}

Processing time: 5-12 seconds


Processing Time Expectations

Important

This is an async API. You submit a request, receive a request_id, and poll for results using GET /v1/ai. Processing times below indicate how long the model takes to generate a response—you will need to poll periodically to retrieve the completed result.

Light Models (Very Fast):

  • Simple prompts: 2-5 seconds
  • Moderate prompts: 5-12 seconds

Standard Models (Fast):

  • Simple prompts: 3-8 seconds
  • Moderate prompts: 8-20 seconds

Thinking Models (Slow):

  • Simple prompts: 15-40 seconds
  • Complex prompts: 30-90 seconds (varies based on reasoning complexity)

Pro Models (Medium):

  • Simple prompts: 4-10 seconds
  • Large context: 12-30 seconds

Note

Processing times vary based on model availability, queue depth, and prompt complexity. Always implement polling with appropriate intervals rather than assuming a fixed completion time.


Error Handling

If you request an unsupported model, you'll receive a 400 Bad Request error:

{
    "error": "invalid_request",
    "message": "Model 'invalid-model' is not supported"
}

Supported model IDs are always listed in the table above. Use the exact Model ID from the table in your requests.
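One way to catch typos before they cost a round trip is to validate the model ID client-side. The set below mirrors the table at the time of writing; since new models are added automatically as they launch, treat a hardcoded list like this as a convenience, not the source of truth.

```python
# Model IDs from the table above (snapshot at time of writing; new models
# may be added to the service that are not listed here).
SUPPORTED_MODELS = {
    "gpt-5-2", "gpt-5-2-thinking",
    "grok-4-1-fast-non-reasoning", "grok-4-1-fast-reasoning",
    "gemini-3-pro", "gemini-2-5-pro",
    "gemini-2-5-flash", "gemini-3-flash",
    "deepseek-v3-2",
}

def validate_model(model_id: str) -> str:
    """Raise ValueError for a model ID not in the snapshot above."""
    if model_id not in SUPPORTED_MODELS:
        raise ValueError(f"Model '{model_id}' is not supported")
    return model_id
```

This fails locally with the same message the API's 400 response would carry, which makes the error easier to surface during development.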

See Error Handling for more details.