Supported Models

This page lists all supported models for our async API, which uses a queue-and-poll flow. Use this reference to pick the model that fits your needs.

All models are available on all tiers (Free, Starter, Scale, Enterprise).

New models are added automatically as they launch.

Available Models

| Model | Model ID | Best For | Processing Speed |
| --- | --- | --- | --- |
| GPT-5.2 | gpt-5-2 | General purpose, fast responses | Fast |
| GPT-5.2 Thinking | gpt-5-2-thinking | Complex reasoning, deep analysis | Slow |
| Grok-4.1 | grok-4-1-fast-non-reasoning | General purpose, creative tasks | Fast |
| Grok-4.1 Thinking | grok-4-1-fast-reasoning | Extended reasoning, problem solving | Slow |
| Gemini 3 Pro | gemini-3-pro | Long context, comprehensive analysis | Medium |
| Gemini 2.5 Pro | gemini-2-5-pro | Balanced reasoning and speed | Medium |
| Gemini 2.5 Flash | gemini-2-5-flash | Quick responses, simple tasks | Very Fast |
| Gemini 3 Flash | gemini-3-flash | General purpose, high efficiency | Very Fast |
| DeepSeek v3.2 | deepseek-v3-2 | General purpose, versatile tasks | Fast |

Understanding 'Thinking' Models

Model IDs containing "-thinking" or "-reasoning" are convenience identifiers that configure the underlying model with provider-specific parameters to enable extended reasoning.

How it works:

  • OpenAI models (GPT-5.2): The base model is unified. When you use gpt-5-2-thinking, we set the reasoning.effort parameter to a higher value (e.g., medium or high). With gpt-5-2, we use reasoning.effort: none for faster responses.

  • Google models (Gemini): Similar approach with Google's native thinking mode parameters.

  • xAI models (Grok): Grok provides separate model variants: grok-4-1-fast-non-reasoning for standard tasks and grok-4-1-fast-reasoning for reasoning tasks.

For users familiar with official APIs: If you've been using OpenAI's or Google's official APIs directly, you may be accustomed to controlling reasoning through API parameters (like reasoning.effort in GPT-5.2). Our Model ID abstraction simplifies this—just choose the "-thinking" or "-reasoning" variant to enable extended reasoning, without needing to configure provider-specific parameters yourself.
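Conceptually, the abstraction above is a lookup from a convenience Model ID to a provider-specific payload. The sketch below is illustrative only, not our actual implementation: the exact `effort` values and payload shapes we send to each provider are assumptions.

```python
# Hypothetical mapping from convenience Model IDs (from the table above)
# to the provider payloads they expand into. Parameter values are assumed
# for illustration; only the Model IDs themselves come from this page.
REASONING_CONFIG = {
    # OpenAI: one unified base model; reasoning.effort toggles thinking
    "gpt-5-2":          {"model": "gpt-5-2", "reasoning": {"effort": "none"}},
    "gpt-5-2-thinking": {"model": "gpt-5-2", "reasoning": {"effort": "high"}},
    # xAI: separate model variants, so no extra parameter is needed
    "grok-4-1-fast-non-reasoning": {"model": "grok-4-1-fast-non-reasoning"},
    "grok-4-1-fast-reasoning":     {"model": "grok-4-1-fast-reasoning"},
}

def resolve(model_id: str) -> dict:
    """Return the provider payload a convenience Model ID would expand into."""
    return REASONING_CONFIG[model_id]
```

The practical upshot: you never set `reasoning.effort` yourself; picking the "-thinking" or "-reasoning" ID does it for you.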


How to Choose

For Balanced Speed and Quality: Use Standard Models

If you need responses quickly and your task is straightforward:

  • GPT-5.2 (gpt-5-2)
  • Grok-4.1 (grok-4-1-fast-non-reasoning)
  • Gemini 2.5 Flash (gemini-2-5-flash)
  • Gemini 3 Flash (gemini-3-flash)
  • DeepSeek v3.2 (deepseek-v3-2)

These models process quickly, making them suitable for most general-purpose applications.

For Complex Tasks: Use Thinking Models

If you need deep reasoning, complex problem solving, or analysis:

  • GPT-5.2 Thinking (gpt-5-2-thinking)
  • Grok-4.1 Thinking (grok-4-1-fast-reasoning)

Thinking models take significantly longer because they perform extended reasoning before responding. Use these when quality matters more than speed.

For Long Context: Use Pro Models

If you're working with large documents or need comprehensive analysis:

  • Gemini 3 Pro (gemini-3-pro)
  • Gemini 2.5 Pro (gemini-2-5-pro)

These handle larger context windows and longer tasks effectively.
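The decision rules above can be condensed into a small helper. This is an illustrative convenience (not part of the API); the defaults chosen here are one reasonable reading of the guidance, not the only one.

```python
# Illustrative model chooser based on the "How to Choose" guidance above.
# Model IDs come from the table on this page; the selection policy is assumed.
def choose_model(task: str = "general", long_context: bool = False) -> str:
    if long_context:
        return "gemini-3-pro"      # large documents, comprehensive analysis
    if task == "reasoning":
        return "gpt-5-2-thinking"  # quality over speed
    if task == "quick":
        return "gemini-3-flash"    # very fast, simple prompts
    return "gpt-5-2"               # general-purpose default
```

For example, `choose_model(task="reasoning")` returns `gpt-5-2-thinking`, while `choose_model(long_context=True)` returns `gemini-3-pro`.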


Using a Model

Specify the model ID in your request:

Python
import requests

headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

response = requests.post(
    "https://app.beginswithai.com/v1/ai",
    json={
        "model": "gpt-5-2",
        "prompt": "Explain quantum computing"
    },
    headers=headers
)

request_id = response.json()["request_id"]

JavaScript
const headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
};

const response = await fetch("https://app.beginswithai.com/v1/ai", {
    method: "POST",
    headers: headers,
    body: JSON.stringify({
        model: "gpt-5-2",
        prompt: "Explain quantum computing"
    })
});

const { request_id } = await response.json();
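Because this is an async API, the `request_id` returned above must then be polled via GET /v1/ai to retrieve the result. The loop below is a minimal sketch: the query-parameter name (`request_id`) and the response fields (`status`, value `"completed"`) are assumptions here, so check the polling reference for the exact shapes.

```python
import time

import requests

headers = {"Authorization": "Bearer YOUR_API_KEY"}

def wait_for_result(request_id: str, interval: float = 3.0,
                    timeout: float = 120.0) -> dict:
    """Poll GET /v1/ai until the request completes or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        resp = requests.get(
            "https://app.beginswithai.com/v1/ai",
            params={"request_id": request_id},  # assumed parameter name
            headers=headers,
        )
        resp.raise_for_status()
        data = resp.json()
        if data.get("status") == "completed":   # assumed status field/value
            return data
        time.sleep(interval)
    raise TimeoutError(f"Request {request_id} did not complete in {timeout}s")
```

Choose an `interval` that matches the model's expected processing time (see Processing Time Expectations below): a few seconds for fast models, longer for thinking models.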

Model Selection Examples

Example 1: Generate a Product Description (Fast Task)

Use a light, fast model:

{
    "model": "gemini-2-5-flash",
    "prompt": "Write a 50-word product description for a coffee maker"
}

Processing time: 2-5 seconds

Example 2: Analyze a Legal Contract (Complex Task)

Use a thinking model for thorough analysis:

{
    "model": "gpt-5-2-thinking",
    "prompt": "Identify and summarize all liability clauses in this contract: [contract text]"
}

Processing time: 30-90 seconds (thinking models require extended reasoning)

Example 3: Summarize Long Articles (Large Context)

Use a Pro model for comprehensive handling:

{
    "model": "gemini-3-pro",
    "prompt": "Summarize this 5000-word article: [article text]"
}

Processing time: 8-20 seconds

Example 4: Creative Writing (Standard Model)

Use a general-purpose model for creative tasks:

{
    "model": "grok-4-1-fast-non-reasoning",
    "prompt": "Write a short sci-fi story about an AI discovering consciousness"
}

Processing time: 5-12 seconds


Processing Time Expectations

Important

This is an async API. You submit a request, receive a request_id, and poll for results using GET /v1/ai. Processing times below indicate how long the model takes to generate a response—you will need to poll periodically to retrieve the completed result.

Light Models (Very Fast):

  • Simple prompts: 2-5 seconds
  • Moderate prompts: 5-12 seconds

Standard Models (Fast):

  • Simple prompts: 3-8 seconds
  • Moderate prompts: 8-20 seconds

Thinking Models (Slow):

  • Simple prompts: 15-40 seconds
  • Complex prompts: 30-90 seconds (varies based on reasoning complexity)

Pro Models (Medium):

  • Simple prompts: 4-10 seconds
  • Large context: 12-30 seconds

Note

Processing times vary based on model availability, queue depth, and prompt complexity. Always implement polling with appropriate intervals rather than assuming a fixed completion time.


Error Handling

If you request an unsupported model, you'll receive a 400 Bad Request error:

{
    "error": "invalid_request",
    "message": "Model 'invalid-model' is not supported"
}

Supported model IDs are always listed in the table above. Use the exact Model ID from the table in your requests.
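One way to catch typos before they cost a round trip is to validate the model ID client-side. The set below mirrors the table at the time of writing; since new models are added automatically as they launch, treat a hardcoded list like this as a convenience, not the source of truth.

```python
# Model IDs from the table above (snapshot at time of writing; new models
# may be added to the service that are not listed here).
SUPPORTED_MODELS = {
    "gpt-5-2", "gpt-5-2-thinking",
    "grok-4-1-fast-non-reasoning", "grok-4-1-fast-reasoning",
    "gemini-3-pro", "gemini-2-5-pro",
    "gemini-2-5-flash", "gemini-3-flash",
    "deepseek-v3-2",
}

def validate_model(model_id: str) -> str:
    """Raise ValueError for a model ID not in the snapshot above."""
    if model_id not in SUPPORTED_MODELS:
        raise ValueError(f"Model '{model_id}' is not supported")
    return model_id
```

This fails locally with the same message the API's 400 response would carry, which makes the error easier to surface during development.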

See Error Handling for more details.