# Supported Models
This page lists all supported models for our async API, which uses a queue-and-poll flow. Use this reference to pick the model that fits your needs.
All models are available on all tiers (Free, Starter, Scale, Enterprise).
New models are added automatically as they launch.
## Available Models
| Model | Model ID | Best For | Processing Speed |
|---|---|---|---|
| GPT-5.2 | `gpt-5-2` | General purpose, fast responses | Fast |
| GPT-5.2 Thinking | `gpt-5-2-thinking` | Complex reasoning, deep analysis | Slow |
| Grok-4.1 | `grok-4-1-fast-non-reasoning` | General purpose, creative tasks | Fast |
| Grok-4.1 Thinking | `grok-4-1-fast-reasoning` | Extended reasoning, problem solving | Slow |
| Gemini 3 Pro | `gemini-3-pro` | Long context, comprehensive analysis | Medium |
| Gemini 2.5 Pro | `gemini-2-5-pro` | Balanced reasoning and speed | Medium |
| Gemini 2.5 Flash | `gemini-2-5-flash` | Quick responses, simple tasks | Very Fast |
| DeepSeek v3.2 | `deepseek-v3-2` | General purpose, versatile tasks | Fast |
| Gemini 3 Flash | `gemini-3-flash` | General purpose, high efficiency | Very Fast |
## Understanding 'Thinking' Models
Model IDs containing "-thinking" or "-reasoning" are convenience identifiers that configure the underlying model with provider-specific parameters to enable extended reasoning.
**How it works:**

- **OpenAI models (GPT-5.2):** The base model is unified. When you use `gpt-5-2-thinking`, we set the `reasoning.effort` parameter to a higher value (e.g., `medium` or `high`). With `gpt-5-2`, we use `reasoning.effort: none` for faster responses.
- **Google models (Gemini):** Similar approach with Google's native thinking mode parameters.
- **xAI models (Grok):** Grok provides separate model variants: `grok-4-1-fast-non-reasoning` for standard tasks and `grok-4-1-fast-reasoning` for reasoning tasks.
**For users familiar with official APIs:** If you've been using OpenAI's or Google's official APIs directly, you may be accustomed to controlling reasoning through API parameters (like `reasoning.effort` in GPT-5.2). Our Model ID abstraction simplifies this: just choose the "-thinking" or "-reasoning" variant to enable extended reasoning, without needing to configure provider-specific parameters yourself.
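The mapping described above can be sketched as a small helper. This is an illustrative sketch only, not our actual implementation; the function name, return shape, and the specific `effort` value chosen for thinking mode are assumptions:

```python
def reasoning_settings(model_id: str) -> dict:
    """Illustrative mapping from a public Model ID to hypothetical
    provider-side settings (names and values are assumptions)."""
    if model_id.startswith("gpt-5-2"):
        # OpenAI: one base model, toggled via the reasoning.effort parameter
        effort = "high" if model_id.endswith("-thinking") else "none"
        return {"provider_model": "gpt-5-2", "reasoning": {"effort": effort}}
    if model_id.startswith("grok-4-1"):
        # xAI: separate model variants, so the ID passes through unchanged
        return {"provider_model": model_id}
    raise ValueError(f"unrecognized model ID: {model_id!r}")
```

The point is that the "-thinking"/"-reasoning" suffix is the only knob you turn; provider differences stay behind the Model ID.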
## How to Choose
### For Balanced Speed and Quality: Use Standard Models
If you need responses quickly and your task is straightforward:
- GPT-5.2 (`gpt-5-2`)
- Grok-4.1 (`grok-4-1-fast-non-reasoning`)
- Gemini 2.5 Flash (`gemini-2-5-flash`)
- Gemini 3 Flash (`gemini-3-flash`)
- DeepSeek v3.2 (`deepseek-v3-2`)
These models process quickly, making them suitable for most general-purpose applications.
### For Complex Tasks: Use Thinking Models
If you need deep reasoning, complex problem solving, or analysis:
- GPT-5.2 Thinking (`gpt-5-2-thinking`)
- Grok-4.1 Thinking (`grok-4-1-fast-reasoning`)
Thinking models take significantly longer because they perform extended reasoning before responding. Use these when quality matters more than speed.
### For Long Context: Use Pro Models
If you're working with large documents or need comprehensive analysis:
- Gemini 3 Pro (`gemini-3-pro`)
- Gemini 2.5 Pro (`gemini-2-5-pro`)
These handle larger context windows and longer tasks effectively.
## Using a Model
Specify the model ID in your request:
**Python**

```python
import requests

headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

response = requests.post(
    "https://app.beginswithai.com/v1/ai",
    json={
        "model": "gpt-5-2",
        "prompt": "Explain quantum computing"
    },
    headers=headers
)

# The async API returns a request ID to poll with, not the final answer
request_id = response.json()["request_id"]
**JavaScript**

```javascript
const headers = {
  "Authorization": "Bearer YOUR_API_KEY",
  "Content-Type": "application/json"
};

const response = await fetch("https://app.beginswithai.com/v1/ai", {
  method: "POST",
  headers: headers,
  body: JSON.stringify({
    model: "gpt-5-2",
    prompt: "Explain quantum computing"
  })
});

// The async API returns a request ID to poll with, not the final answer
const { request_id } = await response.json();
```
## Model Selection Examples
### Example 1: Generate a Product Description (Fast Task)
Use a light, fast model:
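An illustrative request body, following the pattern of the other examples (the model choice and prompt text here are examples, not requirements):

```json
{
  "model": "gemini-2-5-flash",
  "prompt": "Write a 50-word product description for a stainless steel water bottle"
}
```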
Processing time: 2-5 seconds
### Example 2: Analyze Legal Documents (Complex Task)
Use a thinking model for thorough analysis:
```json
{
  "model": "gpt-5-2-thinking",
  "prompt": "Identify and summarize all liability clauses in this contract: [contract text]"
}
```
Processing time: 30-90 seconds (thinking models require extended reasoning)
### Example 3: Summarize Long Articles (Large Context)
Use a Pro model for comprehensive handling:
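An illustrative request body, following the pattern of the other examples (the model choice and prompt text here are examples, not requirements):

```json
{
  "model": "gemini-3-pro",
  "prompt": "Summarize the key arguments in this article: [article text]"
}
```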
Processing time: 8-20 seconds
### Example 4: Creative Writing (Standard Model)
Use a general-purpose model for creative tasks:
```json
{
  "model": "grok-4-1-fast-non-reasoning",
  "prompt": "Write a short sci-fi story about an AI discovering consciousness"
}
```
Processing time: 5-12 seconds
## Processing Time Expectations
> **Important:** This is an async API. You submit a request, receive a `request_id`, and poll for results using `GET /v1/ai`. The processing times below indicate how long the model takes to generate a response; you will need to poll periodically to retrieve the completed result.
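The submit-then-poll flow can be sketched with a small retry loop. This is a sketch, not an official client: the status check is injected as a callable so the loop is independent of HTTP details, and the status values (`"pending"`, `"completed"`) and response shape are assumptions about what `GET /v1/ai` returns:

```python
import time

def poll_result(check_status, max_attempts=30, delay_seconds=2.0, sleep=time.sleep):
    """Call check_status() until it reports completion or attempts run out.

    check_status() should return a dict such as {"status": ..., "result": ...}
    (field names assumed; adjust to the actual API response).
    """
    for _ in range(max_attempts):
        payload = check_status()
        if payload.get("status") == "completed":
            return payload
        sleep(delay_seconds)  # wait between polls; consider backoff for slow models
    raise TimeoutError("request did not complete within the polling window")
```

With `requests`, `check_status` could be a lambda that issues the `GET /v1/ai` call with your `request_id` and returns the parsed JSON. For thinking models, a longer `delay_seconds` and higher `max_attempts` match the 30-90 second processing times noted below.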
Light Models (Fast):
- Simple prompts: 2-5 seconds
- Moderate prompts: 5-12 seconds
Standard Models (Medium):
- Simple prompts: 3-8 seconds
- Moderate prompts: 8-20 seconds
Thinking Models (Slow):
- Simple prompts: 15-40 seconds
- Complex prompts: 30-90 seconds (varies based on reasoning complexity)
Pro Models (Medium):
- Simple prompts: 4-10 seconds
- Large context: 12-30 seconds
> **Note:** Processing times vary based on model availability, queue depth, and prompt complexity. Always implement polling with appropriate intervals rather than assuming a fixed completion time.
## Error Handling
If you request an unsupported model, you'll receive a `400 Bad Request` error. Only the Model IDs listed in the table above are supported; use the exact Model ID from the table in your requests.
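A client-side pre-check against the table above can catch typos before a request is submitted and rejected with a 400. This set is hand-copied from the table, not fetched from the API, so it will need updating as new models launch:

```python
# Model IDs copied from the table above; keep in sync as models are added.
SUPPORTED_MODEL_IDS = {
    "gpt-5-2", "gpt-5-2-thinking",
    "grok-4-1-fast-non-reasoning", "grok-4-1-fast-reasoning",
    "gemini-3-pro", "gemini-2-5-pro", "gemini-2-5-flash", "gemini-3-flash",
    "deepseek-v3-2",
}

def validate_model_id(model_id: str) -> None:
    """Raise ValueError locally instead of waiting for a 400 from the API."""
    if model_id not in SUPPORTED_MODEL_IDS:
        raise ValueError(f"unsupported model ID: {model_id!r}")
```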
See Error Handling for more details.