How Skyvern uses LLMs
Skyvern makes multiple LLM calls per task step:

- Screenshot analysis: Identify interactive elements on the page
- Action planning: Decide what to click, type, or extract
- Result extraction: Parse data from the page into structured output

The model used for these calls is selected with the LLM_KEY environment variable. Skyvern also supports a SECONDARY_LLM_KEY for lighter tasks to reduce costs.
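A minimal sketch of the two settings in `.env`, assuming the OpenAI provider is enabled (see the provider sections below for the full list of model keys):

```shell
LLM_KEY=OPENAI_GPT5                 # primary model for planning and extraction
SECONDARY_LLM_KEY=OPENAI_GPT5_MINI  # optional cheaper model for lighter calls
```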
Quick Start Recommendations
Best models for production (2025):

| Provider | Primary Model | Secondary Model | Notes |
|---|---|---|---|
| Anthropic | ANTHROPIC_CLAUDE4.5_OPUS | ANTHROPIC_CLAUDE4.5_SONNET | Most capable |
| OpenAI | OPENAI_GPT5 | OPENAI_GPT5_MINI | Latest |
| Google Gemini | GEMINI_3_PRO | GEMINI_3.0_FLASH | Latest |
| AWS Bedrock | BEDROCK_ANTHROPIC_CLAUDE4.5_OPUS_INFERENCE_PROFILE | BEDROCK_ANTHROPIC_CLAUDE4.5_SONNET_INFERENCE_PROFILE | Latest Claude |
OpenAI
The most common choice. Requires an API key from platform.openai.com.
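A typical `.env` sketch (the ENABLE_OPENAI flag appears in the troubleshooting section of this guide; verify the API key variable name against your Skyvern sample `.env`):

```shell
ENABLE_OPENAI=true
LLM_KEY=OPENAI_GPT5
SECONDARY_LLM_KEY=OPENAI_GPT5_MINI
OPENAI_API_KEY=sk-...
```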
Available models
| LLM_KEY | Notes |
|---|---|
| GPT-5 Series | |
| OPENAI_GPT5 | Recommended for most complex tasks |
| OPENAI_GPT5_MINI | |
| OPENAI_GPT5_MINI_FLEX | Flex service tier, 15-minute timeout |
| OPENAI_GPT5_NANO | |
| OPENAI_GPT5_1 | |
| OPENAI_GPT5_2 | |
| OPENAI_GPT5_4 | |
| GPT-4 Series | |
| OPENAI_GPT4O | |
| OPENAI_GPT4O_MINI | |
| OPENAI_GPT4_1 | |
| OPENAI_GPT4_1_MINI | |
| OPENAI_GPT4_1_NANO | |
| OPENAI_GPT4_5 | |
| OPENAI_GPT4_TURBO | Legacy |
| OPENAI_GPT4V | Legacy alias |
| O-Series (Reasoning) | |
| OPENAI_O4_MINI | Vision support |
| OPENAI_O3 | Vision support |
| OPENAI_O3_MINI | No vision |
Anthropic
Claude models from anthropic.com.
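A `.env` sketch, assuming Skyvern's usual `ENABLE_<PROVIDER>` plus API key pattern (flag and key variable names are assumptions; verify against the sample `.env`):

```shell
ENABLE_ANTHROPIC=true               # assumed flag name
LLM_KEY=ANTHROPIC_CLAUDE4.5_OPUS
SECONDARY_LLM_KEY=ANTHROPIC_CLAUDE4.5_SONNET
ANTHROPIC_API_KEY=sk-ant-...        # assumed variable name
```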
Available models
| LLM_KEY | Notes |
|---|---|
| Claude 4.6 | |
| ANTHROPIC_CLAUDE4.6_OPUS | Newest |
| Claude 4.5 | |
| ANTHROPIC_CLAUDE4.5_OPUS | Recommended for primary use |
| ANTHROPIC_CLAUDE4.5_SONNET | Recommended for secondary use |
| ANTHROPIC_CLAUDE4.5_HAIKU | Fastest |
| Claude 4 | |
| ANTHROPIC_CLAUDE4_OPUS | |
| ANTHROPIC_CLAUDE4_SONNET | |
| Claude 3.7 | |
| ANTHROPIC_CLAUDE3.7_SONNET | |
| Claude 3.5 | |
| ANTHROPIC_CLAUDE3.5_SONNET | |
| ANTHROPIC_CLAUDE3.5_HAIKU | |
| Claude 3 (Legacy) | |
| ANTHROPIC_CLAUDE3_OPUS | |
| ANTHROPIC_CLAUDE3_SONNET | |
| ANTHROPIC_CLAUDE3_HAIKU | |
Azure OpenAI
Microsoft-hosted OpenAI models. Requires an Azure subscription with the OpenAI service provisioned.
Setup steps
- Create an Azure OpenAI resource in the Azure Portal
- Open the Azure AI Foundry portal from your resource’s overview page
- Go to Shared Resources → Deployments
- Click Deploy Model → Deploy Base Model → select GPT-4o or GPT-4
- Note the Deployment Name. Use this for AZURE_DEPLOYMENT
- Copy your API key and endpoint from the Azure Portal
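A `.env` sketch for Azure. AZURE_DEPLOYMENT and AZURE_API_BASE appear elsewhere in this guide; the flag, key, and version variable names are assumptions to verify against the sample `.env`:

```shell
ENABLE_AZURE=true                                    # assumed flag name
LLM_KEY=AZURE_OPENAI_GPT4O                           # assumed key name
AZURE_DEPLOYMENT=my-gpt4o-deployment                 # your deployment name, not the model name
AZURE_API_KEY=...
AZURE_API_BASE=https://your-resource.openai.azure.com/
AZURE_API_VERSION=2024-02-01                         # assumed; use your resource's API version
```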
The AZURE_DEPLOYMENT is the name you chose when deploying the model, not the model name itself.

Google Gemini
Skyvern supports Gemini through two paths: the Gemini API (simpler, uses an API key) and Vertex AI (enterprise, uses a GCP service account).

Gemini API
The quickest way to use Gemini. Get an API key from Google AI Studio.
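A `.env` sketch (the flag and API key variable names are assumptions following Skyvern's usual pattern; verify against the sample `.env`):

```shell
ENABLE_GEMINI=true                  # assumed flag name
LLM_KEY=GEMINI_3_PRO
SECONDARY_LLM_KEY=GEMINI_3.0_FLASH
GEMINI_API_KEY=...                  # from Google AI Studio; assumed variable name
```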
Available Gemini API models
| LLM_KEY | Notes |
|---|---|
| Gemini 3 | |
| GEMINI_3_PRO | Recommended for primary use |
| GEMINI_3.0_FLASH | Recommended for secondary use |
| Gemini 2.5 | |
| GEMINI_2.5_PRO | |
| GEMINI_2.5_PRO_PREVIEW | |
| GEMINI_2.5_PRO_EXP_03_25 | Experimental |
| GEMINI_2.5_FLASH | |
| GEMINI_2.5_FLASH_PREVIEW | |
| Gemini 2.0 | |
| GEMINI_FLASH_2_0 | |
| GEMINI_FLASH_2_0_LITE | |
| Gemini 1.5 (Legacy) | |
| GEMINI_PRO | |
| GEMINI_FLASH | |
Vertex AI
For enterprise deployments through Vertex AI. Requires a GCP project with Vertex AI enabled.

If you're migrating from an older Skyvern version, VERTEX_LOCATION replaces the previous GCP_REGION variable. Update your .env accordingly.

Setup steps
- Create a GCP project with billing enabled
- Enable the Vertex AI API in your project
- Create a service account with the Vertex AI User role
- Download the service account JSON key file
- Set GOOGLE_APPLICATION_CREDENTIALS to the path of that file
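The steps above can be sketched as a `.env` fragment. VERTEX_PROJECT_ID, VERTEX_LOCATION, and GOOGLE_APPLICATION_CREDENTIALS appear in this guide; the flag name is an assumption:

```shell
ENABLE_VERTEX_AI=true               # assumed flag name
LLM_KEY=VERTEX_GEMINI_3_PRO
VERTEX_PROJECT_ID=my-gcp-project
VERTEX_LOCATION=us-central1
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
```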
For global endpoint access, set VERTEX_LOCATION=global and ensure VERTEX_PROJECT_ID is set. Not all models support the global endpoint.

Available Vertex AI models
| LLM_KEY | Notes |
|---|---|
| Gemini 3 | |
| VERTEX_GEMINI_3_PRO | Recommended for primary use |
| VERTEX_GEMINI_3.0_FLASH | Recommended for secondary use |
| Gemini 2.5 | |
| VERTEX_GEMINI_2.5_PRO | |
| VERTEX_GEMINI_2.5_PRO_PREVIEW | |
| VERTEX_GEMINI_2.5_FLASH | |
| VERTEX_GEMINI_2.5_FLASH_LITE | |
| VERTEX_GEMINI_2.5_FLASH_PREVIEW | |
| Gemini 2.0 | |
| VERTEX_GEMINI_FLASH_2_0 | |
| Gemini 1.5 (Legacy) | |
| VERTEX_GEMINI_PRO | |
| VERTEX_GEMINI_FLASH | |
Amazon Bedrock
Run Anthropic Claude models through your AWS account.
Setup steps
- Create an IAM user with the AmazonBedrockFullAccess policy
- Generate access keys for the IAM user
- In the Bedrock console, go to Model Access
- Enable access to the Claude models you plan to use
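A `.env` sketch for Bedrock. AWS_REGION appears later in this section; the flag name and credential variable names are assumptions to verify against the sample `.env`:

```shell
ENABLE_BEDROCK=true                 # assumed flag name
LLM_KEY=BEDROCK_ANTHROPIC_CLAUDE4.5_OPUS_INFERENCE_PROFILE
AWS_REGION=us-east-1
AWS_ACCESS_KEY_ID=...               # omit when using instance-role auth
AWS_SECRET_ACCESS_KEY=...
```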
Available models
| LLM_KEY | Notes |
|---|---|
| Amazon Nova (AWS Native) | |
| BEDROCK_AMAZON_NOVA_PRO | |
| BEDROCK_AMAZON_NOVA_LITE | |
| Claude 4.6 | |
| BEDROCK_ANTHROPIC_CLAUDE4.6_OPUS_INFERENCE_PROFILE | Cross-region |
| Claude 4.5 | |
| BEDROCK_ANTHROPIC_CLAUDE4.5_OPUS_INFERENCE_PROFILE | Cross-region |
| BEDROCK_ANTHROPIC_CLAUDE4.5_SONNET_INFERENCE_PROFILE | Cross-region |
| Claude 4 | |
| BEDROCK_ANTHROPIC_CLAUDE4_OPUS_INFERENCE_PROFILE | Cross-region |
| BEDROCK_ANTHROPIC_CLAUDE4_SONNET_INFERENCE_PROFILE | Cross-region |
| Claude 3.7 | |
| BEDROCK_ANTHROPIC_CLAUDE3.7_SONNET_INFERENCE_PROFILE | Cross-region |
| Claude 3.5 | |
| BEDROCK_ANTHROPIC_CLAUDE3.5_SONNET | v2 |
| BEDROCK_ANTHROPIC_CLAUDE3.5_SONNET_V1 | |
| BEDROCK_ANTHROPIC_CLAUDE3.5_SONNET_INFERENCE_PROFILE | Cross-region |
| BEDROCK_ANTHROPIC_CLAUDE3.5_HAIKU | |
| Claude 3 (Legacy) | |
| BEDROCK_ANTHROPIC_CLAUDE3_OPUS | |
| BEDROCK_ANTHROPIC_CLAUDE3_SONNET | |
| BEDROCK_ANTHROPIC_CLAUDE3_HAIKU | |
Bedrock inference profile keys (*_INFERENCE_PROFILE) use cross-region inference and require AWS_REGION only. No access keys are needed if running on an IAM-authenticated instance.

MiniMax
MiniMax models with vision support.
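A `.env` sketch, assuming MiniMax follows Skyvern's usual provider pattern (both variable names are assumptions):

```shell
ENABLE_MINIMAX=true                 # assumed flag name
LLM_KEY=MINIMAX_M2_5
MINIMAX_API_KEY=...                 # assumed variable name
```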
Available models
| LLM_KEY | Notes |
|---|---|
| MINIMAX_M2_5 | |
| MINIMAX_M2_5_HIGHSPEED | Faster variant |
VolcEngine (ByteDance Doubao)
VolcEngine provides access to ByteDance's Doubao models with vision support.
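A `.env` sketch, assuming the usual provider pattern (both variable names are assumptions):

```shell
ENABLE_VOLCENGINE=true              # assumed flag name
LLM_KEY=VOLCENGINE_DOUBAO_SEED_1_6
VOLCENGINE_API_KEY=...              # assumed variable name
```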
Available models
| LLM_KEY | Notes |
|---|---|
| VOLCENGINE_DOUBAO_SEED_1_6 | Recommended for general use |
| VOLCENGINE_DOUBAO_SEED_1_6_FLASH | Faster variant |
| VOLCENGINE_DOUBAO_1_5_THINKING_VISION_PRO | Reasoning model |
Novita
Novita AI provides access to DeepSeek, Llama, and other open-source models.
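A `.env` sketch, assuming the usual provider pattern (both variable names are assumptions):

```shell
ENABLE_NOVITA=true                  # assumed flag name
LLM_KEY=NOVITA_DEEPSEEK_V3
NOVITA_API_KEY=...                  # assumed variable name
```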
Available models
| LLM_KEY | Notes |
|---|---|
| DeepSeek | |
| NOVITA_DEEPSEEK_R1 | Reasoning model |
| NOVITA_DEEPSEEK_V3 | |
| Llama 3.3 | |
| NOVITA_LLAMA_3_3_70B | |
| Llama 3.2 | |
| NOVITA_LLAMA_3_2_11B_VISION | Vision support |
| NOVITA_LLAMA_3_2_3B | |
| NOVITA_LLAMA_3_2_1B | |
| Llama 3.1 | |
| NOVITA_LLAMA_3_1_405B | |
| NOVITA_LLAMA_3_1_70B | |
| NOVITA_LLAMA_3_1_8B | |
| Llama 3 | |
| NOVITA_LLAMA_3_70B | |
| NOVITA_LLAMA_3_8B | |
Moonshot
Moonshot AI provides the Kimi series models with long-context support.
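A `.env` sketch, assuming the usual provider pattern (both variable names are assumptions):

```shell
ENABLE_MOONSHOT=true                # assumed flag name
LLM_KEY=MOONSHOT_KIMI_K2
MOONSHOT_API_KEY=...                # assumed variable name
```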
Available models
| LLM_KEY | Notes |
|---|---|
| MOONSHOT_KIMI_K2 | |
Inception
Inception AI provides the Mercury series models.
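A `.env` sketch, assuming the usual provider pattern (both variable names are assumptions):

```shell
ENABLE_INCEPTION=true               # assumed flag name
LLM_KEY=INCEPTION_MERCURY_2
INCEPTION_API_KEY=...               # assumed variable name
```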
Available models
| LLM_KEY | Notes |
|---|---|
| INCEPTION_MERCURY_2 | |
Ollama (Local Models)
Run open-source models locally with Ollama. No API costs, but requires sufficient local compute.
Setup steps
- Install Ollama
- Pull a model: ollama pull llama3.1
- Start Ollama: ollama serve
- Configure Skyvern to connect
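The steps above can be sketched as a `.env` fragment. OLLAMA_SERVER_URL appears in the Docker networking table below; the flag, key, and model variable names are assumptions:

```shell
ENABLE_OLLAMA=true                  # assumed flag name
LLM_KEY=OLLAMA                      # assumed key name
OLLAMA_SERVER_URL=http://localhost:11434
OLLAMA_MODEL=llama3.1               # assumed variable name
```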
Docker networking
When running Skyvern in Docker and Ollama on the host:

| Host OS | OLLAMA_SERVER_URL |
|---|---|
| macOS/Windows | http://host.docker.internal:11434 |
| Linux | http://172.17.0.1:11434 (Docker bridge IP) |
OpenAI-Compatible Endpoints
Connect to any service that implements the OpenAI API format, including LiteLLM, LocalAI, vLLM, and text-generation-inference. Common use cases:

- Running local models with a unified API
- Using LiteLLM as a proxy to switch between providers
- Connecting to self-hosted inference servers
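A hypothetical `.env` sketch for an OpenAI-compatible endpoint; every variable name here is an assumption illustrating the pattern, so check the sample `.env` for the real ones:

```shell
ENABLE_OPENAI_COMPATIBLE=true                        # assumed flag name
LLM_KEY=OPENAI_COMPATIBLE                            # assumed key name
OPENAI_COMPATIBLE_API_BASE=http://localhost:4000/v1  # your endpoint; assumed variable name
OPENAI_COMPATIBLE_API_KEY=...                        # assumed variable name
OPENAI_COMPATIBLE_MODEL_NAME=...                     # assumed variable name
```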
OpenRouter
Access multiple models through a single API at openrouter.ai.
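A `.env` sketch, assuming the usual provider pattern (both variable names are assumptions):

```shell
ENABLE_OPENROUTER=true              # assumed flag name
OPENROUTER_API_KEY=sk-or-...        # from openrouter.ai; assumed variable name
```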
Groq
Inference on open-source models at groq.com.
Groq specializes in fast inference for open-source models. Response times are typically much faster than other providers, but model selection is limited.
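A `.env` sketch, assuming the usual provider pattern (both variable names are assumptions):

```shell
ENABLE_GROQ=true                    # assumed flag name
GROQ_API_KEY=gsk_...                # from console.groq.com; assumed variable name
```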
Using multiple models
Primary and secondary models
Configure a cheaper model for lightweight operations:
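For example, pairing a capable primary model with a faster, cheaper secondary one (both keys appear in the Anthropic model table above):

```shell
LLM_KEY=ANTHROPIC_CLAUDE4.5_OPUS            # heavy lifting: planning, extraction
SECONDARY_LLM_KEY=ANTHROPIC_CLAUDE4.5_HAIKU # lighter, cheaper calls
```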
Task-specific models
For fine-grained control, you can override models for specific operations. These overrides take precedence over LLM_KEY and SECONDARY_LLM_KEY.
Troubleshooting
“To enable svg shape conversion, please set the Secondary LLM key”
Some operations require a secondary model. Set SECONDARY_LLM_KEY in your environment:
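For example, assuming the OpenAI provider is already enabled:

```shell
SECONDARY_LLM_KEY=OPENAI_GPT5_MINI
```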
“Context window exceeded”
The page content is too large for the model's context window. Options:

- Use a model with larger context support (GPT-5, Gemini 2.5 Pro, or Claude 4.5 Sonnet)
- Simplify your prompt to require less page analysis
- Start from a more specific URL with less content
“LLM caller not found”
The configured LLM_KEY doesn't match any enabled provider. Verify:

- The provider is enabled (ENABLE_OPENAI=true, etc.)
- The LLM_KEY value matches a supported model name exactly
- Model names are case-sensitive: OPENAI_GPT4O, not openai_gpt4o
Container logs show authentication errors
Check your API key configuration:

- Ensure the key is set correctly without extra whitespace
- Verify the key hasn’t expired or been revoked
- For Azure, ensure AZURE_API_BASE includes the full URL with https://
Slow response times
LLM calls typically take 2-10 seconds. Longer times may indicate:

- Network latency to the provider
- Rate limiting (the provider may be throttling requests)
- For Ollama, insufficient local compute resources
Next steps
- Browser Configuration: Configure browser modes, locales, and display settings
- Docker Setup: Return to the main Docker setup guide

