Skyvern uses LLMs to analyze screenshots and decide what actions to take. You’ll need to configure at least one LLM provider before running tasks.

How Skyvern uses LLMs

Skyvern makes multiple LLM calls per task step:
  1. Screenshot analysis: Identify interactive elements on the page
  2. Action planning: Decide what to click, type, or extract
  3. Result extraction: Parse data from the page into structured output
A task that runs for 10 steps can make 30 or more LLM calls, so choose your provider and model tier with this in mind. For most deployments, configure a single provider using LLM_KEY. Skyvern also supports a SECONDARY_LLM_KEY that routes lighter operations to a cheaper model to reduce costs.
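The arithmetic above can be made explicit with a small helper (an illustrative estimate based on the three-calls-per-step breakdown, not an exact accounting of Skyvern's internals):

```python
def estimate_llm_calls(steps: int, calls_per_step: int = 3) -> int:
    """Rough LLM call count for a task: screenshot analysis,
    action planning, and result extraction on each step."""
    return steps * calls_per_step

# A 10-step task makes on the order of 30 calls.
print(estimate_llm_calls(10))
```

Multiply by your provider's per-call cost to budget a task run before committing to a model tier.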

Quick Start Recommendations

Best models for production (2025):
| Provider | Primary Model | Secondary Model | Notes |
|---|---|---|---|
| Anthropic | ANTHROPIC_CLAUDE4.5_OPUS | ANTHROPIC_CLAUDE4.5_SONNET | Most capable |
| OpenAI | OPENAI_GPT5 | OPENAI_GPT5_MINI | Latest |
| Google | GEMINI_3_PRO | GEMINI_3.0_FLASH | Latest |
| AWS Bedrock | BEDROCK_ANTHROPIC_CLAUDE4.5_OPUS_INFERENCE_PROFILE | BEDROCK_ANTHROPIC_CLAUDE4.5_SONNET_INFERENCE_PROFILE | Latest Claude |
New in 2025: GPT-5 series, Claude 4.6 Opus, Gemini 3, Amazon Nova, and many new open-source models via Novita and VolcEngine.

OpenAI

The most common choice. Requires an API key from platform.openai.com.
.env
ENABLE_OPENAI=true
OPENAI_API_KEY=sk-...
LLM_KEY=OPENAI_GPT4O

Available models

| LLM_KEY | Notes |
|---|---|
| **GPT-5 Series** | |
| OPENAI_GPT5 | Recommended for most complex tasks |
| OPENAI_GPT5_MINI | |
| OPENAI_GPT5_MINI_FLEX | Flex service tier, 15-minute timeout |
| OPENAI_GPT5_NANO | |
| OPENAI_GPT5_1 | |
| OPENAI_GPT5_2 | |
| OPENAI_GPT5_4 | |
| **GPT-4 Series** | |
| OPENAI_GPT4O | |
| OPENAI_GPT4O_MINI | |
| OPENAI_GPT4_1 | |
| OPENAI_GPT4_1_MINI | |
| OPENAI_GPT4_1_NANO | |
| OPENAI_GPT4_5 | |
| OPENAI_GPT4_TURBO | Legacy |
| OPENAI_GPT4V | Legacy alias |
| **O-Series (Reasoning)** | |
| OPENAI_O4_MINI | Vision support |
| OPENAI_O3 | Vision support |
| OPENAI_O3_MINI | No vision |

Optional settings

.env
# Use a custom API endpoint (for proxies or compatible services)
OPENAI_API_BASE=https://your-proxy.com/v1

# Specify organization ID
OPENAI_ORGANIZATION=org-...

Anthropic

Claude models from anthropic.com.
.env
ENABLE_ANTHROPIC=true
ANTHROPIC_API_KEY=sk-ant-...
LLM_KEY=ANTHROPIC_CLAUDE3.5_SONNET

Available models

| LLM_KEY | Notes |
|---|---|
| **Claude 4.6** | |
| ANTHROPIC_CLAUDE4.6_OPUS | Newest |
| **Claude 4.5** | |
| ANTHROPIC_CLAUDE4.5_OPUS | Recommended for primary use |
| ANTHROPIC_CLAUDE4.5_SONNET | Recommended for secondary use |
| ANTHROPIC_CLAUDE4.5_HAIKU | Fastest |
| **Claude 4** | |
| ANTHROPIC_CLAUDE4_OPUS | |
| ANTHROPIC_CLAUDE4_SONNET | |
| **Claude 3.7** | |
| ANTHROPIC_CLAUDE3.7_SONNET | |
| **Claude 3.5** | |
| ANTHROPIC_CLAUDE3.5_SONNET | |
| ANTHROPIC_CLAUDE3.5_HAIKU | |
| **Claude 3 (Legacy)** | |
| ANTHROPIC_CLAUDE3_OPUS | |
| ANTHROPIC_CLAUDE3_SONNET | |
| ANTHROPIC_CLAUDE3_HAIKU | |

Azure OpenAI

Microsoft-hosted OpenAI models. Requires an Azure subscription with OpenAI service provisioned.
.env
ENABLE_AZURE=true
LLM_KEY=AZURE_OPENAI
AZURE_DEPLOYMENT=your-deployment-name
AZURE_API_KEY=your-azure-api-key
AZURE_API_BASE=https://your-resource.openai.azure.com/
AZURE_API_VERSION=2024-08-01-preview

Setup steps

  1. Create an Azure OpenAI resource in the Azure Portal
  2. Open the Azure AI Foundry portal from your resource’s overview page
  3. Go to Shared Resources → Deployments
  4. Click Deploy Model → Deploy Base Model → select GPT-4o or GPT-4
  5. Note the Deployment Name. Use this for AZURE_DEPLOYMENT
  6. Copy your API key and endpoint from the Azure Portal
The AZURE_DEPLOYMENT is the name you chose when deploying the model, not the model name itself.

Google Gemini

Skyvern supports Gemini through two paths: the Gemini API (simpler, uses an API key) and Vertex AI (enterprise, uses a GCP service account).

Gemini API

The quickest way to use Gemini. Get an API key from Google AI Studio.
.env
ENABLE_GEMINI=true
GEMINI_API_KEY=your-gemini-api-key
LLM_KEY=GEMINI_2.5_PRO

Available Gemini API models

| LLM_KEY | Notes |
|---|---|
| **Gemini 3** | |
| GEMINI_3_PRO | Recommended for primary use |
| GEMINI_3.0_FLASH | Recommended for secondary use |
| **Gemini 2.5** | |
| GEMINI_2.5_PRO | |
| GEMINI_2.5_PRO_PREVIEW | |
| GEMINI_2.5_PRO_EXP_03_25 | Experimental |
| GEMINI_2.5_FLASH | |
| GEMINI_2.5_FLASH_PREVIEW | |
| **Gemini 2.0** | |
| GEMINI_FLASH_2_0 | |
| GEMINI_FLASH_2_0_LITE | |
| **Gemini 1.5 (Legacy)** | |
| GEMINI_PRO | |
| GEMINI_FLASH | |

Vertex AI

For enterprise deployments through Vertex AI. Requires a GCP project with Vertex AI enabled.
.env
ENABLE_VERTEX_AI=true
LLM_KEY=VERTEX_GEMINI_3_PRO
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
GCP_PROJECT_ID=your-gcp-project-id
VERTEX_LOCATION=us-central1
If you’re migrating from an older Skyvern version, VERTEX_LOCATION replaces the previous GCP_REGION variable. Update your .env accordingly.
Vertex AI setup steps:
  1. Create a GCP project with billing enabled
  2. Enable the Vertex AI API in your project
  3. Create a service account with the Vertex AI User role
  4. Download the service account JSON key file
  5. Set GOOGLE_APPLICATION_CREDENTIALS to the path of that file
For global endpoint access, set VERTEX_LOCATION=global and ensure VERTEX_PROJECT_ID is set. Not all models support the global endpoint.

Available Vertex AI models

| LLM_KEY | Notes |
|---|---|
| **Gemini 3** | |
| VERTEX_GEMINI_3_PRO | Recommended for primary use |
| VERTEX_GEMINI_3.0_FLASH | Recommended for secondary use |
| **Gemini 2.5** | |
| VERTEX_GEMINI_2.5_PRO | |
| VERTEX_GEMINI_2.5_PRO_PREVIEW | |
| VERTEX_GEMINI_2.5_FLASH | |
| VERTEX_GEMINI_2.5_FLASH_LITE | |
| VERTEX_GEMINI_2.5_FLASH_PREVIEW | |
| **Gemini 2.0** | |
| VERTEX_GEMINI_FLASH_2_0 | |
| **Gemini 1.5 (Legacy)** | |
| VERTEX_GEMINI_PRO | |
| VERTEX_GEMINI_FLASH | |

Amazon Bedrock

Run Anthropic Claude through your AWS account.
.env
ENABLE_BEDROCK=true
LLM_KEY=BEDROCK_ANTHROPIC_CLAUDE3.5_SONNET
AWS_REGION=us-west-2
AWS_ACCESS_KEY_ID=AKIA...
AWS_SECRET_ACCESS_KEY=...

Setup steps

  1. Create an IAM user with AmazonBedrockFullAccess policy
  2. Generate access keys for the IAM user
  3. In the Bedrock console, go to Model Access
  4. Enable access to Claude 3.5 Sonnet

Available models

| LLM_KEY | Notes |
|---|---|
| **Amazon Nova (AWS Native)** | |
| BEDROCK_AMAZON_NOVA_PRO | |
| BEDROCK_AMAZON_NOVA_LITE | |
| **Claude 4.6** | |
| BEDROCK_ANTHROPIC_CLAUDE4.6_OPUS_INFERENCE_PROFILE | Cross-region |
| **Claude 4.5** | |
| BEDROCK_ANTHROPIC_CLAUDE4.5_OPUS_INFERENCE_PROFILE | Cross-region |
| BEDROCK_ANTHROPIC_CLAUDE4.5_SONNET_INFERENCE_PROFILE | Cross-region |
| **Claude 4** | |
| BEDROCK_ANTHROPIC_CLAUDE4_OPUS_INFERENCE_PROFILE | Cross-region |
| BEDROCK_ANTHROPIC_CLAUDE4_SONNET_INFERENCE_PROFILE | Cross-region |
| **Claude 3.7** | |
| BEDROCK_ANTHROPIC_CLAUDE3.7_SONNET_INFERENCE_PROFILE | Cross-region |
| **Claude 3.5** | |
| BEDROCK_ANTHROPIC_CLAUDE3.5_SONNET | v2 |
| BEDROCK_ANTHROPIC_CLAUDE3.5_SONNET_V1 | |
| BEDROCK_ANTHROPIC_CLAUDE3.5_SONNET_INFERENCE_PROFILE | Cross-region |
| BEDROCK_ANTHROPIC_CLAUDE3.5_HAIKU | |
| **Claude 3 (Legacy)** | |
| BEDROCK_ANTHROPIC_CLAUDE3_OPUS | |
| BEDROCK_ANTHROPIC_CLAUDE3_SONNET | |
| BEDROCK_ANTHROPIC_CLAUDE3_HAIKU | |
Bedrock inference profile keys (*_INFERENCE_PROFILE) use cross-region inference and require AWS_REGION only. No access keys needed if running on an IAM-authenticated instance.

MiniMax

MiniMax models with vision support.
.env
ENABLE_MINIMAX=true
MINIMAX_API_KEY=your-minimax-api-key
LLM_KEY=MINIMAX_M2_5

Available models

| LLM_KEY | Notes |
|---|---|
| MINIMAX_M2_5 | |
| MINIMAX_M2_5_HIGHSPEED | Faster variant |

Optional settings

.env
# Use a custom API endpoint
MINIMAX_API_BASE=https://api.minimax.io/v1

VolcEngine (ByteDance Doubao)

VolcEngine provides access to ByteDance’s Doubao models with vision support.
.env
ENABLE_VOLCENGINE=true
VOLCENGINE_API_KEY=your-volcengine-api-key
LLM_KEY=VOLCENGINE_DOUBAO_SEED_1_6

Available models

| LLM_KEY | Notes |
|---|---|
| VOLCENGINE_DOUBAO_SEED_1_6 | Recommended for general use |
| VOLCENGINE_DOUBAO_SEED_1_6_FLASH | Faster variant |
| VOLCENGINE_DOUBAO_1_5_THINKING_VISION_PRO | Reasoning model |

Optional settings

.env
# Use a custom API endpoint
VOLCENGINE_API_BASE=https://ark.cn-beijing.volces.com/api/v3

Novita

Novita AI provides access to DeepSeek, Llama, and other open-source models.
.env
ENABLE_NOVITA=true
NOVITA_API_KEY=your-novita-api-key
LLM_KEY=NOVITA_LLAMA_3_2_11B_VISION

Available models

| LLM_KEY | Notes |
|---|---|
| **DeepSeek** | |
| NOVITA_DEEPSEEK_R1 | Reasoning model |
| NOVITA_DEEPSEEK_V3 | |
| **Llama 3.3** | |
| NOVITA_LLAMA_3_3_70B | |
| **Llama 3.2** | |
| NOVITA_LLAMA_3_2_11B_VISION | Vision support |
| NOVITA_LLAMA_3_2_3B | |
| NOVITA_LLAMA_3_2_1B | |
| **Llama 3.1** | |
| NOVITA_LLAMA_3_1_405B | |
| NOVITA_LLAMA_3_1_70B | |
| NOVITA_LLAMA_3_1_8B | |
| **Llama 3** | |
| NOVITA_LLAMA_3_70B | |
| NOVITA_LLAMA_3_8B | |

Moonshot

Moonshot AI provides the Kimi series models with long context support.
.env
ENABLE_MOONSHOT=true
MOONSHOT_API_KEY=your-moonshot-api-key
LLM_KEY=MOONSHOT_KIMI_K2

Available models

| LLM_KEY | Notes |
|---|---|
| MOONSHOT_KIMI_K2 | |

Optional settings

.env
# Use a custom API endpoint
MOONSHOT_API_BASE=https://api.moonshot.cn/v1

Inception

Inception AI provides the Mercury series models.
.env
ENABLE_INCEPTION=true
INCEPTION_API_KEY=your-inception-api-key
LLM_KEY=INCEPTION_MERCURY_2

Available models

| LLM_KEY | Notes |
|---|---|
| INCEPTION_MERCURY_2 | |

Optional settings

.env
# Use a custom API endpoint
INCEPTION_API_BASE=https://api.inception.ai/v1

Ollama (Local Models)

Run open-source models locally with Ollama. No API costs, but requires sufficient local compute.
.env
ENABLE_OLLAMA=true
LLM_KEY=OLLAMA
OLLAMA_MODEL=llama3.1
OLLAMA_SERVER_URL=http://host.docker.internal:11434
OLLAMA_SUPPORTS_VISION=false

Setup steps

  1. Install Ollama
  2. Pull a model: ollama pull llama3.1
  3. Start Ollama: ollama serve
  4. Configure Skyvern to connect
Most Ollama models don’t support vision. Set OLLAMA_SUPPORTS_VISION=false. Without vision, Skyvern relies on DOM analysis instead of screenshot analysis, which may reduce accuracy on complex pages.

Docker networking

When running Skyvern in Docker and Ollama on the host:
| Host OS | OLLAMA_SERVER_URL |
|---|---|
| macOS/Windows | http://host.docker.internal:11434 |
| Linux | http://172.17.0.1:11434 (Docker bridge IP) |

OpenAI-Compatible Endpoints

Connect to any service that implements the OpenAI API format, including LiteLLM, LocalAI, vLLM, and text-generation-inference.
.env
ENABLE_OPENAI_COMPATIBLE=true
OPENAI_COMPATIBLE_MODEL_NAME=llama3.1
OPENAI_COMPATIBLE_API_KEY=sk-test
OPENAI_COMPATIBLE_API_BASE=http://localhost:4000/v1
LLM_KEY=OPENAI_COMPATIBLE
This is useful for:
  • Running local models with a unified API
  • Using LiteLLM as a proxy to switch between providers
  • Connecting to self-hosted inference servers
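All of these services accept the standard chat-completions request shape. As a sketch (using the values from the .env example above; this only builds the request, it does not send it):

```python
import json

def build_chat_request(api_base: str, api_key: str, model: str, prompt: str):
    """Assemble an OpenAI-format chat-completions request
    (URL, headers, JSON body) for any compatible endpoint."""
    url = f"{api_base.rstrip('/')}/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    return url, headers, body

url, headers, body = build_chat_request(
    "http://localhost:4000/v1", "sk-test", "llama3.1", "ping")
```

If a request shaped like this works against your endpoint (via curl or any HTTP client), Skyvern's OPENAI_COMPATIBLE configuration should too.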

OpenRouter

Access multiple models through a single API at openrouter.ai.
.env
ENABLE_OPENROUTER=true
LLM_KEY=OPENROUTER
OPENROUTER_API_KEY=sk-or-...
OPENROUTER_MODEL=mistralai/mistral-small-3.1-24b-instruct

Groq

Inference on open-source models at groq.com.
.env
ENABLE_GROQ=true
LLM_KEY=GROQ
GROQ_API_KEY=gsk_...
GROQ_MODEL=llama-3.1-8b-instant
Groq specializes in fast inference for open-source models. Response times are typically much faster than other providers, but model selection is limited.

Using multiple models

Primary and secondary models

Configure a cheaper model for lightweight operations:
.env
# Main model for complex decisions
LLM_KEY=ANTHROPIC_CLAUDE4.5_OPUS
# or: OPENAI_GPT5
# or: GEMINI_3_PRO

# Faster model for simple tasks like dropdown selection
SECONDARY_LLM_KEY=ANTHROPIC_CLAUDE4.5_SONNET
# or: OPENAI_GPT5_MINI
# or: GEMINI_3.0_FLASH
Recommended primary models (latest):
  • Anthropic Claude 4.5 Opus (ANTHROPIC_CLAUDE4.5_OPUS) - Most capable
  • OpenAI GPT-5 (OPENAI_GPT5) - Latest
  • Google Gemini 3 Pro (GEMINI_3_PRO) - Latest
Recommended secondary models (latest):
  • Claude 4.5 Sonnet (ANTHROPIC_CLAUDE4.5_SONNET) - Balanced
  • GPT-5 Mini (OPENAI_GPT5_MINI) - Faster GPT-5
  • Gemini 3.0 Flash (GEMINI_3.0_FLASH) - Faster Gemini 3

Task-specific models

For fine-grained control, you can override models for specific operations:
.env
# Model for data extraction from pages (defaults to LLM_KEY if not set)
EXTRACTION_LLM_KEY=ANTHROPIC_CLAUDE4.5_SONNET

# Model for generating code/scripts in code blocks (defaults to LLM_KEY if not set)
SCRIPT_GENERATION_LLM_KEY=OPENAI_GPT5
Most deployments don’t need task-specific models. Start with LLM_KEY and SECONDARY_LLM_KEY.

Troubleshooting

“To enable svg shape conversion, please set the Secondary LLM key”

Some operations require a secondary model. Set SECONDARY_LLM_KEY in your environment:
.env
SECONDARY_LLM_KEY=OPENAI_GPT4O_MINI

“Context window exceeded”

The page content is too large for the model’s context window. Options:
  • Use a model with larger context support (GPT-5, Gemini 2.5 Pro, or Claude 4.5 Sonnet)
  • Simplify your prompt to require less page analysis
  • Start from a more specific URL with less content

“LLM caller not found”

The configured LLM_KEY doesn’t match any enabled provider. Verify:
  1. The provider is enabled (ENABLE_OPENAI=true, etc.)
  2. The LLM_KEY value matches a supported model name exactly
  3. Model names are case-sensitive: OPENAI_GPT4O not openai_gpt4o
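These checks can be scripted against your environment. The prefix-to-flag mapping below is a hypothetical subset covering only a few providers, not an exhaustive list of Skyvern's rules:

```python
# Hypothetical mapping from LLM_KEY prefix to the enable flag it needs.
PREFIX_TO_FLAG = {
    "OPENAI_": "ENABLE_OPENAI",
    "ANTHROPIC_": "ENABLE_ANTHROPIC",
    "GEMINI_": "ENABLE_GEMINI",
    "BEDROCK_": "ENABLE_BEDROCK",
}

def check_llm_key(env: dict) -> list:
    """Return a list of likely LLM_KEY misconfigurations."""
    problems = []
    key = env.get("LLM_KEY", "")
    if not key:
        problems.append("LLM_KEY is not set")
    elif key != key.upper():
        problems.append("LLM_KEY should be uppercase (keys are case-sensitive)")
    for prefix, flag in PREFIX_TO_FLAG.items():
        if key.startswith(prefix) and env.get(flag) != "true":
            problems.append(f"{flag}=true is required for {key}")
    return problems
```

Run it on a dict of your environment variables (e.g. `check_llm_key(dict(os.environ))`) and fix anything it reports before restarting the containers.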

Container logs show authentication errors

Check your API key configuration:
  • Ensure the key is set correctly without extra whitespace
  • Verify the key hasn’t expired or been revoked
  • For Azure, ensure AZURE_API_BASE includes the full URL with https://

Slow response times

LLM calls typically take 2-10 seconds. Longer times may indicate:
  • Network latency to the provider
  • Rate limiting (the provider may be throttling requests)
  • For Ollama, insufficient local compute resources

Next steps

Browser Configuration

Configure browser modes, locales, and display settings

Docker Setup

Return to the main Docker setup guide