# How Skyvern uses LLMs
Skyvern makes multiple LLM calls per task step:

- Screenshot analysis: Identify interactive elements on the page
- Action planning: Decide what to click, type, or extract
- Result extraction: Parse data from the page into structured output

The model for these calls is selected with `LLM_KEY`. Skyvern also supports a `SECONDARY_LLM_KEY` for lighter tasks to reduce costs.
## OpenAI

The most common choice. Requires an API key from platform.openai.com.
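A minimal `.env` sketch (the variable names here follow Skyvern's usual `ENABLE_*`/`LLM_KEY` conventions; confirm them against the sample `.env` shipped with your version):

```shell
# Enable the OpenAI provider and choose a model from the table below
ENABLE_OPENAI=true
LLM_KEY=OPENAI_GPT4O

# API key from platform.openai.com
OPENAI_API_KEY=sk-...
```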
### Available models
| LLM_KEY | Model | Notes |
|---|---|---|
| `OPENAI_GPT4O` | gpt-4o | Recommended for most use cases |
| `OPENAI_GPT4O_MINI` | gpt-4o-mini | Cheaper, less capable |
| `OPENAI_GPT4_1` | gpt-4.1 | Latest GPT-4 family |
| `OPENAI_GPT4_1_MINI` | gpt-4.1-mini | Cheaper GPT-4.1 variant |
| `OPENAI_O3` | o3 | Reasoning model |
| `OPENAI_O3_MINI` | o3-mini | Cheaper reasoning model |
| `OPENAI_GPT4_TURBO` | gpt-4-turbo | Previous generation |
| `OPENAI_GPT4V` | gpt-4-turbo | Legacy alias for gpt-4-turbo |
### Optional settings
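As an illustration only: a custom base URL is a typical optional setting, e.g. to route requests through a proxy. `OPENAI_API_BASE` is an assumed name here; check your version's sample `.env`:

```shell
# Optional: point OpenAI traffic at a proxy or gateway
# (variable name assumed; verify against your Skyvern .env sample)
OPENAI_API_BASE=https://api.openai.com/v1
```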
## Anthropic

Claude models from anthropic.com. Requires an Anthropic API key.
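A `.env` sketch for Anthropic; `ENABLE_ANTHROPIC` and `ANTHROPIC_API_KEY` are assumed to follow the same convention as the other providers:

```shell
# Enable Anthropic and select a model from the table below
ENABLE_ANTHROPIC=true
LLM_KEY=ANTHROPIC_CLAUDE4.5_SONNET

# API key from console.anthropic.com
ANTHROPIC_API_KEY=sk-ant-...
```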
### Available models
| LLM_KEY | Model | Notes |
|---|---|---|
| `ANTHROPIC_CLAUDE4.5_SONNET` | claude-4.5-sonnet | Latest Sonnet |
| `ANTHROPIC_CLAUDE4.5_OPUS` | claude-4.5-opus | Most capable |
| `ANTHROPIC_CLAUDE4_SONNET` | claude-4-sonnet | Claude 4 |
| `ANTHROPIC_CLAUDE4_OPUS` | claude-4-opus | Claude 4 Opus |
| `ANTHROPIC_CLAUDE3.7_SONNET` | claude-3-7-sonnet | Previous generation |
| `ANTHROPIC_CLAUDE3.5_SONNET` | claude-3-5-sonnet | Previous generation |
| `ANTHROPIC_CLAUDE3.5_HAIKU` | claude-3-5-haiku | Cheap and fast |
## Azure OpenAI

Microsoft-hosted OpenAI models. Requires an Azure subscription with the OpenAI service provisioned.
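A sketch of the Azure `.env` values. `AZURE_DEPLOYMENT` and `AZURE_API_BASE` appear elsewhere in this guide; the other `AZURE_*` names and the exact `LLM_KEY` value are assumptions to verify against your version:

```shell
# Enable Azure OpenAI (exact LLM_KEY value may differ by version)
ENABLE_AZURE=true
LLM_KEY=AZURE_OPENAI

# Values from the Azure Portal (see setup steps below)
AZURE_API_KEY=...
AZURE_API_BASE=https://your-resource.openai.azure.com/
AZURE_API_VERSION=2024-02-01
AZURE_DEPLOYMENT=your-deployment-name
```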
### Setup steps
- Create an Azure OpenAI resource in the Azure Portal
- Open the Azure AI Foundry portal from your resource’s overview page
- Go to Shared Resources → Deployments
- Click Deploy Model → Deploy Base Model → select GPT-4o or GPT-4
- Note the Deployment Name. Use this for `AZURE_DEPLOYMENT`
- Copy your API key and endpoint from the Azure Portal
The `AZURE_DEPLOYMENT` is the name you chose when deploying the model, not the model name itself.

## Google Gemini

Skyvern supports Gemini through two paths: the Gemini API (simpler, uses an API key) and Vertex AI (enterprise, uses a GCP service account).

### Gemini API

The quickest way to use Gemini. Get an API key from Google AI Studio.
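A minimal Gemini API sketch (`ENABLE_GEMINI`, `GEMINI_API_KEY`, and the `LLM_KEY` value are assumed; confirm against your sample `.env`):

```shell
# Enable the Gemini API provider
ENABLE_GEMINI=true
LLM_KEY=GEMINI_2.5_PRO

# API key from Google AI Studio
GEMINI_API_KEY=...
```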
### Vertex AI

For enterprise deployments through Vertex AI. Requires a GCP project with Vertex AI enabled.
- Create a GCP project with billing enabled
- Enable the Vertex AI API in your project
- Create a service account with the Vertex AI User role
- Download the service account JSON key file
- Set `GOOGLE_APPLICATION_CREDENTIALS` to the path of that file
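The steps above can be captured in `.env` roughly as follows (`ENABLE_VERTEX` is an assumed name; the `LLM_KEY` values come from the table below):

```shell
# Enable Vertex AI (variable name assumed; verify for your version)
ENABLE_VERTEX=true
LLM_KEY=VERTEX_GEMINI_2.5_PRO

# Service account JSON key downloaded in the steps above
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
```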
### Available models
| LLM_KEY | Model | Notes |
|---|---|---|
| `VERTEX_GEMINI_3.0_FLASH` | gemini-3-flash-preview | Recommended |
| `VERTEX_GEMINI_2.5_PRO` | gemini-2.5-pro | Stable |
| `VERTEX_GEMINI_2.5_FLASH` | gemini-2.5-flash | Cheaper, faster |
## Amazon Bedrock

Run Anthropic Claude models through your AWS account.
### Setup steps
- Create an IAM user with the `AmazonBedrockFullAccess` policy
- Generate access keys for the IAM user
- In the Bedrock console, go to Model Access
- Enable access to Claude 3.5 Sonnet
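A Bedrock `.env` sketch (`ENABLE_BEDROCK` is an assumed name; the credential variables are the standard AWS SDK ones):

```shell
# Enable Bedrock and pick a model key from the table below
ENABLE_BEDROCK=true
LLM_KEY=BEDROCK_ANTHROPIC_CLAUDE3.5_SONNET

# IAM user credentials (omit on an IAM-authenticated instance)
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
AWS_REGION=us-east-1
```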
### Available models
| LLM_KEY | Model |
|---|---|
| `BEDROCK_ANTHROPIC_CLAUDE3.5_SONNET` | Claude 3.5 Sonnet v2 |
| `BEDROCK_ANTHROPIC_CLAUDE3.5_SONNET_V1` | Claude 3.5 Sonnet v1 |
| `BEDROCK_ANTHROPIC_CLAUDE3.7_SONNET_INFERENCE_PROFILE` | Claude 3.7 Sonnet (cross-region) |
| `BEDROCK_ANTHROPIC_CLAUDE4_SONNET_INFERENCE_PROFILE` | Claude 4 Sonnet (cross-region) |
| `BEDROCK_ANTHROPIC_CLAUDE4.5_SONNET_INFERENCE_PROFILE` | Claude 4.5 Sonnet (cross-region) |
Bedrock inference profile keys (`*_INFERENCE_PROFILE`) use cross-region inference and require `AWS_REGION` only. No access keys are needed if running on an IAM-authenticated instance.

## Ollama (Local Models)

Run open-source models locally with Ollama. No API costs, but requires sufficient local compute.
### Setup steps
- Install Ollama
- Pull a model: `ollama pull llama3.1`
- Start Ollama: `ollama serve`
- Configure Skyvern to connect
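A sketch of the Ollama `.env` (only `OLLAMA_SERVER_URL` appears elsewhere in this guide; the other names are assumptions):

```shell
# Enable Ollama and point Skyvern at the local server
ENABLE_OLLAMA=true
LLM_KEY=OLLAMA
OLLAMA_SERVER_URL=http://localhost:11434
OLLAMA_MODEL=llama3.1
```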
### Docker networking

When running Skyvern in Docker and Ollama on the host:

| Host OS | `OLLAMA_SERVER_URL` |
|---|---|
| macOS/Windows | http://host.docker.internal:11434 |
| Linux | http://172.17.0.1:11434 (Docker bridge IP) |
## OpenAI-Compatible Endpoints

Connect to any service that implements the OpenAI API format, including LiteLLM, LocalAI, vLLM, and text-generation-inference. Common use cases:
- Running local models with a unified API
- Using LiteLLM as a proxy to switch between providers
- Connecting to self-hosted inference servers
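A hedged sketch; the `OPENAI_COMPATIBLE_*` names are assumptions modeled on the other providers:

```shell
# Enable the generic OpenAI-compatible provider
ENABLE_OPENAI_COMPATIBLE=true
LLM_KEY=OPENAI_COMPATIBLE

# Endpoint details for your server (LiteLLM, vLLM, LocalAI, ...)
OPENAI_COMPATIBLE_API_BASE=http://localhost:8000/v1
OPENAI_COMPATIBLE_API_KEY=...
OPENAI_COMPATIBLE_MODEL_NAME=your-model-name
```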
## OpenRouter

Access multiple models through a single API at openrouter.ai.
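A sketch (`ENABLE_OPENROUTER`, `OPENROUTER_API_KEY`, and `OPENROUTER_MODEL` are assumed names; the model slug is only an example):

```shell
ENABLE_OPENROUTER=true
LLM_KEY=OPENROUTER

# Key from openrouter.ai; model slug is illustrative
OPENROUTER_API_KEY=sk-or-...
OPENROUTER_MODEL=anthropic/claude-sonnet-4
```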
## Groq

Inference on open-source models at groq.com.
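A minimal sketch (both variable names are assumptions; check your sample `.env`):

```shell
# Enable Groq (variable names assumed)
ENABLE_GROQ=true
GROQ_API_KEY=gsk_...
```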
Groq specializes in fast inference for open-source models. Response times are typically much faster than other providers, but model selection is limited.
## Using multiple models
### Primary and secondary models

Configure a cheaper model for lightweight operations.
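For example, using model keys from the OpenAI table above:

```shell
# Full-size model for planning, cheaper model for lighter calls
LLM_KEY=OPENAI_GPT4O
SECONDARY_LLM_KEY=OPENAI_GPT4O_MINI
```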
### Task-specific models

For fine-grained control, you can override the model for specific operations; any override that is not set falls back to `LLM_KEY` and `SECONDARY_LLM_KEY`.
## Troubleshooting
### “To enable svg shape conversion, please set the Secondary LLM key”

Some operations require a secondary model. Set `SECONDARY_LLM_KEY` in your environment.
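For example:

```shell
# Any supported model key works; a cheap one keeps costs down
SECONDARY_LLM_KEY=OPENAI_GPT4O_MINI
```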
### “Context window exceeded”

The page content is too large for the model’s context window. Options:

- Use a model with a larger context (GPT-4o supports 128k tokens)
- Simplify your prompt to require less page analysis
- Start from a more specific URL with less content
### “LLM caller not found”

The configured `LLM_KEY` doesn’t match any enabled provider. Verify:

- The provider is enabled (`ENABLE_OPENAI=true`, etc.)
- The `LLM_KEY` value matches a supported model name exactly
- Model names are case-sensitive: `OPENAI_GPT4O`, not `openai_gpt4o`
### Container logs show authentication errors

Check your API key configuration:

- Ensure the key is set correctly without extra whitespace
- Verify the key hasn’t expired or been revoked
- For Azure, ensure `AZURE_API_BASE` includes the full URL with `https://`
### Slow response times

LLM calls typically take 2-10 seconds. Longer times may indicate:

- Network latency to the provider
- Rate limiting (the provider may be throttling requests)
- For Ollama, insufficient local compute resources
## Next steps

- Browser Configuration: configure browser modes, locales, and display settings
- Docker Setup: return to the main Docker setup guide

