Skyvern uses LLMs to analyze screenshots and decide what actions to take. You’ll need to configure at least one LLM provider before running tasks.

How Skyvern uses LLMs

Skyvern makes multiple LLM calls per task step:
  1. Screenshot analysis: Identify interactive elements on the page
  2. Action planning: Decide what to click, type, or extract
  3. Result extraction: Parse data from the page into structured output
A task that runs for 10 steps makes roughly 30+ LLM calls. Choose your provider and model tier with this in mind. For most deployments, configure a single provider using LLM_KEY. Skyvern also supports a SECONDARY_LLM_KEY for lighter tasks to reduce costs.
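As a back-of-the-envelope sketch of what this means for cost, assuming the three-calls-per-step figure above (the token counts and per-token price below are illustrative placeholders, not measured Skyvern values):

```python
# Rough cost estimate for a Skyvern task, assuming ~3 LLM calls per step.
# tokens_per_call and price_per_1k_tokens are illustrative placeholders --
# substitute your provider's real numbers.

def estimate_task_cost(steps: int,
                       calls_per_step: int = 3,
                       tokens_per_call: int = 4_000,
                       price_per_1k_tokens: float = 0.005) -> tuple[int, float]:
    """Return (total LLM calls, estimated cost in dollars)."""
    calls = steps * calls_per_step
    cost = calls * tokens_per_call / 1000 * price_per_1k_tokens
    return calls, cost

calls, cost = estimate_task_cost(steps=10)
print(calls, round(cost, 2))  # 30 calls, roughly $0.60 under these assumptions
```

Even at modest per-call prices, call volume adds up quickly, which is why a cheaper SECONDARY_LLM_KEY for lightweight operations can meaningfully reduce spend.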

OpenAI

The most common choice. Requires an API key from platform.openai.com.
.env
ENABLE_OPENAI=true
OPENAI_API_KEY=sk-...
LLM_KEY=OPENAI_GPT4O

Available models

| LLM_KEY | Model | Notes |
| --- | --- | --- |
| OPENAI_GPT4O | gpt-4o | Recommended for most use cases |
| OPENAI_GPT4O_MINI | gpt-4o-mini | Cheaper, less capable |
| OPENAI_GPT4_1 | gpt-4.1 | Latest GPT-4 family |
| OPENAI_GPT4_1_MINI | gpt-4.1-mini | Cheaper GPT-4.1 variant |
| OPENAI_O3 | o3 | Reasoning model |
| OPENAI_O3_MINI | o3-mini | Cheaper reasoning model |
| OPENAI_GPT4_TURBO | gpt-4-turbo | Previous generation |
| OPENAI_GPT4V | gpt-4-turbo | Legacy alias for gpt-4-turbo |

Optional settings

.env
# Use a custom API endpoint (for proxies or compatible services)
OPENAI_API_BASE=https://your-proxy.com/v1

# Specify organization ID
OPENAI_ORGANIZATION=org-...

Anthropic

Claude models from anthropic.com.
.env
ENABLE_ANTHROPIC=true
ANTHROPIC_API_KEY=sk-ant-...
LLM_KEY=ANTHROPIC_CLAUDE3.5_SONNET

Available models

| LLM_KEY | Model | Notes |
| --- | --- | --- |
| ANTHROPIC_CLAUDE4.5_SONNET | claude-4.5-sonnet | Latest Sonnet |
| ANTHROPIC_CLAUDE4.5_OPUS | claude-4.5-opus | Most capable |
| ANTHROPIC_CLAUDE4_SONNET | claude-4-sonnet | Claude 4 |
| ANTHROPIC_CLAUDE4_OPUS | claude-4-opus | Claude 4 Opus |
| ANTHROPIC_CLAUDE3.7_SONNET | claude-3-7-sonnet | Previous generation |
| ANTHROPIC_CLAUDE3.5_SONNET | claude-3-5-sonnet | Previous generation |
| ANTHROPIC_CLAUDE3.5_HAIKU | claude-3-5-haiku | Cheap and fast |

Azure OpenAI

Microsoft-hosted OpenAI models. Requires an Azure subscription with OpenAI service provisioned.
.env
ENABLE_AZURE=true
LLM_KEY=AZURE_OPENAI
AZURE_DEPLOYMENT=your-deployment-name
AZURE_API_KEY=your-azure-api-key
AZURE_API_BASE=https://your-resource.openai.azure.com/
AZURE_API_VERSION=2024-08-01-preview

Setup steps

  1. Create an Azure OpenAI resource in the Azure Portal
  2. Open the Azure AI Foundry portal from your resource’s overview page
  3. Go to Shared Resources → Deployments
  4. Click Deploy Model → Deploy Base Model → select GPT-4o or GPT-4
  5. Note the Deployment Name. Use this for AZURE_DEPLOYMENT
  6. Copy your API key and endpoint from the Azure Portal
The AZURE_DEPLOYMENT is the name you chose when deploying the model, not the model name itself.

Google Gemini

Skyvern supports Gemini through two paths: the Gemini API (simpler, uses an API key) and Vertex AI (enterprise, uses a GCP service account).

Gemini API

The quickest way to use Gemini. Get an API key from Google AI Studio.
.env
ENABLE_GEMINI=true
GEMINI_API_KEY=your-gemini-api-key
LLM_KEY=VERTEX_GEMINI_2.5_FLASH

Vertex AI

For enterprise deployments through Vertex AI. Requires a GCP project with Vertex AI enabled.
.env
ENABLE_VERTEX_AI=true
LLM_KEY=VERTEX_GEMINI_3.0_FLASH
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
GCP_PROJECT_ID=your-gcp-project-id
GCP_REGION=us-central1
Vertex AI setup steps:
  1. Create a GCP project with billing enabled
  2. Enable the Vertex AI API in your project
  3. Create a service account with the Vertex AI User role
  4. Download the service account JSON key file
  5. Set GOOGLE_APPLICATION_CREDENTIALS to the path of that file
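The steps above can also be run with the gcloud CLI. This is a provisioning sketch, not an official Skyvern script; the service-account name skyvern-vertex and the key path are placeholders:

```shell
# Hypothetical provisioning sketch for the Vertex AI steps above.
PROJECT_ID=your-gcp-project-id

# Step 2: enable the Vertex AI API
gcloud services enable aiplatform.googleapis.com --project="$PROJECT_ID"

# Step 3: create a service account and grant the Vertex AI User role
gcloud iam service-accounts create skyvern-vertex --project="$PROJECT_ID"
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="serviceAccount:skyvern-vertex@${PROJECT_ID}.iam.gserviceaccount.com" \
  --role="roles/aiplatform.user"

# Steps 4-5: download a JSON key and point GOOGLE_APPLICATION_CREDENTIALS at it
gcloud iam service-accounts keys create /path/to/service-account.json \
  --iam-account="skyvern-vertex@${PROJECT_ID}.iam.gserviceaccount.com"
```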

Available models

| LLM_KEY | Model | Notes |
| --- | --- | --- |
| VERTEX_GEMINI_3.0_FLASH | gemini-3-flash-preview | Recommended |
| VERTEX_GEMINI_2.5_PRO | gemini-2.5-pro | Stable |
| VERTEX_GEMINI_2.5_FLASH | gemini-2.5-flash | Cheaper, faster |

Amazon Bedrock

Run Anthropic Claude through your AWS account.
.env
ENABLE_BEDROCK=true
LLM_KEY=BEDROCK_ANTHROPIC_CLAUDE3.5_SONNET
AWS_REGION=us-west-2
AWS_ACCESS_KEY_ID=AKIA...
AWS_SECRET_ACCESS_KEY=...

Setup steps

  1. Create an IAM user with AmazonBedrockFullAccess policy
  2. Generate access keys for the IAM user
  3. In the Bedrock console, go to Model Access
  4. Enable access to Claude 3.5 Sonnet

Available models

| LLM_KEY | Model |
| --- | --- |
| BEDROCK_ANTHROPIC_CLAUDE3.5_SONNET | Claude 3.5 Sonnet v2 |
| BEDROCK_ANTHROPIC_CLAUDE3.5_SONNET_V1 | Claude 3.5 Sonnet v1 |
| BEDROCK_ANTHROPIC_CLAUDE3.7_SONNET_INFERENCE_PROFILE | Claude 3.7 Sonnet (cross-region) |
| BEDROCK_ANTHROPIC_CLAUDE4_SONNET_INFERENCE_PROFILE | Claude 4 Sonnet (cross-region) |
| BEDROCK_ANTHROPIC_CLAUDE4.5_SONNET_INFERENCE_PROFILE | Claude 4.5 Sonnet (cross-region) |
Bedrock inference profile keys (*_INFERENCE_PROFILE) use cross-region inference and require AWS_REGION only. No access keys needed if running on an IAM-authenticated instance.

Ollama (Local Models)

Run open-source models locally with Ollama. No API costs, but requires sufficient local compute.
.env
ENABLE_OLLAMA=true
LLM_KEY=OLLAMA
OLLAMA_MODEL=llama3.1
OLLAMA_SERVER_URL=http://host.docker.internal:11434
OLLAMA_SUPPORTS_VISION=false

Setup steps

  1. Install Ollama
  2. Pull a model: ollama pull llama3.1
  3. Start Ollama: ollama serve
  4. Configure Skyvern to connect
Most Ollama models don’t support vision. Set OLLAMA_SUPPORTS_VISION=false. Without vision, Skyvern relies on DOM analysis instead of screenshot analysis, which may reduce accuracy on complex pages.

Docker networking

When running Skyvern in Docker and Ollama on the host:
| Host OS | OLLAMA_SERVER_URL |
| --- | --- |
| macOS/Windows | http://host.docker.internal:11434 |
| Linux | http://172.17.0.1:11434 (Docker bridge IP) |
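The host-OS distinction above can be sketched as a small helper (illustrative only; on Linux the bridge IP can differ if you use a custom Docker network):

```python
# Illustrative helper: choose OLLAMA_SERVER_URL for Skyvern-in-Docker
# based on the host OS. Values match the table above.
import platform

def ollama_server_url() -> str:
    if platform.system() == "Linux":
        # Docker's default bridge gateway on Linux
        return "http://172.17.0.1:11434"
    # Docker Desktop on macOS/Windows resolves the host via this alias
    return "http://host.docker.internal:11434"
```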

OpenAI-Compatible Endpoints

Connect to any service that implements the OpenAI API format, including LiteLLM, LocalAI, vLLM, and text-generation-inference.
.env
ENABLE_OPENAI_COMPATIBLE=true
OPENAI_COMPATIBLE_MODEL_NAME=llama3.1
OPENAI_COMPATIBLE_API_KEY=sk-test
OPENAI_COMPATIBLE_API_BASE=http://localhost:4000/v1
LLM_KEY=OPENAI_COMPATIBLE
This is useful for:
  • Running local models with a unified API
  • Using LiteLLM as a proxy to switch between providers
  • Connecting to self-hosted inference servers
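As one concrete example of the LiteLLM-as-proxy pattern, a minimal LiteLLM config sketch (the model names are examples, not requirements) that routes the llama3.1 alias to a local Ollama server:

```yaml
# litellm_config.yaml -- minimal sketch; model names are illustrative.
model_list:
  - model_name: llama3.1            # what Skyvern sends as OPENAI_COMPATIBLE_MODEL_NAME
    litellm_params:
      model: ollama/llama3.1        # LiteLLM's provider/model string
      api_base: http://localhost:11434
```

Started with `litellm --config litellm_config.yaml --port 4000`, this serves an OpenAI-compatible API at http://localhost:4000/v1, matching the OPENAI_COMPATIBLE_API_BASE in the example above.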

OpenRouter

Access multiple models through a single API at openrouter.ai.
.env
ENABLE_OPENROUTER=true
LLM_KEY=OPENROUTER
OPENROUTER_API_KEY=sk-or-...
OPENROUTER_MODEL=mistralai/mistral-small-3.1-24b-instruct

Groq

Inference on open-source models at groq.com.
.env
ENABLE_GROQ=true
LLM_KEY=GROQ
GROQ_API_KEY=gsk_...
GROQ_MODEL=llama-3.1-8b-instant
Groq specializes in fast inference for open-source models. Response times are typically much faster than other providers, but model selection is limited.

Using multiple models

Primary and secondary models

Configure a cheaper model for lightweight operations:
.env
# Main model for complex decisions
LLM_KEY=OPENAI_GPT4O

# Cheaper model for simple tasks like dropdown selection
SECONDARY_LLM_KEY=OPENAI_GPT4O_MINI

Task-specific models

For fine-grained control, you can override models for specific operations:
.env
# Model for data extraction from pages (defaults to LLM_KEY if not set)
EXTRACTION_LLM_KEY=ANTHROPIC_CLAUDE3.5_SONNET

# Model for generating code/scripts in code blocks (defaults to LLM_KEY if not set)
SCRIPT_GENERATION_LLM_KEY=OPENAI_GPT4O
Most deployments don’t need task-specific models. Start with LLM_KEY and SECONDARY_LLM_KEY.

Troubleshooting

“To enable svg shape conversion, please set the Secondary LLM key”

Some operations require a secondary model. Set SECONDARY_LLM_KEY in your environment:
.env
SECONDARY_LLM_KEY=OPENAI_GPT4O_MINI

“Context window exceeded”

The page content is too large for the model’s context window. Options:
  • Use a model with a larger context (GPT-4o supports 128k tokens)
  • Simplify your prompt to require less page analysis
  • Start from a more specific URL with less content
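A rough way to gauge whether a page will fit is the common ~4 characters-per-token heuristic. This is an approximation, not the model's actual tokenizer:

```python
# Heuristic token estimate: ~4 characters per token for typical English text.
# An approximation only -- real tokenizers vary by model and content.

def fits_in_context(page_text: str,
                    context_window: int = 128_000,
                    reserved_for_output: int = 4_000) -> bool:
    """Roughly check whether page_text fits in the model's context window."""
    estimated_tokens = len(page_text) // 4
    return estimated_tokens <= context_window - reserved_for_output

print(fits_in_context("x" * 1_000_000))  # ~250k estimated tokens: False
```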

“LLM caller not found”

The configured LLM_KEY doesn’t match any enabled provider. Verify:
  1. The provider is enabled (ENABLE_OPENAI=true, etc.)
  2. The LLM_KEY value matches a supported model name exactly
  3. LLM_KEY values are case-sensitive: OPENAI_GPT4O, not openai_gpt4o
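A sketch of the kind of validation behind this error (the key set below is a small illustrative subset, not Skyvern's full registry, and check_llm_key is a hypothetical helper):

```python
# Illustrative subset of supported LLM keys -- not Skyvern's full registry.
SUPPORTED_LLM_KEYS = {
    "OPENAI_GPT4O", "OPENAI_GPT4O_MINI",
    "ANTHROPIC_CLAUDE3.5_SONNET", "OLLAMA", "OPENAI_COMPATIBLE",
}

def check_llm_key(env: dict) -> str:
    """Hypothetical check: LLM_KEY must be a known key and its provider enabled."""
    key = env.get("LLM_KEY", "")
    if key not in SUPPORTED_LLM_KEYS:
        raise ValueError(f"LLM caller not found: {key!r} (keys are case-sensitive)")
    if key.startswith("OPENAI_GPT") and env.get("ENABLE_OPENAI") != "true":
        raise ValueError("LLM_KEY points at OpenAI but ENABLE_OPENAI is not true")
    return key

check_llm_key({"LLM_KEY": "OPENAI_GPT4O", "ENABLE_OPENAI": "true"})  # passes
```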

Container logs show authentication errors

Check your API key configuration:
  • Ensure the key is set correctly without extra whitespace
  • Verify the key hasn’t expired or been revoked
  • For Azure, ensure AZURE_API_BASE includes the full URL with https://

Slow response times

LLM calls typically take 2-10 seconds. Longer times may indicate:
  • Network latency to the provider
  • Rate limiting (the provider may be throttling requests)
  • For Ollama, insufficient local compute resources

Next steps

Browser Configuration

Configure browser modes, locales, and display settings

Docker Setup

Return to the main Docker setup guide