# How Skyvern uses LLMs
Skyvern makes multiple LLM calls per task step:

- Screenshot analysis: Identify interactive elements on the page
- Action planning: Decide what to click, type, or extract
- Result extraction: Parse data from the page into structured output

The model for these calls is selected with `LLM_KEY`. Skyvern also supports a `SECONDARY_LLM_KEY` for lighter tasks to reduce costs.
## OpenAI

The most common choice. Requires an API key from platform.openai.com.
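A minimal `.env` sketch (the variable names here follow Skyvern's usual `ENABLE_*`/`LLM_KEY` conventions; confirm them against the sample `.env` shipped with your version):

```shell
# Enable the OpenAI provider and choose a model from the table below
ENABLE_OPENAI=true
LLM_KEY=OPENAI_GPT4O

# API key from platform.openai.com
OPENAI_API_KEY=sk-...
```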
### Available models
| LLM_KEY | Model | Notes |
|---|---|---|
| `OPENAI_GPT4O` | gpt-4o | Recommended for most use cases |
| `OPENAI_GPT4O_MINI` | gpt-4o-mini | Cheaper, less capable |
| `OPENAI_GPT4_1` | gpt-4.1 | Latest GPT-4 family |
| `OPENAI_GPT4_1_MINI` | gpt-4.1-mini | Cheaper GPT-4.1 variant |
| `OPENAI_O3` | o3 | Reasoning model |
| `OPENAI_O3_MINI` | o3-mini | Cheaper reasoning model |
| `OPENAI_GPT4_TURBO` | gpt-4-turbo | Previous generation |
| `OPENAI_GPT4V` | gpt-4-turbo | Legacy alias for gpt-4-turbo |
### Optional settings
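As an illustration only: a custom base URL is a typical optional setting, e.g. to route requests through a proxy. `OPENAI_API_BASE` is an assumed name here; check your version's sample `.env`:

```shell
# Optional: point OpenAI traffic at a proxy or gateway
# (variable name assumed; verify against your Skyvern .env sample)
OPENAI_API_BASE=https://api.openai.com/v1
```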
## Anthropic

Claude models from anthropic.com. Requires an Anthropic API key.
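A `.env` sketch for Anthropic; `ENABLE_ANTHROPIC` and `ANTHROPIC_API_KEY` are assumed to follow the same convention as the other providers:

```shell
# Enable Anthropic and select a model from the table below
ENABLE_ANTHROPIC=true
LLM_KEY=ANTHROPIC_CLAUDE4.5_SONNET

# API key from console.anthropic.com
ANTHROPIC_API_KEY=sk-ant-...
```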
### Available models
| LLM_KEY | Model | Notes |
|---|---|---|
| `ANTHROPIC_CLAUDE4.5_SONNET` | claude-4.5-sonnet | Latest Sonnet |
| `ANTHROPIC_CLAUDE4.5_OPUS` | claude-4.5-opus | Most capable |
| `ANTHROPIC_CLAUDE4_SONNET` | claude-4-sonnet | Claude 4 |
| `ANTHROPIC_CLAUDE4_OPUS` | claude-4-opus | Claude 4 Opus |
| `ANTHROPIC_CLAUDE3.7_SONNET` | claude-3-7-sonnet | Previous generation |
| `ANTHROPIC_CLAUDE3.5_SONNET` | claude-3-5-sonnet | Previous generation |
| `ANTHROPIC_CLAUDE3.5_HAIKU` | claude-3-5-haiku | Cheap and fast |
## Azure OpenAI

Microsoft-hosted OpenAI models. Requires an Azure subscription with the OpenAI service provisioned.
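A sketch of the Azure `.env` values. `AZURE_DEPLOYMENT` and `AZURE_API_BASE` appear elsewhere in this guide; the other `AZURE_*` names and the exact `LLM_KEY` value are assumptions to verify against your version:

```shell
# Enable Azure OpenAI (exact LLM_KEY value may differ by version)
ENABLE_AZURE=true
LLM_KEY=AZURE_OPENAI

# Values from the Azure Portal (see setup steps below)
AZURE_API_KEY=...
AZURE_API_BASE=https://your-resource.openai.azure.com/
AZURE_API_VERSION=2024-02-01
AZURE_DEPLOYMENT=your-deployment-name
```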
### Setup steps
- Create an Azure OpenAI resource in the Azure Portal
- Open the Azure AI Foundry portal from your resource’s overview page
- Go to Shared Resources → Deployments
- Click Deploy Model → Deploy Base Model → select GPT-4o or GPT-4
- Note the Deployment Name. Use this for `AZURE_DEPLOYMENT`
- Copy your API key and endpoint from the Azure Portal
The `AZURE_DEPLOYMENT` is the name you chose when deploying the model, not the model name itself.

## Google Gemini

Skyvern supports Gemini through two paths: the Gemini API (simpler, uses an API key) and Vertex AI (enterprise, uses a GCP service account).

### Gemini API

The quickest way to use Gemini. Get an API key from Google AI Studio.
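A minimal Gemini API sketch (`ENABLE_GEMINI`, `GEMINI_API_KEY`, and the `LLM_KEY` value are assumed; confirm against your sample `.env`):

```shell
# Enable the Gemini API provider
ENABLE_GEMINI=true
LLM_KEY=GEMINI_2.5_PRO

# API key from Google AI Studio
GEMINI_API_KEY=...
```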
### Vertex AI

For enterprise deployments through Vertex AI. Requires a GCP project with Vertex AI enabled.
- Create a GCP project with billing enabled
- Enable the Vertex AI API in your project
- Create a service account with the Vertex AI User role
- Download the service account JSON key file
- Set `GOOGLE_APPLICATION_CREDENTIALS` to the path of that file
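The steps above can be captured in `.env` roughly as follows (`ENABLE_VERTEX` is an assumed name; the `LLM_KEY` values come from the table below):

```shell
# Enable Vertex AI (variable name assumed; verify for your version)
ENABLE_VERTEX=true
LLM_KEY=VERTEX_GEMINI_2.5_PRO

# Service account JSON key downloaded in the steps above
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
```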
### Available models
| LLM_KEY | Model | Notes |
|---|---|---|
| `VERTEX_GEMINI_3.0_FLASH` | gemini-3-flash-preview | Recommended |
| `VERTEX_GEMINI_2.5_PRO` | gemini-2.5-pro | Stable |
| `VERTEX_GEMINI_2.5_FLASH` | gemini-2.5-flash | Cheaper, faster |
## Amazon Bedrock

Run Anthropic Claude models through your AWS account.
### Setup steps
- Create an IAM user with the `AmazonBedrockFullAccess` policy
- Generate access keys for the IAM user
- In the Bedrock console, go to Model Access
- Enable access to Claude 3.5 Sonnet
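A Bedrock `.env` sketch (`ENABLE_BEDROCK` is an assumed name; the credential variables are the standard AWS SDK ones):

```shell
# Enable Bedrock and pick a model key from the table below
ENABLE_BEDROCK=true
LLM_KEY=BEDROCK_ANTHROPIC_CLAUDE3.5_SONNET

# IAM user credentials (omit on an IAM-authenticated instance)
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
AWS_REGION=us-east-1
```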
### Available models
| LLM_KEY | Model |
|---|---|
| `BEDROCK_ANTHROPIC_CLAUDE3.5_SONNET` | Claude 3.5 Sonnet v2 |
| `BEDROCK_ANTHROPIC_CLAUDE3.5_SONNET_V1` | Claude 3.5 Sonnet v1 |
| `BEDROCK_ANTHROPIC_CLAUDE3.7_SONNET_INFERENCE_PROFILE` | Claude 3.7 Sonnet (cross-region) |
| `BEDROCK_ANTHROPIC_CLAUDE4_SONNET_INFERENCE_PROFILE` | Claude 4 Sonnet (cross-region) |
| `BEDROCK_ANTHROPIC_CLAUDE4.5_SONNET_INFERENCE_PROFILE` | Claude 4.5 Sonnet (cross-region) |
Bedrock inference profile keys (`*_INFERENCE_PROFILE`) use cross-region inference and require `AWS_REGION` only. No access keys are needed if running on an IAM-authenticated instance.

## Ollama (Local Models)

Run open-source models locally with Ollama. No API costs, but requires sufficient local compute.
### Setup steps
- Install Ollama
- Pull a model: `ollama pull llama3.1`
- Start Ollama: `ollama serve`
- Configure Skyvern to connect
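A sketch of the Ollama `.env` (only `OLLAMA_SERVER_URL` appears elsewhere in this guide; the other names are assumptions):

```shell
# Enable Ollama and point Skyvern at the local server
ENABLE_OLLAMA=true
LLM_KEY=OLLAMA
OLLAMA_SERVER_URL=http://localhost:11434
OLLAMA_MODEL=llama3.1
```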
### Docker networking

When running Skyvern in Docker and Ollama on the host:

| Host OS | `OLLAMA_SERVER_URL` |
|---|---|
| macOS/Windows | http://host.docker.internal:11434 |
| Linux | http://172.17.0.1:11434 (Docker bridge IP) |
## OpenAI-Compatible Endpoints

Connect to any service that implements the OpenAI API format, including LiteLLM, LocalAI, vLLM, and text-generation-inference. Common use cases:
- Running local models with a unified API
- Using LiteLLM as a proxy to switch between providers
- Connecting to self-hosted inference servers
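A hedged sketch; the `OPENAI_COMPATIBLE_*` names are assumptions modeled on the other providers:

```shell
# Enable the generic OpenAI-compatible provider
ENABLE_OPENAI_COMPATIBLE=true
LLM_KEY=OPENAI_COMPATIBLE

# Endpoint details for your server (LiteLLM, vLLM, LocalAI, ...)
OPENAI_COMPATIBLE_API_BASE=http://localhost:8000/v1
OPENAI_COMPATIBLE_API_KEY=...
OPENAI_COMPATIBLE_MODEL_NAME=your-model-name
```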
## OpenRouter

Access multiple models through a single API at openrouter.ai.
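A sketch (`ENABLE_OPENROUTER`, `OPENROUTER_API_KEY`, and `OPENROUTER_MODEL` are assumed names; the model slug is only an example):

```shell
ENABLE_OPENROUTER=true
LLM_KEY=OPENROUTER

# Key from openrouter.ai; model slug is illustrative
OPENROUTER_API_KEY=sk-or-...
OPENROUTER_MODEL=anthropic/claude-sonnet-4
```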
## Groq

Inference on open-source models at groq.com.
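A minimal sketch (both variable names are assumptions; check your sample `.env`):

```shell
# Enable Groq (variable names assumed)
ENABLE_GROQ=true
GROQ_API_KEY=gsk_...
```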
Groq specializes in fast inference for open-source models. Response times are typically much faster than other providers, but model selection is limited.
## Using multiple models
### Primary and secondary models

Configure a cheaper model for lightweight operations.
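For example, using model keys from the OpenAI table above:

```shell
# Full-size model for planning, cheaper model for lighter calls
LLM_KEY=OPENAI_GPT4O
SECONDARY_LLM_KEY=OPENAI_GPT4O_MINI
```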
### Task-specific models

For fine-grained control, you can override the model for specific operations; any override that is not set falls back to `LLM_KEY` and `SECONDARY_LLM_KEY`.
## Troubleshooting
### “To enable svg shape conversion, please set the Secondary LLM key”

Some operations require a secondary model. Set `SECONDARY_LLM_KEY` in your environment.
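For example:

```shell
# Any supported model key works; a cheap one keeps costs down
SECONDARY_LLM_KEY=OPENAI_GPT4O_MINI
```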
### “Context window exceeded”

The page content is too large for the model’s context window. Options:

- Use a model with a larger context (GPT-4o supports 128k tokens)
- Simplify your prompt to require less page analysis
- Start from a more specific URL with less content
### “LLM caller not found”

The configured `LLM_KEY` doesn’t match any enabled provider. Verify:

- The provider is enabled (`ENABLE_OPENAI=true`, etc.)
- The `LLM_KEY` value matches a supported model name exactly
- Model names are case-sensitive: `OPENAI_GPT4O`, not `openai_gpt4o`
### Container logs show authentication errors

Check your API key configuration:

- Ensure the key is set correctly without extra whitespace
- Verify the key hasn’t expired or been revoked
- For Azure, ensure `AZURE_API_BASE` includes the full URL with `https://`
### Slow response times

LLM calls typically take 2-10 seconds. Longer times may indicate:

- Network latency to the provider
- Rate limiting (the provider may be throttling requests)
- For Ollama, insufficient local compute resources
## Next steps

- Browser Configuration: configure browser modes, locales, and display settings
- Docker Setup: return to the main Docker setup guide

