# Writing robust prompts
## Anatomy of a good prompt

Every prompt combines up to four components; two are required:

- Main goal (required) — What should happen
- Guardrails — Constraints and boundaries
- Payload — Data to use in form fields
- Completion criteria (required) — How to know when done
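For example, here is how the four components might come together for a hypothetical invoice-download task (the site and credentials are illustrative):

```text
Main goal: Log in to the billing portal and download the most recent invoice.
Guardrails: Do not change any account settings. Stay within the Billing section.
Payload: email: jane@example.com, password: {{billing_password}}
Completion criteria: COMPLETE when the invoice PDF has finished downloading.
TERMINATE if login fails or no invoices are listed.
```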
## What works

**Be explicit about completion.** The AI needs to know when to stop. “COMPLETE when you see ‘Order confirmed’” is unambiguous. Without this, the AI might keep navigating or stop too early.

**Use visual descriptions.** “Click the blue Submit button at the bottom of the form” works better than “click Submit”—there might be multiple buttons, and visual context helps the AI pick the right one. Describe position, color, icons, and surrounding text.

**Start general, then refine.** Begin with simple prompts and add specifics based on failures. Over-specified prompts are brittle; the AI handles variation better when you describe goals rather than exact steps.

**Include termination criteria.** Tell the AI when to give up: “TERMINATE if login fails or account is locked.” Without this, the AI might keep trying forever or fail silently.

**Reference visual indicators.** “The invoice download link has a PDF icon next to it” helps the AI identify the right element when there are multiple links with similar text.

## What doesn’t work

**Vague goals.** “Do the thing on the website” gives the AI nothing to work with. Be specific about what outcome you want.

**Missing completion criteria.** Without knowing when to stop, the AI keeps navigating indefinitely or terminates at arbitrary points.

**Action lists without context.** “Click button A, then B, then C” breaks when the layout changes or buttons move. Describe the goal instead, and let the AI figure out the steps.

**HTML element names.** “Click the `<button id='submit'>` element” assumes IDs are stable and visible to the AI. They often change between deployments, and the AI works from visual screenshots, not DOM structure.

**Assuming page state.** The AI doesn’t know what page you expect to be on. Always describe what you expect to see, not just what to do.
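To make the contrast concrete, compare a vague prompt with a robust rewrite (the site details are hypothetical):

```text
Weak:   Go to the store and reorder.

Better: Go to the Orders page and reorder the most recent order. Click the blue
        "Reorder" button next to the top entry, then click "Place order".
        COMPLETE when you see "Order confirmed".
        TERMINATE if the order history is empty or checkout shows an error.
```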
## Autocomplete and dropdown fields
Address fields, city selectors, and other autocomplete inputs are notoriously tricky. The AI might type a value, see suggestions appear, then second-guess itself—erasing and retyping, cycling through options like “Toronto” → “Ontario” → “Canada” → “Toronto” again before settling.

To reduce cycling behavior:

- **Be explicit about the exact value.** Instead of “enter the city,” say “type ‘Toronto’ and select it from the dropdown.”
- **Tell it to stop after selection.** Add “Once you’ve selected the value, move to the next field—don’t modify it further.”
- **Specify which suggestion to pick.** “Select the first suggestion that matches” or “Choose the option that shows the full address” reduces ambiguity.

## Choosing the right block type
Skyvern offers three block types with different tradeoffs between reliability and flexibility. The more interpretation you ask the AI to do, the more room for unexpected behavior.

**Action blocks** are the most deterministic. You tell Skyvern exactly what to do: “Click the Submit button.” There’s no interpretation—it either finds the element and clicks it, or fails. Use these when you know exactly what action is needed and the page structure is predictable.

**Navigation blocks** give Skyvern a goal: “Fill out the registration form.” The AI figures out which fields to fill and in what order. This handles variation in form layouts—fields might be in different positions, have different labels, or be split across tabs—but the AI can misinterpret ambiguous forms or fill fields you didn’t intend.

**Task blocks** (Navigation V2) handle multi-step goals: “Log in, navigate to settings, update profile.” Maximum flexibility for complex workflows, but more room for the AI to take unexpected paths. A task block might navigate through menus you didn’t anticipate or skip steps it deems unnecessary.

**Rule of thumb:** Start with the most deterministic block that can accomplish your goal. If a single click solves the problem, don’t use a Task block—you’re adding unnecessary interpretation where errors can creep in. Reserve Task blocks for genuinely multi-step workflows where you can’t predict the exact sequence of actions.

## Handling dynamic pages
### Lazy-loaded content
For pages that load content as you scroll, tell the AI to scroll explicitly and give it a stopping condition, for example: “Scroll down to load more results. COMPLETE when no new items appear after scrolling.”

### Popups and modals
Include handling instructions in your prompt, for example: “If a cookie banner or newsletter popup appears, close it by clicking the X button or ‘Close’ link, then continue with the main task.”

### Multi-step forms
For forms spread across multiple pages, describe the overall goal and how to advance, for example: “Fill out each page of the application form, clicking ‘Next’ to advance. COMPLETE when you reach the review page and see a ‘Submit application’ button.”

## Validation strategies
Workflows can silently produce wrong results—the AI might fill a form with incorrect data, navigate to the wrong page, or extract stale information. Validation blocks let you assert conditions at critical points and fail fast when something goes wrong, rather than discovering the problem downstream.

### Add validation blocks at critical points
**After login:** Verify you’re actually authenticated before proceeding. A failed login might redirect to an error page or show a CAPTCHA, and subsequent blocks will fail in confusing ways if you don’t catch this early. A validation prompt like “COMPLETE if the account dashboard is visible; TERMINATE if a login error or CAPTCHA is shown” catches the problem at its source.

### Use extraction to verify
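For instance, if an extraction block returns the confirmation details as structured data, a plain-Python check can assert on them before the workflow continues. This is a sketch: the payload shape and field names are hypothetical.

```python
# Programmatic check of values returned by an extraction block.
# The payload shape and field names below are hypothetical; match them
# to whatever your extraction schema actually returns.

def validate_confirmation(extracted: dict, expected_email: str, expected_total: str) -> list:
    """Return a list of mismatch descriptions; an empty list means the check passed."""
    errors = []
    email = (extracted.get("email") or "").strip().lower()
    if email != expected_email.strip().lower():
        errors.append(f"email mismatch: {email!r}")
    total = (extracted.get("order_total") or "").replace("$", "").strip()
    if total != expected_total:
        errors.append(f"total mismatch: {total!r}")
    return errors

# Fail fast before downstream blocks act on bad data.
extracted = {"email": "Jane@Example.com ", "order_total": "$149.99"}
errors = validate_confirmation(extracted, "jane@example.com", "149.99")
if errors:
    raise ValueError(f"Validation failed: {errors}")
```

Raising on a mismatch stops the run immediately instead of letting later blocks act on bad data.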
For validation that requires exact matching (email addresses, confirmation numbers, prices), extract the values and compare programmatically in your code rather than relying on the AI’s judgment.

## ForLoop reliability
Loops are especially prone to cascading failures—when one iteration fails, it can leave the browser in an unexpected state that breaks subsequent iterations. For example, if iteration 3 navigates to an error page and fails, iteration 4 starts from that error page instead of the expected list view, causing it to fail too. One bad item can take down your entire loop.

### Always set continue_on_failure
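With the flag off, the first failing iteration stops the whole loop; with it on, the failure is recorded and the loop moves to the next item. A sketch of where the flag sits, assuming a YAML workflow definition (the field names here are illustrative, so check the workflow blocks reference for the exact schema):

```yaml
- block_type: for_loop
  label: process_items
  loop_blocks:
    - block_type: navigation
      label: open_item_detail
      navigation_goal: "Open the detail page for {{ current_item }}"
      continue_on_failure: true  # a failed item is logged and the loop moves on
```

Combine this with per-item extraction so you can tell afterwards which iterations failed and rerun only those.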
### Add a reset block
Each iteration should start from a known state. If iteration 3 fails on a detail page, iteration 4 needs to navigate back to the list before it can find its item. Add a reset block at the start of each loop iteration: for example, a navigation block with the goal “Return to the search results list. COMPLETE when the list of items is visible.”

### Include termination criteria per iteration
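Inside a loop, termination criteria belong in each iteration’s prompt, so an unrecoverable item is abandoned instead of retried until the step budget runs out. For example (the error text is hypothetical):

```text
Open the detail page for {{ current_item }} and extract the price.
TERMINATE if the page shows "Item not found" or fails to load after one retry,
so the loop can continue with the next item.
```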
## Session timeouts and human-in-the-loop
Browser sessions have a default timeout. If you’re using human-in-the-loop workflows where a person must approve an action before proceeding, the session can expire while waiting for approval—especially if the approver is slow or in a different timezone.

**Symptoms:** Runs that worked during testing fail in production with session expiry errors. The workflow waits for human approval, but by the time approval comes, the browser session has timed out.

**Solutions:**

- **Increase session timeout** — Set a longer `timeout` when creating browser sessions for workflows that include human approval steps
- **Remove the human step for time-sensitive flows** — If session expiry is causing failures, consider making the workflow fully automated and reviewing results after completion
- **Split into multiple workflows** — Run the pre-approval steps, wait for human approval outside Skyvern, then trigger a second workflow for post-approval steps using a browser profile to maintain login state
For workflows with predictable human approval times, schedule runs to arrive in the approver’s inbox during working hours when they can act quickly.
## Keyboard actions and workarounds
Direct keyboard shortcuts (Ctrl+C, Alt+Tab, Esc, etc.) are not currently supported. Here are workarounds for common scenarios.
| Scenario | Workaround |
|---|---|
| Copy text | Use extraction block to get the text value |
| Paste into field | Pass the value as a parameter: `{{value_to_paste}}` |
| Press Escape to close modal | Click the X button or “Close” link instead. Add to prompt: “If a modal appears, close it by clicking the X button in the top right corner” |
| Keyboard navigation (Tab, arrows) | Describe the click target visually instead |
| Ctrl+S to save | Click the Save button in the UI, or look for auto-save indicators |
| Hotkey-only features | Look for menu alternatives, toolbar buttons, or right-click context menus |
| Tab between fields | AI handles field navigation automatically—no workaround needed |
| Enter to submit | Explicitly click the Submit button rather than relying on Enter key |
If there’s no UI alternative:

- Use the Code block with custom Playwright scripts—you can call `page.keyboard.press('Escape')` directly
- Check for an API — Many web apps have APIs that bypass the UI entirely
- Use browser profiles with pre-configured settings or extensions that add UI buttons for keyboard-only features
## Troubleshooting workflow

When a run fails:

- Check the recording — Watch what actually happened
- Review screenshots — See the state at each step
- Check LLM reasoning — Understand why the AI made each decision
- Compare parameters — Verify inputs match expectations
### Common fixes
| Symptom | Likely cause | Fix |
|---|---|---|
| Completed too early | Ambiguous completion criteria | Add specific visual indicators. Include “click Submit AND see confirmation” |
| Completed without submitting | Submit button not included in goal | Explicitly state that clicking Submit is part of the task |
| Didn’t complete | Missing completion criteria | Add explicit COMPLETE condition |
| Wrong element clicked | Multiple similar elements | Add distinguishing details: position, color, surrounding text |
| Form field skipped | Field not visible or labeled differently | Describe the field’s visual position |
| Loop stuck | No reset between iterations | Add reset block at loop start |
| Autocomplete cycling | AI second-guessing dropdown selections | Add “select and move on—don’t modify” to prompt |
| Task navigates too far | Task block continues until goal complete | Use Action/Navigation block for single-step actions |
| Session expired | Human-in-the-loop timeout | Increase timeout or split into multiple workflows |
| Max steps reached | Complex page or AI retrying actions | Increase max_steps_per_run or simplify the goal |
### When to adjust prompts vs file a bug

Adjust your prompt if:

- The AI misunderstood what to do
- Completion criteria were ambiguous
- Visual descriptions were unclear

File a bug if:

- Elements are visibly present but not detected
- Actions execute on wrong elements consistently
- Standard UI patterns (dropdowns, checkboxes) don’t work
## Next steps

- **Error Handling** — Map errors to custom codes for programmatic handling
- **Workflow Blocks Reference** — Detailed documentation for validation and other blocks

