By default, Skyvern returns extracted data in whatever format makes sense for the task.
Pass a data_extraction_schema to enforce a specific structure using JSON Schema.
Add data_extraction_schema parameter to your task with a JSON Schema object:
Copy
Ask AI
result = await client.run_task( prompt="Get the title of the top post", url="https://news.ycombinator.com", data_extraction_schema={ "type": "object", "properties": { "title": { "type": "string", "description": "The title of the top post" } } })
The description field in each property helps Skyvern understand what data to extract. Be specific.
description fields drive extraction quality. Vague descriptions like “the data” produce vague results. Be specific: “The product price in USD, without currency symbol.”
Extract multiple items with the same structure, such as the top posts from a news site:
Copy
Ask AI
result = await client.run_task( prompt="Get the top 5 posts", url="https://news.ycombinator.com", data_extraction_schema={ "type": "object", "properties": { "posts": { "type": "array", "description": "Top 5 posts from the front page", "items": { "type": "object", "properties": { "title": { "type": "string", "description": "Post title" }, "points": { "type": "integer", "description": "Number of points" }, "url": { "type": "string", "description": "Link to the post" } } } } } })
Output (when completed):
Copy
Ask AI
{ "posts": [ { "title": "Running Claude Code dangerously (safely)", "points": 342, "url": "https://blog.emilburzo.com/2026/01/running-claude-code-dangerously-safely/" }, { "title": "Linux kernel framework for PCIe device emulation", "points": 287, "url": "https://github.com/cakehonolulu/pciem" }, { "title": "I'm addicted to being useful", "points": 256, "url": "https://www.seangoedecke.com/addicted-to-being-useful/" }, { "title": "Level S4 solar radiation event", "points": 198, "url": "https://www.swpc.noaa.gov/news/g4-severe-geomagnetic-storm" }, { "title": "WebAssembly Text Format parser performance", "points": 176, "url": "https://blog.gplane.win/posts/improve-wat-parser-perf.html" } ]}
Arrays without limits extract everything visible on the page. Specify limits in your prompt (e.g., “top 5 posts”) or the array description to control output size.