How to Use Structured JSON Outputs With LLMs

Coding Liquids blog cover featuring Sagnik Bhattacharya for using structured JSON outputs with LLMs.

Getting an LLM to return valid, structured JSON is essential for building real applications. Without it, you are parsing free text and hoping for the best.

Modern LLMs support structured outputs natively — you define a JSON schema, and the model's output is guaranteed to match it. This guide covers the practical patterns for using structured outputs effectively.

I teach Flutter and Excel with AI — explore my courses if you want structured learning.

Quick answer

Define a JSON schema for the output you need, pass it to the model's structured output parameter, and the model returns data that matches your schema exactly. No parsing, no regex, no retry loops.

  • Your application needs to process model outputs programmatically.
  • You are building pipelines where model outputs feed into downstream systems.
  • You want to eliminate parsing failures and malformed responses.
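As a minimal sketch of the whole pattern (the `call_model` function here is a stand-in for any provider's structured-output API, and the canned response is an assumed example):

```python
import json

# Schema describing exactly what the application needs back.
schema = {
    "type": "object",
    "properties": {
        "label": {"type": "string", "enum": ["spam", "not_spam"]},
        "reason": {"type": "string"},
    },
    "required": ["label", "reason"],
}

def call_model(prompt: str, response_schema: dict) -> str:
    """Stand-in for a real API call with structured outputs enabled.

    A real provider would constrain generation to response_schema;
    here we return a canned response with the same guarantee."""
    return json.dumps({"label": "spam", "reason": "Contains a phishing link."})

raw = call_model("Classify this email: ...", schema)
result = json.loads(raw)  # guaranteed to parse: no regex, no retries
print(result["label"])
```

The point of the sketch is the last two lines: because the output is schema-constrained, `json.loads` is the entire parsing layer.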

Why structured outputs matter

Free-text responses from LLMs are fine for chat. They are terrible for applications. If your code needs to extract a list of items, a classification label, or a set of key-value pairs from the model's response, structured outputs are the reliable way to do it.

Without structured outputs, you end up writing fragile parsers, adding retry logic for malformed responses, and dealing with edge cases where the model wraps JSON in markdown code blocks or adds explanatory text.


How structured outputs work

You provide a JSON schema to the API call. The model generates tokens that are guaranteed to match that schema — the right keys, the right types, the right structure.

This is not the model 'trying harder' to format correctly. It is constrained generation — the model literally cannot produce tokens that would violate the schema.

  • Define your schema using standard JSON Schema syntax
  • Pass it in the API call (`response_format` for OpenAI, a tool definition with `input_schema` for Anthropic)
  • The response is guaranteed to be valid JSON matching your schema
  • Parse the response directly, with no structural validation needed
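A sketch of what the second step looks like for the OpenAI-style `response_format` parameter. The request is shown as a plain dict so nothing here depends on an API key; the field names follow OpenAI's JSON-schema response format, and the model name is illustrative:

```python
schema = {
    "type": "object",
    "properties": {
        "sentiment": {
            "type": "string",
            "enum": ["positive", "negative", "neutral"],
        },
    },
    "required": ["sentiment"],
    "additionalProperties": False,
}

# The body you would send with the chat completion request.
request_body = {
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Classify: 'Great product!'"}],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "sentiment_result",
            "strict": True,  # enforce the schema exactly
            "schema": schema,
        },
    },
}
print(request_body["response_format"]["type"])
```

Other providers take the same schema in a differently shaped envelope, as discussed below.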

Designing good schemas

Keep schemas simple. The more complex the schema, the harder the model has to work to fill it correctly, and the more likely you are to get semantically wrong (even if structurally valid) responses.

Use clear field names that match what you are asking for. A field called 'category' with an enum of specific values is better than a field called 'type' with a free-text string.

# Example: structured output schema
schema = {
    "type": "object",
    "properties": {
        "summary": {"type": "string"},
        "sentiment": {"type": "string", "enum": ["positive", "negative", "neutral"]},
        "key_topics": {
            "type": "array",
            "items": {"type": "string"}
        },
        "confidence": {"type": "number", "minimum": 0, "maximum": 1}
    },
    "required": ["summary", "sentiment", "key_topics"]
}
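A response conforming to the schema above can then be consumed directly. The JSON string below is an assumed example of what a model might return:

```python
import json

raw = '''{
  "summary": "Customer reports a failed payment after upgrading.",
  "sentiment": "negative",
  "key_topics": ["billing", "upgrade"],
  "confidence": 0.87
}'''

data = json.loads(raw)          # guaranteed to parse
assert data["sentiment"] in ("positive", "negative", "neutral")
print(data["key_topics"])       # ['billing', 'upgrade']
```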

Provider-specific patterns

OpenAI uses `response_format` with `json_schema` type. Anthropic uses tool definitions with `input_schema`. Google uses `response_schema` in generation config. The concept is the same across providers — only the API shape differs.

For Anthropic specifically, you define a 'tool' whose input schema matches your desired output structure, then extract the tool call arguments as your structured data.
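A sketch of that tool-based pattern. The tool name and schema are illustrative; `input_schema` is how Anthropic's tools API declares the arguments the model must supply, and the simulated content list mimics the shape of a Messages API response:

```python
# Tool definition whose input schema *is* the desired output structure.
extraction_tool = {
    "name": "record_classification",
    "description": "Record the classification of a support ticket.",
    "input_schema": {
        "type": "object",
        "properties": {
            "category": {"type": "string", "enum": ["billing", "technical", "other"]},
            "priority": {"type": "integer", "minimum": 1, "maximum": 5},
        },
        "required": ["category", "priority"],
    },
}

def extract_structured_data(response_content: list) -> dict:
    """Pull the tool call arguments out of a response's content blocks."""
    for block in response_content:
        if block.get("type") == "tool_use":
            return block["input"]
    raise ValueError("model did not call the tool")

# Simulated response content blocks.
simulated = [{"type": "tool_use", "name": "record_classification",
              "input": {"category": "billing", "priority": 2}}]
print(extract_structured_data(simulated))
```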

When to add validation anyway

Structured outputs guarantee structural correctness, not semantic correctness. The JSON will be valid and match your schema, but the values might be wrong, incomplete, or nonsensical.

Add validation for business-critical fields — check that numbers are in expected ranges, that required text fields are not empty strings, and that enum values make sense in context.
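A small example of that second validation layer. The field names (summary, priority, category) and the business rules are illustrative:

```python
def validate_ticket(data: dict) -> list[str]:
    """Business-logic checks on top of a structurally valid response."""
    problems = []
    if not data.get("summary", "").strip():
        problems.append("summary is empty")
    if not 1 <= data.get("priority", 0) <= 5:
        problems.append("priority out of range")
    if data.get("category") == "other" and data.get("priority", 0) >= 4:
        problems.append("high-priority tickets should not be 'other'")
    return problems

print(validate_ticket({"summary": "", "priority": 7, "category": "other"}))
```

Every check here would pass schema validation, which is exactly why it belongs in a separate layer.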

Worked example: document classification pipeline

You build a pipeline that classifies support tickets. Each ticket goes through an LLM with a structured output schema requiring a category (from a fixed list), priority (1-5), and a one-sentence summary. The structured output guarantees every ticket gets a valid classification that your downstream system can process without parsing errors.
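A sketch of the schema such a pipeline might use; the category values are illustrative:

```python
ticket_schema = {
    "type": "object",
    "properties": {
        "category": {
            "type": "string",
            "enum": ["billing", "technical", "account", "feature_request", "other"],
        },
        "priority": {"type": "integer", "minimum": 1, "maximum": 5},
        "summary": {
            "type": "string",
            "description": "One-sentence summary of the ticket.",
        },
    },
    "required": ["category", "priority", "summary"],
    "additionalProperties": False,
}

# Every classified ticket is guaranteed to have exactly these fields,
# so the downstream system can index on category and priority directly.
print(sorted(ticket_schema["required"]))
```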

Common mistakes

  • Using free-text responses and parsing them with regex when structured outputs are available.
  • Creating overly complex schemas that confuse the model semantically.
  • Assuming structural validity means semantic correctness — still validate business logic.

When to use something else

If you need the model to call external tools rather than just return structured data, see tool calling in AI apps. For evaluating whether the structured outputs are actually correct, see evaluating AI outputs.

How to apply this in a real AI project

Structured JSON outputs become much more useful once they are tied to the rest of the workflow around them. In real work, the result depends on model selection, prompt design, tool integration, evaluation, and the operational reality of shipping AI features, not only on following one local tip correctly.

That is why the biggest win rarely comes from one clever move in isolation. It comes from making the surrounding process easier to review, easier to repeat, and easier to hand over when another person inherits the workbook or codebase later.

  • Test with realistic inputs before shipping, not just the examples that inspired the idea.
  • Keep the human review step visible so the workflow stays trustworthy as it scales.
  • Measure what matters for your use case instead of relying on general benchmarks.

How to extend the workflow after this guide

Once the core technique works, the next leverage usually comes from standardising it. That might mean naming inputs more clearly, keeping one review checklist, or pairing this page with neighbouring guides so the process becomes repeatable rather than person-dependent.

The follow-on guides below are the most natural next steps after this one. They help move the reader from one useful page into a stronger connected system.

Related guides on this site

These guides cover related patterns for building reliable AI applications.

Want to use AI tools more effectively?

My courses cover practical AI workflows, from spreadsheet automation to app development, with real projects and honest tool comparisons.

Browse AI courses