Session 3: Designing Structured Outputs
Synopsis
Shows how to request predictable responses such as JSON, tabular data, or schema-aligned outputs. Learners begin designing machine-readable LLM responses suitable for downstream automation and agent workflows.
Session Content
Session Overview
In this session, learners will move from “free-form text generation” to reliable, machine-usable outputs. This is a foundational skill for building production-grade GenAI systems and agentic workflows. Instead of asking a model to “just answer,” we will design prompts and response formats so the model returns data that can be parsed, validated, and used directly in Python applications.
By the end of this session, learners will be able to:
- Explain why structured outputs matter in real applications
- Design prompts that encourage predictable JSON responses
- Use the OpenAI Responses API with `gpt-5.4-mini` to generate structured data
- Validate and parse model outputs safely in Python
- Build small workflows that depend on structured model responses
- Recognize common failure modes and improve prompt reliability
Learning Objectives
After this session, you should be able to:
- Define structured outputs and explain their value in software systems
- Compare unstructured natural language responses vs. schema-like JSON responses
- Write prompts that specify output shape, constraints, and field expectations
- Parse model-generated JSON safely in Python
- Validate structured outputs with lightweight Python checks
- Design robust fallback strategies when model output is malformed or incomplete
Session Agenda (~45 minutes)
- 0–5 min: Introduction to structured outputs
- 5–15 min: Theory: why structure matters and common design patterns
- 15–25 min: Prompt design for structured JSON outputs
- 25–35 min: Hands-on Exercise 1: Generate structured product summaries
- 35–42 min: Hands-on Exercise 2: Build a simple support ticket classifier
- 42–45 min: Wrap-up and recap
1. Why Structured Outputs Matter
Large language models are very good at generating human-readable text. But software systems often need responses in a format that Python code can consume automatically.
Example: Unstructured output
If you ask:
“Summarize this customer review and tell me the sentiment.”
You might get:
“The customer liked the battery life but was disappointed by the slow charging speed. Overall sentiment is mixed.”
This is readable, but hard to process programmatically.
Example: Structured output
A better response for an application might be:
{
"summary": "Customer liked battery life but disliked slow charging.",
"sentiment": "mixed",
"priority": "medium"
}
This can be:
- parsed into a Python dictionary
- stored in a database
- used in a dashboard
- fed into another system or agent
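For example, the JSON response above can be loaded directly with Python's `json` module. A minimal sketch:

```python
import json

# The structured response from the model, received as a string.
raw = ('{"summary": "Customer liked battery life but disliked slow charging.", '
       '"sentiment": "mixed", "priority": "medium"}')

# json.loads turns the JSON string into a regular Python dict.
record = json.loads(raw)

# Individual fields can now drive application logic directly.
print(record["sentiment"])
print(record["priority"])
```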
Benefits of structured outputs
Structured outputs enable:
- Reliability: predictable fields and formats
- Automation: direct use in downstream code
- Validation: check presence and types of fields
- Interoperability: integrate with APIs, UIs, and workflows
- Agentic orchestration: one model step can produce inputs for the next
2. Common Structured Output Patterns
There are several useful output patterns when working with LLMs.
2.1 Flat JSON objects
Best for small tasks like classification, extraction, scoring, and summaries.
Example:
{
"category": "billing",
"urgency": "high",
"requires_human": true
}
2.2 Lists of objects
Useful when extracting multiple items from text.
Example:
{
"action_items": [
{
"task": "Email client",
"owner": "Ava",
"deadline": "2026-03-25"
},
{
"task": "Prepare budget draft",
"owner": "Ravi",
"deadline": "2026-03-27"
}
]
}
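Once parsed, a list-of-objects payload iterates naturally in Python. A short sketch using the example above:

```python
import json

# Parse the list-of-objects payload shown above.
data = json.loads("""
{
  "action_items": [
    {"task": "Email client", "owner": "Ava", "deadline": "2026-03-25"},
    {"task": "Prepare budget draft", "owner": "Ravi", "deadline": "2026-03-27"}
  ]
}
""")

# Each extracted item is a plain dict, so normal iteration applies.
for item in data["action_items"]:
    print(f"{item['owner']}: {item['task']} (due {item['deadline']})")
```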
2.3 Nested JSON
Useful for richer responses, but should be used carefully. The more complex the output, the more likely formatting issues become.
Example:
{
"document": {
"title": "Q1 Planning Notes",
"summary": "Budget and hiring priorities were discussed."
},
"risks": [
"Hiring delays",
"Vendor cost increases"
]
}
2.4 Enumerated values
If a field should come from a limited set, explicitly state allowed values.
Example:
- `sentiment`: `positive`, `negative`, `mixed`, `neutral`
- `urgency`: `low`, `medium`, `high`
This improves consistency and simplifies validation.
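Enumerated fields are also easy to check in code. A minimal validation sketch (the function name is illustrative):

```python
# Allowed values mirror what the prompt states explicitly.
ALLOWED_SENTIMENTS = {"positive", "negative", "mixed", "neutral"}
ALLOWED_URGENCY = {"low", "medium", "high"}

def check_enums(parsed: dict) -> None:
    """Raise ValueError if an enum-like field has an unexpected value."""
    if parsed.get("sentiment") not in ALLOWED_SENTIMENTS:
        raise ValueError(f"Unexpected sentiment: {parsed.get('sentiment')!r}")
    if parsed.get("urgency") not in ALLOWED_URGENCY:
        raise ValueError(f"Unexpected urgency: {parsed.get('urgency')!r}")

# Passes silently for valid values.
check_enums({"sentiment": "mixed", "urgency": "high"})
```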
3. Principles for Designing Good Structured Prompts
When asking for structured outputs, vague prompts produce vague data. Good prompts clearly define:
- the expected format
- the required fields
- field meanings
- allowed values
- rules for missing information
3.1 Be explicit about the format
Bad:
“Return the result in a structured way.”
Better:
“Return valid JSON with exactly these keys: `summary`, `sentiment`, `confidence`.”
3.2 Define constraints
Example:
- `sentiment` must be one of: `positive`, `negative`, `mixed`, `neutral`
- `confidence` must be a float from 0 to 1
- `summary` must be at most 30 words
3.3 Tell the model what to do when information is missing
Example:
“If the text does not contain enough information, use `null` for missing values.”
This is much better than allowing hallucinated values.
3.4 Avoid unnecessary complexity
Start with the smallest useful schema. Overly complex nested structures are harder to generate and validate.
3.5 Ask for JSON only
Example:
“Return only valid JSON. Do not include markdown, explanation, or surrounding text.”
This reduces parsing problems.
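Even with this instruction, models occasionally wrap JSON in markdown code fences. A defensive parsing sketch (this is a common workaround, not part of the SDK):

```python
import json

def parse_json_loosely(text: str) -> dict:
    """Strip optional markdown code fences before parsing.

    Models sometimes return ```json ... ``` even when told not to,
    so removing fences before json.loads avoids spurious failures.
    """
    cleaned = text.strip()
    if cleaned.startswith("```"):
        # Drop the opening fence line (and optional "json" tag),
        # then drop the closing fence.
        cleaned = cleaned.split("\n", 1)[1]
        cleaned = cleaned.rsplit("```", 1)[0]
    return json.loads(cleaned)

# Works for both fenced and plain JSON responses.
print(parse_json_loosely('```json\n{"ok": true}\n```'))
```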
4. Example Prompt Template for Structured Outputs
A reusable template:
You are a data extraction assistant.
Analyze the input text and return only valid JSON.
Required schema:
{
"field_1": "string",
"field_2": "string",
"field_3": "number | null"
}
Rules:
- Return only valid JSON
- Do not include markdown or commentary
- If a value is unknown, use null
- field_2 must be one of: "low", "medium", "high"
- field_1 should be concise and under 20 words
This kind of prompt works well for many extraction and classification tasks.
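The template can also be assembled programmatically so the schema and rules live in one place. A hypothetical helper (the function name and structure are illustrative, not from the session):

```python
import json

def build_extraction_prompt(schema: dict, rules: list[str], text: str) -> str:
    """Assemble a structured-output prompt from a schema dict and rule list."""
    rule_lines = "\n".join(f"- {rule}" for rule in rules)
    return (
        "You are a data extraction assistant.\n"
        "Analyze the input text and return only valid JSON.\n\n"
        f"Required schema:\n{json.dumps(schema, indent=2)}\n\n"
        f"Rules:\n{rule_lines}\n\n"
        f"Input text:\n{text}"
    )

prompt = build_extraction_prompt(
    schema={"field_1": "string", "field_2": "string", "field_3": "number | null"},
    rules=["Return only valid JSON", "If a value is unknown, use null"],
    text="Example input goes here.",
)
print(prompt)
```

Keeping the schema as a Python dict also lets you reuse it later for validation.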
5. Using the OpenAI Responses API for Structured Output
We will use:
- Python
- the OpenAI Python SDK
- `gpt-5.4-mini`
- the Responses API
5.1 Install dependencies
pip install openai
5.2 Set your API key
export OPENAI_API_KEY="your_api_key_here"
On Windows PowerShell:
$env:OPENAI_API_KEY="your_api_key_here"
6. Hands-on Exercise 1: Product Review Summarizer with JSON Output
Goal
Convert free-form customer reviews into structured JSON containing:
- short summary
- sentiment
- confidence
- key issues list
What you will learn
- How to write a structured-output prompt
- How to call the Responses API
- How to parse JSON safely
- How to validate basic fields
Step 1: Create the Python script
```python
import json
import os
from openai import OpenAI

# Initialize the OpenAI client using the API key from the environment.
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Example product review text to analyze.
review_text = """
I really like how long the battery lasts on this headset.
The sound quality is clear and comfortable for long meetings.
However, charging is much slower than I expected, and the case feels a bit flimsy.
"""

# Prompt instructing the model to return structured JSON only.
prompt = f"""
You are a product review analysis assistant.
Analyze the following customer review and return only valid JSON.

Required schema:
{{
  "summary": "string",
  "sentiment": "positive | negative | mixed | neutral",
  "confidence": "number between 0 and 1",
  "key_issues": ["string"]
}}

Rules:
- Return only valid JSON
- Do not include markdown or explanation
- summary must be at most 25 words
- sentiment must be one of: positive, negative, mixed, neutral
- confidence must be a number between 0 and 1
- key_issues should be an array of short strings
- If there are no issues, return an empty array

Customer review:
\"\"\"
{review_text}
\"\"\"
"""

# Send the request using the Responses API.
response = client.responses.create(
    model="gpt-5.4-mini",
    input=prompt
)

# The output_text property provides the model's text response.
raw_output = response.output_text
print("Raw model output:")
print(raw_output)

# Parse the JSON response safely.
try:
    parsed = json.loads(raw_output)
except json.JSONDecodeError as exc:
    print("\nFailed to parse JSON:", exc)
    raise

# Lightweight validation checks to ensure expected structure.
required_keys = {"summary", "sentiment", "confidence", "key_issues"}
missing_keys = required_keys - parsed.keys()
if missing_keys:
    raise ValueError(f"Missing required keys: {missing_keys}")

if parsed["sentiment"] not in {"positive", "negative", "mixed", "neutral"}:
    raise ValueError("Invalid sentiment value")

if not isinstance(parsed["confidence"], (int, float)):
    raise ValueError("Confidence must be numeric")

if not (0 <= parsed["confidence"] <= 1):
    raise ValueError("Confidence must be between 0 and 1")

if not isinstance(parsed["key_issues"], list):
    raise ValueError("key_issues must be a list")

print("\nValidated structured output:")
print(json.dumps(parsed, indent=2))
```
Example output
{
"summary": "Customer praised battery and comfort but criticized slow charging and case quality.",
"sentiment": "mixed",
"confidence": 0.94,
"key_issues": [
"slow charging",
"flimsy case"
]
}
Discussion
Notice how this output is much more useful than plain prose:
- `summary` can be shown in a UI
- `sentiment` can power analytics
- `confidence` can support filtering or escalation
- `key_issues` can be aggregated across many reviews
Mini Challenge
Modify the script to process three reviews in a loop and collect all parsed results into a Python list.
Suggested review samples:
```python
reviews = [
    "The laptop is fast and lightweight, but the fan gets loud under load.",
    "Excellent camera quality and battery. Very happy with this phone.",
    "The app crashes often and syncing is unreliable. Frustrating experience."
]
```
7. Designing Safer Parsers and Validators
Even with a strong prompt, model output may occasionally be malformed or incomplete. In production, always validate.
Recommended validation steps
- Parse with `json.loads`
- Check required keys
- Check data types
- Check allowed enum values
- Check length or range constraints
- Handle failure gracefully
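These steps can be collected into one reusable helper. A minimal sketch for the Exercise 1 schema (the function name is illustrative):

```python
import json

def validate_review_output(raw: str) -> dict:
    """Parse and validate the review schema used in Exercise 1."""
    parsed = json.loads(raw)  # raises json.JSONDecodeError on malformed output

    # Check required keys.
    required = {"summary", "sentiment", "confidence", "key_issues"}
    missing = required - parsed.keys()
    if missing:
        raise ValueError(f"Missing required keys: {missing}")

    # Check enum values, types, and ranges.
    if parsed["sentiment"] not in {"positive", "negative", "mixed", "neutral"}:
        raise ValueError("Invalid sentiment value")
    if not isinstance(parsed["confidence"], (int, float)) or not 0 <= parsed["confidence"] <= 1:
        raise ValueError("confidence must be a number between 0 and 1")
    if not isinstance(parsed["key_issues"], list):
        raise ValueError("key_issues must be a list")
    return parsed

result = validate_review_output(
    '{"summary": "ok", "sentiment": "mixed", "confidence": 0.9, "key_issues": []}'
)
```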
Common fallback strategies
- retry with a stricter prompt
- log invalid output for review
- return a safe default
- ask the model to fix invalid JSON in a second pass
8. Hands-on Exercise 2: Support Ticket Classifier
Goal
Build a small classifier that converts incoming support text into structured fields:
- category
- urgency
- requires_human
- short_reason
This is a common real-world GenAI pattern.
Step 1: Create the classifier script
```python
import json
import os
from openai import OpenAI

# Initialize the client.
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Example support ticket submitted by a user.
ticket_text = """
I was charged twice for my subscription this month.
I already contacted support yesterday and have not received a refund.
Please resolve this as soon as possible.
"""

# Prompt specifying an exact schema and constraints.
prompt = f"""
You are a support ticket triage assistant.
Read the ticket and return only valid JSON.

Required schema:
{{
  "category": "billing | technical | account | shipping | other",
  "urgency": "low | medium | high",
  "requires_human": "boolean",
  "short_reason": "string"
}}

Rules:
- Return only valid JSON
- Do not include markdown or explanation
- category must be one of: billing, technical, account, shipping, other
- urgency must be one of: low, medium, high
- requires_human must be a boolean
- short_reason must be at most 20 words

Ticket:
\"\"\"
{ticket_text}
\"\"\"
"""

# Call the Responses API.
response = client.responses.create(
    model="gpt-5.4-mini",
    input=prompt
)

raw_output = response.output_text
print("Raw model output:")
print(raw_output)

# Parse and validate the JSON output.
parsed = json.loads(raw_output)

allowed_categories = {"billing", "technical", "account", "shipping", "other"}
allowed_urgency = {"low", "medium", "high"}

if parsed["category"] not in allowed_categories:
    raise ValueError("Invalid category")

if parsed["urgency"] not in allowed_urgency:
    raise ValueError("Invalid urgency")

if not isinstance(parsed["requires_human"], bool):
    raise ValueError("requires_human must be boolean")

if not isinstance(parsed["short_reason"], str):
    raise ValueError("short_reason must be a string")

print("\nValidated ticket classification:")
print(json.dumps(parsed, indent=2))
```
Example output
{
"category": "billing",
"urgency": "high",
"requires_human": true,
"short_reason": "Duplicate charge and refund request need human follow-up."
}
Step 2: Extend to multiple tickets
You can process a batch of tickets in a loop.
```python
import json
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

tickets = [
    "I can't log into my account even after resetting my password twice.",
    "My order was supposed to arrive last week and still hasn't been delivered.",
    "The mobile app freezes whenever I try to upload a profile picture."
]

for i, ticket in enumerate(tickets, start=1):
    prompt = f"""
You are a support ticket triage assistant.
Read the ticket and return only valid JSON.

Required schema:
{{
  "category": "billing | technical | account | shipping | other",
  "urgency": "low | medium | high",
  "requires_human": "boolean",
  "short_reason": "string"
}}

Rules:
- Return only valid JSON
- Do not include markdown or explanation
- category must be one of: billing, technical, account, shipping, other
- urgency must be one of: low, medium, high
- requires_human must be a boolean
- short_reason must be at most 20 words

Ticket:
\"\"\"
{ticket}
\"\"\"
"""

    response = client.responses.create(
        model="gpt-5.4-mini",
        input=prompt
    )

    raw_output = response.output_text
    parsed = json.loads(raw_output)

    print(f"\nTicket #{i}")
    print(json.dumps(parsed, indent=2))
```
Example output
Ticket #1
{
"category": "account",
"urgency": "high",
"requires_human": true,
"short_reason": "User cannot access account after password resets."
}
Ticket #2
{
"category": "shipping",
"urgency": "medium",
"requires_human": true,
"short_reason": "Order delivery is overdue."
}
Ticket #3
{
"category": "technical",
"urgency": "medium",
"requires_human": true,
"short_reason": "App freezes during profile image upload."
}
9. Prompt Design Checklist for Structured Outputs
Use this checklist whenever you need machine-readable results.
Checklist
- Specify the output format explicitly
- Ask for JSON only
- Define required keys
- Define allowed values for enum-like fields
- Define numeric ranges when needed
- Specify behavior for missing data
- Keep schemas as simple as possible
- Validate in Python before using the result downstream
10. Common Mistakes
Mistake 1: Asking for “structured data” without defining structure
Too vague:
“Give me the result in a structured format.”
Better:
“Return valid JSON with keys `category`, `urgency`, and `reason`.”
Mistake 2: Not validating output
Even if the model usually behaves well, your application should not assume correctness.
Mistake 3: Overly large schemas
If you ask for 20 nested fields, reliability may decrease. Start small.
Mistake 4: No handling for missing information
Always define whether the model should use null, empty strings, or empty arrays.
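In code, one lightweight approach is to normalize a parsed result against explicit defaults so downstream code can rely on every field existing. A sketch (the helper name and defaults are illustrative):

```python
# Explicit defaults for the review schema: null summary, neutral sentiment,
# and an empty issues list when the model omits a field.
DEFAULTS = {"summary": None, "sentiment": "neutral", "key_issues": []}

def with_defaults(parsed: dict) -> dict:
    """Fill absent fields with explicit defaults instead of failing later."""
    return {key: parsed.get(key, default) for key, default in DEFAULTS.items()}

print(with_defaults({"summary": "Battery praised."}))
```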
Mistake 5: Mixing explanation with data
If you want parseable JSON, ask for JSON only.
11. From Structured Outputs to Agentic Workflows
Structured outputs are a key building block of agentic systems.
For example:
- Step 1: classify user input into a category
- Step 2: decide whether to use a tool
- Step 3: extract parameters for the tool
- Step 4: generate the final user-facing response
Each step depends on predictable outputs from the previous step. Without structure, chaining these steps becomes fragile.
Example workflow
A customer message:
“My package is late and I need it before Friday.”
Structured output from the model:
{
"category": "shipping",
"urgency": "high",
"requires_tracking_lookup": true,
"customer_deadline": "Friday"
}
Your Python code can now:
- route to shipping support
- trigger a tracking API lookup
- prioritize the request
This is why structured outputs are central to practical AI systems.
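The routing step can be sketched directly from the structured fields above. Everything here is hypothetical application code, not part of any SDK:

```python
def route_ticket(parsed: dict) -> str:
    """Route a ticket using the structured fields from the model."""
    if parsed.get("requires_tracking_lookup"):
        # In a real system this would call a shipping/tracking API.
        tracking_note = "tracking lookup triggered"
    else:
        tracking_note = "no tracking lookup needed"
    # Build a queue name from the category and urgency fields.
    queue = f"{parsed['category']}-{parsed['urgency']}"
    return f"Routed to {queue} queue ({tracking_note})"

parsed = {
    "category": "shipping",
    "urgency": "high",
    "requires_tracking_lookup": True,
    "customer_deadline": "Friday",
}
print(route_ticket(parsed))
```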
12. Wrap-Up
In this session, you learned how to:
- design prompts for structured outputs
- request JSON-only responses
- parse model output in Python
- validate fields and constraints
- use structured outputs in simple application workflows
This is one of the most important patterns in applied GenAI development. Once outputs become predictable, they can power reliable software systems rather than just chatbot-style interactions.
13. Practice Tasks
Try these after the session:
- Build a meeting note extractor that returns:
  - summary
  - decisions
  - action items
  - owners
- Build a resume screener that returns:
  - candidate_name
  - years_experience
  - key_skills
  - fit_score
- Build a content moderation helper that returns:
  - risk_level
  - categories
  - recommended_action

For each project:
- define a schema
- prompt for JSON only
- parse and validate in Python
- test with multiple examples
Useful Resources
- OpenAI Responses API migration guide: https://developers.openai.com/api/docs/guides/migrate-to-responses
- OpenAI API docs: https://platform.openai.com/docs
- OpenAI Python SDK: https://github.com/openai/openai-python
- Python `json` module documentation: https://docs.python.org/3/library/json.html
Quick Recap
- Structured outputs make LLM responses usable by software
- JSON is a practical default format
- Good prompts specify schema, rules, and constraints
- Python validation is essential
- Structured outputs are a foundation for agentic applications
End of Session