Session 1: Application Architecture for LLM-Powered Tools

Synopsis

Introduces the major components of a GenAI application, including user interface, orchestration logic, prompt layer, model layer, and output processing. Learners see how to structure Python projects beyond simple scripts.

Session Content

Session Overview

Duration: ~45 minutes
Audience: Python developers with basic programming knowledge who are beginning to build GenAI-powered applications.

Learning Goals

By the end of this session, learners will be able to:

  • Explain the core architectural components of an LLM-powered application.
  • Distinguish between application logic, prompt construction, model interaction, and post-processing.
  • Describe the role of the OpenAI Responses API in a Python application.
  • Build a small but well-structured Python command-line tool powered by gpt-5.4-mini.
  • Apply basic architectural best practices such as separation of concerns, configuration management, and error handling.

1. Why Architecture Matters in LLM Applications

When developers first start building with LLMs, it is common to write code like this:

  • collect user input
  • send it directly to the model
  • print the response

That approach works for a demo, but it quickly becomes hard to maintain when you need:

  • reusable prompts
  • structured outputs
  • logging
  • retries
  • testing
  • safety checks
  • tool integration
  • multiple interaction steps

A good architecture helps separate responsibilities so the application can grow without becoming fragile.

Core Principle

An LLM should be treated as one component of your application, not the application itself.


2. High-Level Architecture of an LLM-Powered Tool

A simple LLM application usually contains the following layers:

2.1 User Interface Layer

This is how users interact with your app.

Examples:

  • command-line interface
  • web frontend
  • API endpoint
  • chat interface
  • internal company workflow

Responsibilities:

  • gather input
  • display output
  • validate obvious user mistakes

2.2 Application Logic Layer

This is the orchestration layer.

Responsibilities:

  • decide what the application should do
  • transform user input into model-ready requests
  • call the LLM service
  • optionally call tools or external services
  • combine and format results

This is often the most important layer for keeping your application clean.

2.3 Prompt Construction Layer

Prompts should not be scattered throughout your code.

Responsibilities:

  • define system instructions
  • structure user input
  • enforce task framing
  • support consistency and reuse

Examples:

  • summarization prompt builder
  • classification prompt template
  • data extraction instruction set

2.4 Model Access Layer

This layer handles communication with the OpenAI API.

Responsibilities:

  • initialize the client
  • send requests
  • receive responses
  • handle API errors
  • manage model selection and request settings

2.5 Output Processing Layer

Model output is useful only when the rest of the app can consume it reliably.

Responsibilities:

  • extract text from the API response
  • parse structured content
  • validate results
  • format for display or storage

3. A Reference Flow for a Basic LLM Tool

Here is a common execution flow:

  1. User provides input.
  2. Application validates input.
  3. Prompt builder creates instructions and task context.
  4. Model service sends request to OpenAI Responses API.
  5. Application extracts the model output.
  6. Output is displayed or used in downstream logic.

Conceptual Diagram

User
  ↓
Interface Layer
  ↓
Application Logic
  ↓
Prompt Builder
  ↓
OpenAI Responses API
  ↓
Response Parser
  ↓
Application Output
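The six steps above can be wired together as one small pipeline function. The helper names and the stand-in lambdas below are illustrative; in a real application, step 4 is where the Responses API call happens:

```python
from typing import Callable


def run_pipeline(
    raw_input: str,
    validate: Callable[[str], str],
    build_prompt: Callable[[str], str],
    call_model: Callable[[str], str],
    parse_output: Callable[[str], str],
) -> str:
    """Run the reference flow end to end."""
    text = validate(raw_input)        # steps 1-2: input and validation
    prompt = build_prompt(text)       # step 3: prompt construction
    raw_output = call_model(prompt)   # step 4: the model call (Responses API in a real app)
    return parse_output(raw_output)   # steps 5-6: extraction and downstream use


# Wiring with stand-in functions, so no API call is made here:
result = run_pipeline(
    "  hello  ",
    validate=str.strip,
    build_prompt=lambda t: f"Summarize: {t}",
    call_model=lambda p: p.upper(),   # stub standing in for the model
    parse_output=lambda o: o,
)
print(result)  # SUMMARIZE: HELLO
```

Because each step is an ordinary function, every layer can be tested in isolation by passing a stub for its neighbors.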

4. The OpenAI Responses API in Python

For this course, we will use the OpenAI Python SDK with the Responses API.

Installation

pip install openai python-dotenv

Environment Variable

Set your API key:

export OPENAI_API_KEY="your_api_key_here"

On Windows PowerShell:

setx OPENAI_API_KEY "your_api_key_here"

Note that setx stores the variable persistently but does not affect the current session; open a new terminal afterwards, or set it for the current session only with $env:OPENAI_API_KEY = "your_api_key_here".
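The installation step above also included python-dotenv, whose load_dotenv() reads a local .env file so you do not have to export the key in every shell. Conceptually it does something like this minimal stand-in (a simplified sketch, not the library's actual implementation):

```python
import os


def load_env_file(path: str = ".env") -> None:
    """Minimal stand-in for python-dotenv's load_dotenv():
    read KEY=VALUE lines from a file into os.environ."""
    if not os.path.exists(path):
        return
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Real environment variables take precedence over .env values.
            os.environ.setdefault(key.strip(), value.strip().strip('"'))


load_env_file()  # no-op if there is no .env file in the current directory
```

In practice, prefer the real library (from dotenv import load_dotenv; load_dotenv()) at the top of your entry-point script, and keep the .env file out of version control.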

Minimal Example

from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-5.4-mini",
    input="Write a one-sentence explanation of what an LLM is."
)

print(response.output_text)

Example Output

An LLM, or large language model, is an AI system trained on vast amounts of text to understand and generate human-like language.

Why response.output_text Matters

The Responses API can return rich structured content. For many text-generation use cases, response.output_text is the simplest way to access the generated text.


5. Architectural Best Practices for Early Projects

5.1 Separate Configuration from Logic

Do not hardcode:

  • API keys
  • model names in many places
  • environment-specific settings

Instead, centralize them in one place.
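A minimal sketch of centralized configuration, reading overrides from the environment with sensible defaults. The variable names LLM_MODEL and MAX_INPUT_CHARS are illustrative, not a convention of any library:

```python
import os

# Centralized settings: change them here, not throughout the codebase.
# The environment variable names below are illustrative examples.
MODEL_NAME = os.environ.get("LLM_MODEL", "gpt-5.4-mini")
MAX_INPUT_CHARS = int(os.environ.get("MAX_INPUT_CHARS", "8000"))
```

Every other module imports these names instead of hardcoding its own copies, so switching models or limits is a one-line change.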

5.2 Encapsulate Model Calls

Avoid calling the OpenAI client directly from every file or function. Create a small service class or function.

Benefits:

  • easier testing
  • consistent error handling
  • reusable defaults

5.3 Isolate Prompt Logic

Prompt-building deserves its own functions.

Bad:

response = client.responses.create(
    model="gpt-5.4-mini",
    input=f"You are an expert assistant. Summarize this: {user_text}"
)

Better:

def build_summary_prompt(text: str) -> str:
    return f"""You are a helpful assistant.
Summarize the following text in 3 bullet points:

{text}
"""

5.4 Validate and Sanitize Inputs

Even basic checks help:

  • empty input
  • extremely long input
  • wrong type
  • missing required fields
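A basic validator covering the checks above might look like this sketch; the 8000-character limit is an arbitrary example value:

```python
def validate_input(text: str, max_chars: int = 8000) -> str:
    """Return cleaned input text, or raise ValueError with a clear message.
    The default max_chars limit is an arbitrary example value."""
    if not isinstance(text, str):
        raise ValueError("Input must be a string.")
    cleaned = text.strip()
    if not cleaned:
        raise ValueError("Input text cannot be empty.")
    if len(cleaned) > max_chars:
        raise ValueError(f"Input is too long: {len(cleaned)} > {max_chars} characters.")
    return cleaned
```

Calling this at the top of your application logic rejects bad input before any tokens are spent on an API call.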

5.5 Post-Process Output

Do not assume the model always returns exactly what you expect.

Examples:

  • trim whitespace
  • enforce formatting
  • validate categories
  • retry or fallback when needed
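The retry-or-fallback idea can be sketched as a small helper. The call_model argument below is a stand-in for whatever function actually calls the API, so this example makes no network calls:

```python
def generate_with_retry(call_model, prompt, is_valid, max_attempts=3):
    """Call the model up to max_attempts times until the output
    passes validation; otherwise fall back to the last raw output."""
    last = ""
    for _ in range(max_attempts):
        last = call_model(prompt).strip()
        if is_valid(last):
            return last
    return last  # fallback: let the caller decide what to do with it


# Stand-in model that fails validation once, then succeeds:
outputs = iter(["no bullets here", "- a valid bullet"])
result = generate_with_retry(
    call_model=lambda p: next(outputs),
    prompt="Summarize...",
    is_valid=lambda s: s.startswith("- "),
)
print(result)  # - a valid bullet
```

Note that each retry costs another API call, so keep max_attempts small.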

6. Hands-On Exercise 1: Build a Minimal LLM-Powered CLI Tool

Goal

Create a command-line application that accepts a block of text and returns a concise summary.

What You Will Practice

  • using the OpenAI Python SDK
  • calling the Responses API
  • extracting output text
  • structuring code into reusable functions

Step 1: Create the Project File

Create a file named summarizer_cli.py.

Step 2: Add the Code

"""
A simple command-line summarizer powered by OpenAI's Responses API.

This script demonstrates a clean beginner-friendly architecture:
- separate prompt creation from model calling
- validate user input
- isolate API interaction
- keep the main program flow easy to read
"""

from openai import OpenAI


def build_summary_prompt(text: str) -> str:
    """
    Build a reusable prompt for summarization.

    Args:
        text: The source text to summarize.

    Returns:
        A formatted prompt string.
    """
    return f"""
You are a helpful assistant for Python developers learning GenAI.

Summarize the following text in exactly 3 bullet points.
Keep the language clear and beginner-friendly.

Text:
{text}
""".strip()


def get_summary(client: OpenAI, text: str) -> str:
    """
    Send a summarization request to the OpenAI Responses API.

    Args:
        client: An initialized OpenAI client.
        text: The source text to summarize.

    Returns:
        The generated summary text.
    """
    prompt = build_summary_prompt(text)

    response = client.responses.create(
        model="gpt-5.4-mini",
        input=prompt,
    )

    return response.output_text.strip()


def main() -> None:
    """
    Main CLI entry point.
    """
    print("=== LLM Text Summarizer ===")
    user_text = input("Paste text to summarize:\n> ").strip()

    if not user_text:
        print("Error: input text cannot be empty.")
        return

    client = OpenAI()

    try:
        summary = get_summary(client, user_text)
        print("\nSummary:\n")
        print(summary)
    except Exception as exc:
        print(f"An error occurred while generating the summary: {exc}")


if __name__ == "__main__":
    main()

Step 3: Run the Script

python summarizer_cli.py

Example Input

Large language models can be used in many developer tools, such as code assistants, documentation helpers, and support chatbots. A good architecture separates user interaction, prompt creation, API communication, and output handling. This makes the application easier to test, debug, and extend.

Example Output

- Large language models can power tools like code assistants, documentation helpers, and chatbots.
- A well-designed application separates user interaction, prompt building, API calls, and output handling.
- Clear architecture makes LLM tools easier to test, debug, and improve over time.

7. Debrief: What Architecture Did We Just Use?

Even in this small script, we already separated responsibilities:

  • main() handles user interaction
  • build_summary_prompt() handles prompt construction
  • get_summary() handles model access
  • the model output is returned to the app for display

This is small, but it is real architecture.


8. Hands-On Exercise 2: Refactor into a More Maintainable Structure

Goal

Move from a single-file demo to a small multi-file application with clear responsibilities.

Suggested Project Structure

llm_tool/
├── app.py
├── config.py
├── prompts.py
└── llm_service.py

8.1 config.py

"""
Application configuration.

This module centralizes settings so they are not scattered across the codebase.
"""

MODEL_NAME = "gpt-5.4-mini"
APP_SUMMARY_BULLETS = 3

8.2 prompts.py

"""
Prompt construction helpers.

Prompt logic is isolated here for reuse and maintainability.
"""

from config import APP_SUMMARY_BULLETS


def build_summary_prompt(text: str) -> str:
    """
    Construct a prompt for summarizing text.

    Args:
        text: Source text to summarize.

    Returns:
        A prompt string.
    """
    return f"""
You are a helpful assistant for developers.

Summarize the following text in exactly {APP_SUMMARY_BULLETS} bullet points.
Keep the explanation concise, accurate, and easy to understand.

Text:
{text}
""".strip()

8.3 llm_service.py

"""
Model access layer for OpenAI Responses API.
"""

from openai import OpenAI

from config import MODEL_NAME


class LLMService:
    """
    A small service wrapper around the OpenAI client.

    This keeps API-related code in one place and makes future changes easier.
    """

    def __init__(self) -> None:
        self.client = OpenAI()

    def generate_text(self, prompt: str) -> str:
        """
        Generate text from the model using the given prompt.

        Args:
            prompt: The prompt to send to the model.

        Returns:
            The generated text.
        """
        response = self.client.responses.create(
            model=MODEL_NAME,
            input=prompt,
        )
        return response.output_text.strip()

8.4 app.py

"""
CLI application entry point.
"""

from llm_service import LLMService
from prompts import build_summary_prompt


def main() -> None:
    """
    Run the summarization app.
    """
    print("=== Modular LLM Summarizer ===")
    user_text = input("Enter text to summarize:\n> ").strip()

    if not user_text:
        print("Error: input text cannot be empty.")
        return

    prompt = build_summary_prompt(user_text)
    service = LLMService()

    try:
        result = service.generate_text(prompt)
        print("\nSummary:\n")
        print(result)
    except Exception as exc:
        print(f"Failed to generate summary: {exc}")


if __name__ == "__main__":
    main()

Run the App

From inside the llm_tool directory:

python app.py

Example Output

- LLM tools can be easier to maintain when prompts and API calls are kept separate.
- Centralized configuration helps avoid repeated hardcoded settings.
- A modular structure makes future expansion simpler.

9. Designing for Change: Questions to Ask Early

As your app grows, these architectural questions become important:

9.1 Will the app support multiple tasks?

Examples:

  • summarization
  • classification
  • extraction
  • rewriting

If yes, do not mix all prompt logic into one giant function.

9.2 Will the app need structured outputs?

If yes, create a parsing and validation layer.
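As a sketch of such a layer, suppose a classification task asks the model to reply with JSON like {"label": "..."}. A parser can reject malformed output or unexpected labels before they reach downstream logic (the JSON shape and label set here are illustrative):

```python
import json


def parse_classification(raw: str, allowed: set) -> str:
    """Parse model output expected to be JSON like {"label": "..."}
    and check the label against an allowed set."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Output is not valid JSON: {exc}") from exc
    label = data.get("label")
    if label not in allowed:
        raise ValueError(f"Unexpected label: {label!r}")
    return label


print(parse_classification('{"label": "bug"}', {"bug", "feature"}))  # bug
```

Raising ValueError here gives the orchestration layer one obvious place to trigger a retry or a fallback.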

9.3 Will the app call tools or external systems?

Examples:

  • databases
  • search APIs
  • file systems
  • business systems

Then your app needs orchestration beyond a single model call.

9.4 Will prompts evolve frequently?

If yes, make prompts easy to locate, edit, and test.

9.5 Will the app be used in production?

Then consider:

  • logging
  • retries
  • observability
  • rate limiting
  • user authentication
  • security reviews

10. Common Beginner Mistakes

Mistake 1: Mixing everything in one function

Problem:

  • hard to debug
  • hard to reuse
  • hard to test

Mistake 2: Hardcoding prompt text everywhere

Problem:

  • inconsistent behavior
  • prompt drift
  • maintenance pain

Mistake 3: Assuming model output is always perfectly formatted

Problem:

  • parsing errors
  • broken workflows
  • unreliable downstream logic

Mistake 4: Skipping input validation

Problem:

  • poor UX
  • avoidable failures
  • unnecessary API calls

Mistake 5: Treating the LLM as deterministic business logic

Problem:

  • model outputs can vary
  • critical logic needs validation and safeguards


11. Hands-On Exercise 3: Add Simple Output Validation

Goal

Improve reliability by validating the output before displaying it.

Scenario

We asked for exactly 3 bullet points. Let us add a lightweight validator.

File: validated_app.py

"""
A summarizer with simple output validation.

This example demonstrates post-processing and validation,
which are important parts of LLM application architecture.
"""

from openai import OpenAI


def build_summary_prompt(text: str) -> str:
    """
    Create a prompt requesting exactly 3 bullet points.
    """
    return f"""
You are a helpful assistant.

Summarize the following text in exactly 3 bullet points.
Each bullet must start with '- '.

Text:
{text}
""".strip()


def generate_summary(client: OpenAI, text: str) -> str:
    """
    Generate a summary using the OpenAI Responses API.
    """
    response = client.responses.create(
        model="gpt-5.4-mini",
        input=build_summary_prompt(text),
    )
    return response.output_text.strip()


def validate_bullet_summary(summary: str, expected_bullets: int = 3) -> bool:
    """
    Validate that the summary has the expected number of bullet lines.

    Args:
        summary: Model-generated text.
        expected_bullets: Number of bullet points required.

    Returns:
        True if valid, otherwise False.
    """
    lines = [line.strip() for line in summary.splitlines() if line.strip()]
    bullet_lines = [line for line in lines if line.startswith("- ")]
    return len(bullet_lines) == expected_bullets


def main() -> None:
    """
    Run the validated summarizer.
    """
    client = OpenAI()

    user_text = input("Enter text to summarize:\n> ").strip()
    if not user_text:
        print("Error: input text cannot be empty.")
        return

    try:
        summary = generate_summary(client, user_text)

        if validate_bullet_summary(summary):
            print("\nValidated summary:\n")
            print(summary)
        else:
            print("\nWarning: The model output did not match the expected format.")
            print("Raw output:\n")
            print(summary)
    except Exception as exc:
        print(f"Request failed: {exc}")


if __name__ == "__main__":
    main()

Example Output

Validated summary:

- LLM app architecture benefits from clear separation of concerns.
- Prompt creation and response validation should be treated as explicit components.
- Even simple post-processing improves reliability in real applications.

12. Theory Recap: The Minimum Viable Architecture

For a first LLM-powered tool, a strong minimum architecture is:

Essential Components

  • UI/Input Layer: collects and validates user input
  • Prompt Builder: creates task-specific instructions
  • LLM Service: communicates with OpenAI
  • Output Processor: extracts and validates results
  • Configuration Module: stores reusable settings

Why This Works

It gives you:

  • clarity
  • maintainability
  • easier debugging
  • easier extension
  • a clean base for future agentic behavior

13. From LLM Apps to Agentic Apps

This course will later explore agentic development. That builds on today’s architecture.

A single-step LLM app usually does this:

  • input → prompt → model → output

An agentic app may do this:

  • input
  • reasoning/planning
  • choose an action
  • call a tool
  • inspect tool result
  • call model again
  • return final answer

That means the architecture becomes more orchestration-heavy. If your basic LLM app is already modular, it becomes much easier to evolve into an agentic system.
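The agentic steps above can be sketched as a plain-Python loop. Everything here is a stand-in: the decision dictionary format, the ask_model callable, and the tool registry are illustrative, not a real agent framework or API:

```python
def run_agent(ask_model, tools, question, max_steps=5):
    """Minimal agent loop sketch: the model either answers directly or
    names a tool; each tool result is fed back into the next step."""
    context = question
    for _ in range(max_steps):
        decision = ask_model(context)  # e.g. {"action": "answer" or a tool name, ...}
        if decision["action"] == "answer":
            return decision["content"]
        result = tools[decision["action"]](decision["input"])
        context = f"{context}\nTool result: {result}"
    return "No answer within the step limit."


# Stand-in model: asks for one lookup, then answers.
steps = iter([
    {"action": "lookup", "input": "docs"},
    {"action": "answer", "content": "Done after one tool call."},
])
answer = run_agent(
    ask_model=lambda ctx: next(steps),
    tools={"lookup": lambda q: f"results for {q}"},
    question="How do I start?",
)
print(answer)  # Done after one tool call.
```

Notice that the loop reuses the same layers as before: prompt construction feeds ask_model, output parsing produces the decision, and the tool registry is just more application logic.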


14. Suggested 45-Minute Timing

Part 1: Theory and Discussion (~20 min)

  • Why architecture matters
  • Main layers of an LLM-powered app
  • Responses API basics
  • Best practices and beginner pitfalls

Part 2: Hands-On Exercise 1 (~10 min)

  • Build a minimal summarizer CLI

Part 3: Hands-On Exercise 2 (~10 min)

  • Refactor into modular files

Part 4: Hands-On Exercise 3 + Wrap-Up (~5 min)

  • Add basic validation
  • Review key takeaways

15. Key Takeaways

  • LLMs should be one part of a broader application architecture.
  • Good architecture separates input handling, prompts, model access, and output processing.
  • The OpenAI Responses API provides a simple Python interface for model interaction.
  • Even small apps benefit from modular design.
  • Validation and post-processing improve reliability.
  • Clean architecture today makes agentic features easier tomorrow.

16. Useful Resources

  • OpenAI Responses API migration guide: https://developers.openai.com/api/docs/guides/migrate-to-responses
  • OpenAI API docs: https://platform.openai.com/docs
  • OpenAI Python SDK: https://github.com/openai/openai-python
  • Prompting guide: https://platform.openai.com/docs/guides/prompt-engineering
  • Python virtual environments: https://docs.python.org/3/tutorial/venv.html

17. Optional Homework

  1. Extend the summarizer to support three tasks:
     • summary
     • rewrite
     • explain
  2. Add a menu that lets the user choose the task.
  3. Create one prompt builder function per task.
  4. Add a simple logger that prints:
     • selected task
     • input length
     • whether output validation passed
  5. Reflect:
     • Which parts of your app are UI?
     • Which parts are orchestration?
     • Which parts are model-facing?
     • Which parts would need to change first if you added tools?

18. Session Wrap-Up

In this session, you learned the foundational architecture of an LLM-powered application. You used the OpenAI Responses API with gpt-5.4-mini, built a working Python CLI tool, and refactored it into a modular structure with clearer responsibilities.

This architectural mindset is the foundation for everything that follows in GenAI and agentic development.

