Session 1: Application Architecture for LLM-Powered Tools

Synopsis

Introduces the major components of a GenAI application, including user interface, orchestration logic, prompt layer, model layer, and output processing. Learners see how to structure Python projects beyond simple scripts.

Session Content

Session Overview

Duration: ~45 minutes
Audience: Python developers with basic programming knowledge who are beginning to build GenAI-powered applications.

Learning Goals

By the end of this session, learners will be able to:

  • Explain the core architectural components of an LLM-powered application.
  • Distinguish between application logic, prompt construction, model interaction, and post-processing.
  • Describe the role of the OpenAI Responses API in a Python application.
  • Build a small but well-structured Python command-line tool powered by gpt-5.4-mini.
  • Apply basic architectural best practices such as separation of concerns, configuration management, and error handling.

1. Why Architecture Matters in LLM Applications

When developers first start building with LLMs, it is common to write code like this:

  • collect user input
  • send it directly to the model
  • print the response

That approach works for a demo, but it quickly becomes hard to maintain when you need:

  • reusable prompts
  • structured outputs
  • logging
  • retries
  • testing
  • safety checks
  • tool integration
  • multiple interaction steps

A good architecture helps separate responsibilities so the application can grow without becoming fragile.

Core Principle

An LLM should be treated as one component of your application, not the application itself.


2. High-Level Architecture of an LLM-Powered Tool

A simple LLM application usually contains the following layers:

2.1 User Interface Layer

This is how users interact with your app.

Examples:

  • command-line interface
  • web frontend
  • API endpoint
  • chat interface
  • internal company workflow

Responsibilities:

  • gather input
  • display output
  • validate obvious user mistakes

2.2 Application Logic Layer

This is the orchestration layer.

Responsibilities:

  • decide what the application should do
  • transform user input into model-ready requests
  • call the LLM service
  • optionally call tools or external services
  • combine and format results

This is often the most important layer for keeping your application clean.

2.3 Prompt Construction Layer

Prompts should not be scattered throughout your code.

Responsibilities:

  • define system instructions
  • structure user input
  • enforce task framing
  • support consistency and reuse

Examples:

  • summarization prompt builder
  • classification prompt template
  • data extraction instruction set

2.4 Model Access Layer

This layer handles communication with the OpenAI API.

Responsibilities:

  • initialize the client
  • send requests
  • receive responses
  • handle API errors
  • manage model selection and request settings

2.5 Output Processing Layer

Model output is useful only when the rest of the app can consume it reliably.

Responsibilities:

  • extract text from the API response
  • parse structured content
  • validate results
  • format for display or storage

3. A Reference Flow for a Basic LLM Tool

Here is a common execution flow:

  1. User provides input.
  2. Application validates input.
  3. Prompt builder creates instructions and task context.
  4. Model service sends request to OpenAI Responses API.
  5. Application extracts the model output.
  6. Output is displayed or used in downstream logic.

Conceptual Diagram

User
  ↓
Interface Layer
  ↓
Application Logic
  ↓
Prompt Builder
  ↓
OpenAI Responses API
  ↓
Response Parser
  ↓
Application Output
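The six steps above can be wired together as one small pipeline function. The helper names and the stand-in lambdas below are illustrative; in a real application, step 4 is where the Responses API call happens:

```python
from typing import Callable


def run_pipeline(
    raw_input: str,
    validate: Callable[[str], str],
    build_prompt: Callable[[str], str],
    call_model: Callable[[str], str],
    parse_output: Callable[[str], str],
) -> str:
    """Run the reference flow end to end."""
    text = validate(raw_input)        # steps 1-2: input and validation
    prompt = build_prompt(text)       # step 3: prompt construction
    raw_output = call_model(prompt)   # step 4: the model call (Responses API in a real app)
    return parse_output(raw_output)   # steps 5-6: extraction and downstream use


# Wiring with stand-in functions, so no API call is made here:
result = run_pipeline(
    "  hello  ",
    validate=str.strip,
    build_prompt=lambda t: f"Summarize: {t}",
    call_model=lambda p: p.upper(),   # stub standing in for the model
    parse_output=lambda o: o,
)
print(result)  # SUMMARIZE: HELLO
```

Because each step is an ordinary function, every layer can be tested in isolation by passing a stub for its neighbors.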

4. The OpenAI Responses API in Python

For this course, we will use the OpenAI Python SDK with the Responses API.

Installation

pip install openai python-dotenv

Environment Variable

Set your API key:

export OPENAI_API_KEY="your_api_key_here"

On Windows PowerShell:

setx OPENAI_API_KEY "your_api_key_here"

Note that setx stores the variable persistently but does not affect the current session; open a new terminal afterwards, or set it for the current session only with $env:OPENAI_API_KEY = "your_api_key_here".
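The installation step above also included python-dotenv, whose load_dotenv() reads a local .env file so you do not have to export the key in every shell. Conceptually it does something like this minimal stand-in (a simplified sketch, not the library's actual implementation):

```python
import os


def load_env_file(path: str = ".env") -> None:
    """Minimal stand-in for python-dotenv's load_dotenv():
    read KEY=VALUE lines from a file into os.environ."""
    if not os.path.exists(path):
        return
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Real environment variables take precedence over .env values.
            os.environ.setdefault(key.strip(), value.strip().strip('"'))


load_env_file()  # no-op if there is no .env file in the current directory
```

In practice, prefer the real library (from dotenv import load_dotenv; load_dotenv()) at the top of your entry-point script, and keep the .env file out of version control.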

Minimal Example

from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-5.4-mini",
    input="Write a one-sentence explanation of what an LLM is."
)

print(response.output_text)

Example Output

An LLM, or large language model, is an AI system trained on vast amounts of text to understand and generate human-like language.

Why response.output_text Matters

The Responses API can return rich structured content. For many text-generation use cases, response.output_text is the simplest way to access the generated text.


5. Architectural Best Practices for Early Projects

5.1 Separate Configuration from Logic

Do not hardcode:

  • API keys
  • model names in many places
  • environment-specific settings

Instead, centralize them in one place.
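A minimal sketch of centralized configuration, reading overrides from the environment with sensible defaults. The variable names LLM_MODEL and MAX_INPUT_CHARS are illustrative, not a convention of any library:

```python
import os

# Centralized settings: change them here, not throughout the codebase.
# The environment variable names below are illustrative examples.
MODEL_NAME = os.environ.get("LLM_MODEL", "gpt-5.4-mini")
MAX_INPUT_CHARS = int(os.environ.get("MAX_INPUT_CHARS", "8000"))
```

Every other module imports these names instead of hardcoding its own copies, so switching models or limits is a one-line change.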

5.2 Encapsulate Model Calls

Avoid calling the OpenAI client directly from every file or function. Create a small service class or function.

Benefits:

  • easier testing
  • consistent error handling
  • reusable defaults

5.3 Isolate Prompt Logic

Prompt-building deserves its own functions.

Bad:

response = client.responses.create(
    model="gpt-5.4-mini",
    input=f"You are an expert assistant. Summarize this: {user_text}"
)

Better:

def build_summary_prompt(text: str) -> str:
    return f"""You are a helpful assistant.
Summarize the following text in 3 bullet points:

{text}
"""

5.4 Validate and Sanitize Inputs

Even basic checks help:

  • empty input
  • extremely long input
  • wrong type
  • missing required fields
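A basic validator covering the checks above might look like this sketch; the 8000-character limit is an arbitrary example value:

```python
def validate_input(text: str, max_chars: int = 8000) -> str:
    """Return cleaned input text, or raise ValueError with a clear message.
    The default max_chars limit is an arbitrary example value."""
    if not isinstance(text, str):
        raise ValueError("Input must be a string.")
    cleaned = text.strip()
    if not cleaned:
        raise ValueError("Input text cannot be empty.")
    if len(cleaned) > max_chars:
        raise ValueError(f"Input is too long: {len(cleaned)} > {max_chars} characters.")
    return cleaned
```

Calling this at the top of your application logic rejects bad input before any tokens are spent on an API call.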

5.5 Post-Process Output

Do not assume the model always returns exactly what you expect.

Examples:

  • trim whitespace
  • enforce formatting
  • validate categories
  • retry or fallback when needed
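The retry-or-fallback idea can be sketched as a small helper. The call_model argument below is a stand-in for whatever function actually calls the API, so this example makes no network calls:

```python
def generate_with_retry(call_model, prompt, is_valid, max_attempts=3):
    """Call the model up to max_attempts times until the output
    passes validation; otherwise fall back to the last raw output."""
    last = ""
    for _ in range(max_attempts):
        last = call_model(prompt).strip()
        if is_valid(last):
            return last
    return last  # fallback: let the caller decide what to do with it


# Stand-in model that fails validation once, then succeeds:
outputs = iter(["no bullets here", "- a valid bullet"])
result = generate_with_retry(
    call_model=lambda p: next(outputs),
    prompt="Summarize...",
    is_valid=lambda s: s.startswith("- "),
)
print(result)  # - a valid bullet
```

Note that each retry costs another API call, so keep max_attempts small.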

6. Hands-On Exercise 1: Build a Minimal LLM-Powered CLI Tool

Goal

Create a command-line application that accepts a block of text and returns a concise summary.

What You Will Practice

  • using the OpenAI Python SDK
  • calling the Responses API
  • extracting output text
  • structuring code into reusable functions

Step 1: Create the Project File

Create a file named summarizer_cli.py.

Step 2: Add the Code

"""
A simple command-line summarizer powered by OpenAI's Responses API.

This script demonstrates a clean beginner-friendly architecture:
- separate prompt creation from model calling
- validate user input
- isolate API interaction
- keep the main program flow easy to read
"""

from openai import OpenAI


def build_summary_prompt(text: str) -> str:
    """
    Build a reusable prompt for summarization.

    Args:
        text: The source text to summarize.

    Returns:
        A formatted prompt string.
    """
    return f"""
You are a helpful assistant for Python developers learning GenAI.

Summarize the following text in exactly 3 bullet points.
Keep the language clear and beginner-friendly.

Text:
{text}
""".strip()


def get_summary(client: OpenAI, text: str) -> str:
    """
    Send a summarization request to the OpenAI Responses API.

    Args:
        client: An initialized OpenAI client.
        text: The source text to summarize.

    Returns:
        The generated summary text.
    """
    prompt = build_summary_prompt(text)

    response = client.responses.create(
        model="gpt-5.4-mini",
        input=prompt,
    )

    return response.output_text.strip()


def main() -> None:
    """
    Main CLI entry point.
    """
    print("=== LLM Text Summarizer ===")
    user_text = input("Paste text to summarize:\n> ").strip()

    if not user_text:
        print("Error: input text cannot be empty.")
        return

    client = OpenAI()

    try:
        summary = get_summary(client, user_text)
        print("\nSummary:\n")
        print(summary)
    except Exception as exc:
        print(f"An error occurred while generating the summary: {exc}")


if __name__ == "__main__":
    main()

Step 3: Run the Script

python summarizer_cli.py

Example Input

Large language models can be used in many developer tools, such as code assistants, documentation helpers, and support chatbots. A good architecture separates user interaction, prompt creation, API communication, and output handling. This makes the application easier to test, debug, and extend.

Example Output

- Large language models can power tools like code assistants, documentation helpers, and chatbots.
- A well-designed application separates user interaction, prompt building, API calls, and output handling.
- Clear architecture makes LLM tools easier to test, debug, and improve over time.

7. Debrief: What Architecture Did We Just Use?

Even in this small script, we already separated responsibilities:

  • main() handles user interaction
  • build_summary_prompt() handles prompt construction
  • get_summary() handles model access
  • the model output is returned to the app for display

This is small, but it is real architecture.


8. Hands-On Exercise 2: Refactor into a More Maintainable Structure

Goal

Move from a single-file demo to a small multi-file application with clear responsibilities.

Suggested Project Structure

llm_tool/
├── app.py
├── config.py
├── prompts.py
└── llm_service.py

8.1 config.py

"""
Application configuration.

This module centralizes settings so they are not scattered across the codebase.
"""

MODEL_NAME = "gpt-5.4-mini"
APP_SUMMARY_BULLETS = 3

8.2 prompts.py

"""
Prompt construction helpers.

Prompt logic is isolated here for reuse and maintainability.
"""

from config import APP_SUMMARY_BULLETS


def build_summary_prompt(text: str) -> str:
    """
    Construct a prompt for summarizing text.

    Args:
        text: Source text to summarize.

    Returns:
        A prompt string.
    """
    return f"""
You are a helpful assistant for developers.

Summarize the following text in exactly {APP_SUMMARY_BULLETS} bullet points.
Keep the explanation concise, accurate, and easy to understand.

Text:
{text}
""".strip()

8.3 llm_service.py

"""
Model access layer for OpenAI Responses API.
"""

from openai import OpenAI

from config import MODEL_NAME


class LLMService:
    """
    A small service wrapper around the OpenAI client.

    This keeps API-related code in one place and makes future changes easier.
    """

    def __init__(self) -> None:
        self.client = OpenAI()

    def generate_text(self, prompt: str) -> str:
        """
        Generate text from the model using the given prompt.

        Args:
            prompt: The prompt to send to the model.

        Returns:
            The generated text.
        """
        response = self.client.responses.create(
            model=MODEL_NAME,
            input=prompt,
        )
        return response.output_text.strip()

8.4 app.py

"""
CLI application entry point.
"""

from llm_service import LLMService
from prompts import build_summary_prompt


def main() -> None:
    """
    Run the summarization app.
    """
    print("=== Modular LLM Summarizer ===")
    user_text = input("Enter text to summarize:\n> ").strip()

    if not user_text:
        print("Error: input text cannot be empty.")
        return

    prompt = build_summary_prompt(user_text)
    service = LLMService()

    try:
        result = service.generate_text(prompt)
        print("\nSummary:\n")
        print(result)
    except Exception as exc:
        print(f"Failed to generate summary: {exc}")


if __name__ == "__main__":
    main()

Run the App

From inside the llm_tool directory:

python app.py

Example Output

- LLM tools can be easier to maintain when prompts and API calls are kept separate.
- Centralized configuration helps avoid repeated hardcoded settings.
- A modular structure makes future expansion simpler.

9. Designing for Change: Questions to Ask Early

As your app grows, these architectural questions become important:

9.1 Will the app support multiple tasks?

Examples:

  • summarization
  • classification
  • extraction
  • rewriting

If yes, do not mix all prompt logic into one giant function.

9.2 Will the app need structured outputs?

If yes, create a parsing and validation layer.
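As a sketch of such a layer, suppose a classification task asks the model to reply with JSON like {"label": "..."}. A parser can reject malformed output or unexpected labels before they reach downstream logic (the JSON shape and label set here are illustrative):

```python
import json


def parse_classification(raw: str, allowed: set) -> str:
    """Parse model output expected to be JSON like {"label": "..."}
    and check the label against an allowed set."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Output is not valid JSON: {exc}") from exc
    label = data.get("label")
    if label not in allowed:
        raise ValueError(f"Unexpected label: {label!r}")
    return label


print(parse_classification('{"label": "bug"}', {"bug", "feature"}))  # bug
```

Raising ValueError here gives the orchestration layer one obvious place to trigger a retry or a fallback.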

9.3 Will the app call tools or external systems?

Examples:

  • databases
  • search APIs
  • file systems
  • business systems

Then your app needs orchestration beyond a single model call.

9.4 Will prompts evolve frequently?

If yes, make prompts easy to locate, edit, and test.

9.5 Will the app be used in production?

Then consider:

  • logging
  • retries
  • observability
  • rate limiting
  • user authentication
  • security reviews

10. Common Beginner Mistakes

Mistake 1: Mixing everything in one function

Problem:

  • hard to debug
  • hard to reuse
  • hard to test

Mistake 2: Hardcoding prompt text everywhere

Problem:

  • inconsistent behavior
  • prompt drift
  • maintenance pain

Mistake 3: Assuming model output is always perfectly formatted

Problem:

  • parsing errors
  • broken workflows
  • unreliable downstream logic

Mistake 4: Skipping input validation

Problem:

  • poor UX
  • avoidable failures
  • unnecessary API calls

Mistake 5: Treating the LLM as deterministic business logic

Problem:

  • model outputs can vary
  • critical logic needs validation and safeguards


11. Hands-On Exercise 3: Add Simple Output Validation

Goal

Improve reliability by validating the output before displaying it.

Scenario

We asked for exactly 3 bullet points. Let us add a lightweight validator.

File: validated_app.py

"""
A summarizer with simple output validation.

This example demonstrates post-processing and validation,
which are important parts of LLM application architecture.
"""

from openai import OpenAI


def build_summary_prompt(text: str) -> str:
    """
    Create a prompt requesting exactly 3 bullet points.
    """
    return f"""
You are a helpful assistant.

Summarize the following text in exactly 3 bullet points.
Each bullet must start with '- '.

Text:
{text}
""".strip()


def generate_summary(client: OpenAI, text: str) -> str:
    """
    Generate a summary using the OpenAI Responses API.
    """
    response = client.responses.create(
        model="gpt-5.4-mini",
        input=build_summary_prompt(text),
    )
    return response.output_text.strip()


def validate_bullet_summary(summary: str, expected_bullets: int = 3) -> bool:
    """
    Validate that the summary has the expected number of bullet lines.

    Args:
        summary: Model-generated text.
        expected_bullets: Number of bullet points required.

    Returns:
        True if valid, otherwise False.
    """
    lines = [line.strip() for line in summary.splitlines() if line.strip()]
    bullet_lines = [line for line in lines if line.startswith("- ")]
    return len(bullet_lines) == expected_bullets


def main() -> None:
    """
    Run the validated summarizer.
    """
    client = OpenAI()

    user_text = input("Enter text to summarize:\n> ").strip()
    if not user_text:
        print("Error: input text cannot be empty.")
        return

    try:
        summary = generate_summary(client, user_text)

        if validate_bullet_summary(summary):
            print("\nValidated summary:\n")
            print(summary)
        else:
            print("\nWarning: The model output did not match the expected format.")
            print("Raw output:\n")
            print(summary)
    except Exception as exc:
        print(f"Request failed: {exc}")


if __name__ == "__main__":
    main()

Example Output

Validated summary:

- LLM app architecture benefits from clear separation of concerns.
- Prompt creation and response validation should be treated as explicit components.
- Even simple post-processing improves reliability in real applications.

12. Theory Recap: The Minimum Viable Architecture

For a first LLM-powered tool, a strong minimum architecture is:

Essential Components

  • UI/Input Layer: collects and validates user input
  • Prompt Builder: creates task-specific instructions
  • LLM Service: communicates with OpenAI
  • Output Processor: extracts and validates results
  • Configuration Module: stores reusable settings

Why This Works

It gives you:

  • clarity
  • maintainability
  • easier debugging
  • easier extension
  • a clean base for future agentic behavior

13. From LLM Apps to Agentic Apps

This course will later explore agentic development. That builds on today’s architecture.

A single-step LLM app usually does this:

  • input → prompt → model → output

An agentic app may do this:

  • input
  • reasoning/planning
  • choose an action
  • call a tool
  • inspect tool result
  • call model again
  • return final answer

That means the architecture becomes more orchestration-heavy. If your basic LLM app is already modular, it becomes much easier to evolve into an agentic system.
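The agentic steps above can be sketched as a plain-Python loop. Everything here is a stand-in: the decision dictionary format, the ask_model callable, and the tool registry are illustrative, not a real agent framework or API:

```python
def run_agent(ask_model, tools, question, max_steps=5):
    """Minimal agent loop sketch: the model either answers directly or
    names a tool; each tool result is fed back into the next step."""
    context = question
    for _ in range(max_steps):
        decision = ask_model(context)  # e.g. {"action": "answer" or a tool name, ...}
        if decision["action"] == "answer":
            return decision["content"]
        result = tools[decision["action"]](decision["input"])
        context = f"{context}\nTool result: {result}"
    return "No answer within the step limit."


# Stand-in model: asks for one lookup, then answers.
steps = iter([
    {"action": "lookup", "input": "docs"},
    {"action": "answer", "content": "Done after one tool call."},
])
answer = run_agent(
    ask_model=lambda ctx: next(steps),
    tools={"lookup": lambda q: f"results for {q}"},
    question="How do I start?",
)
print(answer)  # Done after one tool call.
```

Notice that the loop reuses the same layers as before: prompt construction feeds ask_model, output parsing produces the decision, and the tool registry is just more application logic.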


14. Suggested 45-Minute Timing

Part 1: Theory and Discussion (~20 min)

  • Why architecture matters
  • Main layers of an LLM-powered app
  • Responses API basics
  • Best practices and beginner pitfalls

Part 2: Hands-On Exercise 1 (~10 min)

  • Build a minimal summarizer CLI

Part 3: Hands-On Exercise 2 (~10 min)

  • Refactor into modular files

Part 4: Hands-On Exercise 3 + Wrap-Up (~5 min)

  • Add basic validation
  • Review key takeaways

15. Key Takeaways

  • LLMs should be one part of a broader application architecture.
  • Good architecture separates input handling, prompts, model access, and output processing.
  • The OpenAI Responses API provides a simple Python interface for model interaction.
  • Even small apps benefit from modular design.
  • Validation and post-processing improve reliability.
  • Clean architecture today makes agentic features easier tomorrow.

16. Useful Resources

  • OpenAI Responses API migration guide: https://developers.openai.com/api/docs/guides/migrate-to-responses
  • OpenAI API docs: https://platform.openai.com/docs
  • OpenAI Python SDK: https://github.com/openai/openai-python
  • Prompting guide: https://platform.openai.com/docs/guides/prompt-engineering
  • Python virtual environments: https://docs.python.org/3/tutorial/venv.html

17. Optional Homework

  1. Extend the summarizer to support three tasks:
     • summary
     • rewrite
     • explain
  2. Add a menu that lets the user choose the task.
  3. Create one prompt builder function per task.
  4. Add a simple logger that prints:
     • selected task
     • input length
     • whether output validation passed
  5. Reflect:
     • Which parts of your app are UI?
     • Which parts are orchestration?
     • Which parts are model-facing?
     • Which parts would need to change first if you added tools?

18. Session Wrap-Up

In this session, you learned the foundational architecture of an LLM-powered application. You used the OpenAI Responses API with gpt-5.4-mini, built a working Python CLI tool, and refactored it into a modular structure with clearer responsibilities.

This architectural mindset is the foundation for everything that follows in GenAI and agentic development.

