Session 1: Deploying GenAI Applications

Synopsis

Introduces common deployment options such as web services, internal tools, async workers, and event-driven pipelines. Learners understand how architecture choices affect scalability, maintainability, and user experience.

Session Content

Session Overview

Duration: ~45 minutes
Audience: Python developers with basic programming knowledge
Goal: Learn how to package, configure, and deploy a simple GenAI application using the OpenAI Python SDK and the Responses API.

Learning Objectives

By the end of this session, learners will be able to:

  • Understand the core components of a deployable GenAI application
  • Configure environment variables securely for API-based applications
  • Build a simple Python GenAI app using the OpenAI Responses API
  • Prepare the app for deployment with dependency management and configuration
  • Run the app locally in a production-like way
  • Apply basic deployment best practices for cloud or container environments

Agenda

  1. What deployment means for GenAI apps
  2. Core deployment architecture
  3. Project structure and environment management
  4. Building a deployable GenAI app with the Responses API
  5. Hands-on exercise: local deployment-ready app
  6. Extending the app for deployment
  7. Packaging for deployment
  8. Deployment patterns
  9. Deployment best practices for GenAI apps
  10. Mini practice challenge
  11. Recap and resources

1. What Deployment Means for GenAI Apps

Deployment is the process of taking an application from development on your machine to an environment where other users, systems, or services can reliably access it.

For GenAI applications, deployment includes more than just running Python code. It usually involves:

  • Application code
  • API credentials and configuration
  • Dependency management
  • Runtime environment
  • Logging and error handling
  • Scaling and monitoring considerations

Typical GenAI Deployment Flow

A deployed GenAI app often follows this path:

  1. A user or client sends a request
  2. Your Python application receives and validates the input
  3. Your app calls an LLM through an API
  4. The model returns a response
  5. Your app formats and returns the output
  6. Logs and metrics are captured for debugging and monitoring
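The flow above can be sketched as a minimal request handler. This is only an illustration of the shape of the pipeline: `call_llm` is a hypothetical stub standing in for the real API call (shown later with the OpenAI SDK).

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


def call_llm(prompt: str) -> str:
    # Placeholder for the real API call (implemented later with the OpenAI SDK).
    return f"Summary of: {prompt[:30]}..."


def handle_request(user_input: str) -> str:
    # 2. Validate the input before spending an API call on it.
    if not user_input or not user_input.strip():
        raise ValueError("Input must be a non-empty string.")

    # 3-4. Call the model, capturing timing for monitoring.
    start = time.perf_counter()
    raw_output = call_llm(user_input.strip())
    elapsed = time.perf_counter() - start

    # 6. Record metrics for debugging and monitoring.
    logger.info("LLM call completed in %.2fs", elapsed)

    # 5. Format and return the output.
    return raw_output.strip()
```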

Why Deployment Matters

A notebook demo is not the same as a deployable application. Real deployments require:

  • Repeatable setup
  • Secure key handling
  • Stable dependencies
  • Predictable input/output behavior
  • Error handling for API/network issues

2. Core Deployment Architecture

A minimal deployable GenAI application usually has the following parts:

Application Layer

This is your Python code that:

  • Accepts input
  • Constructs prompts
  • Calls the OpenAI API
  • Returns responses

Configuration Layer

This includes:

  • API keys
  • Model selection
  • Runtime settings
  • Environment-specific values

These should be stored in environment variables, not hard-coded in source files.

Dependency Layer

Your project should clearly define required packages, typically in:

  • requirements.txt
  • or pyproject.toml

Runtime Layer

This is where your code runs:

  • Local machine
  • Virtual machine
  • Container
  • Serverless platform
  • Managed app platform

3. Project Structure and Environment Management

A simple deployment-ready Python project can look like this:

genai-deploy-app/
├── app.py
├── requirements.txt
├── .env
├── .env.example
└── README.md

app.py

Main application code.

requirements.txt

Dependency list for reproducible installs.

.env

Local environment variables. Do not commit this file to version control.

.env.example

A template showing required environment variables without secrets.

Example:

OPENAI_API_KEY=your_api_key_here
OPENAI_MODEL=gpt-5.4-mini

Installing Dependencies

pip install openai python-dotenv

Example requirements.txt:

openai>=1.0.0
python-dotenv>=1.0.0

Loading Environment Variables

Using python-dotenv is convenient for local development; in production, environment variables are usually injected by the platform (container runtime, CI/CD, or hosting service) rather than read from a .env file.
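A minimal sketch of loading configuration this way: `load_dotenv()` reads key=value pairs from a local .env file into the process environment (without overwriting variables that are already set), and the import is made optional so the same code also runs where python-dotenv is not installed.

```python
import os

try:
    from dotenv import load_dotenv  # optional convenience for local development
except ImportError:  # fall back gracefully if python-dotenv is not installed
    def load_dotenv() -> bool:
        return False


def load_settings() -> dict:
    """Load settings from the environment, reading a local .env file first if present."""
    load_dotenv()
    settings = {
        "api_key": os.getenv("OPENAI_API_KEY"),
        "model": os.getenv("OPENAI_MODEL", "gpt-5.4-mini"),
    }
    if not settings["api_key"]:
        raise RuntimeError("OPENAI_API_KEY is not set.")
    return settings
```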


4. Building a Deployable GenAI App with the Responses API

In this section we define the requirements for a small command-line app that summarizes user-provided text; the hands-on exercise that follows implements it.

Application Requirements

The app should:

  • Load configuration from environment variables
  • Accept user input from the terminal
  • Call the OpenAI Responses API
  • Print the generated result
  • Handle common errors gracefully

5. Hands-On Exercise: Local Deployment-Ready App

Exercise Goal

Create and run a simple Python application that uses gpt-5.4-mini through the OpenAI Responses API.

Step 1: Create the Project

Create a folder named genai-deploy-app and add the following files.


requirements.txt

openai>=1.0.0
python-dotenv>=1.0.0

.env.example

OPENAI_API_KEY=your_api_key_here
OPENAI_MODEL=gpt-5.4-mini

.env

OPENAI_API_KEY=your_real_api_key_here
OPENAI_MODEL=gpt-5.4-mini

app.py

"""
A simple deployment-ready GenAI CLI application.

This script:
1. Loads configuration from environment variables
2. Accepts user input from the terminal
3. Calls the OpenAI Responses API
4. Prints a summarized result

Before running:
- Install dependencies: pip install -r requirements.txt
- Set your OPENAI_API_KEY in a .env file
"""

import os
import sys
from dotenv import load_dotenv
from openai import OpenAI


def load_config():
    """
    Load and validate required configuration from environment variables.
    Returns a tuple: (api_key, model)
    """
    load_dotenv()

    api_key = os.getenv("OPENAI_API_KEY")
    model = os.getenv("OPENAI_MODEL", "gpt-5.4-mini")

    if not api_key:
        raise ValueError(
            "Missing OPENAI_API_KEY. Add it to your environment or .env file."
        )

    return api_key, model


def get_user_text():
    """
    Get input text either from command-line arguments or interactive input.
    """
    if len(sys.argv) > 1:
        return " ".join(sys.argv[1:]).strip()

    print("Enter text to summarize:")
    return input("> ").strip()


def summarize_text(client: OpenAI, model: str, text: str) -> str:
    """
    Send text to the OpenAI Responses API and return the summary.
    """
    response = client.responses.create(
        model=model,
        input=[
            {
                "role": "system",
                "content": [
                    {
                        "type": "input_text",
                        "text": (
                            "You are a concise assistant. Summarize the user's text "
                            "in 3 bullet points."
                        ),
                    }
                ],
            },
            {
                "role": "user",
                "content": [
                    {
                        "type": "input_text",
                        "text": text,
                    }
                ],
            },
        ],
    )

    return response.output_text


def main():
    """
    Main entry point for the application.
    """
    try:
        api_key, model = load_config()
        text = get_user_text()

        if not text:
            print("No input provided. Exiting.")
            return

        client = OpenAI(api_key=api_key)
        summary = summarize_text(client, model, text)

        print("\nSummary:")
        print(summary)

    except ValueError as config_error:
        print(f"Configuration error: {config_error}")
    except Exception as exc:
        print(f"Application error: {exc}")


if __name__ == "__main__":
    main()

Run the App

python app.py "OpenAI provides APIs for building intelligent applications. Deployment requires secure configuration, dependency management, and reliable runtime environments."

Example Output

Summary:
- OpenAI APIs enable developers to build intelligent applications.
- Deploying GenAI apps requires secure configuration management.
- Reliable dependency and runtime setup are important for production use.

Exercise Tasks

Task A

Run the app with command-line input.

Task B

Run the app without arguments and provide interactive input.

Task C

Change the system instruction to summarize in:

  • 1 sentence
  • 5 bullet points
  • a professional executive summary

Task D

Break the app intentionally by removing OPENAI_API_KEY from .env, then observe and explain the error.


6. Extending the App for Deployment

A deployable app should be easy to operate in different environments.

Improved Version with Structured Configuration and Logging

Below is a slightly more production-friendly version.

app.py

"""
Deployment-friendly GenAI application with:
- environment-based configuration
- basic logging
- clear error handling
- modular functions

Usage:
    python app.py "Explain why environment variables matter in deployment."
"""

import logging
import os
import sys
from dotenv import load_dotenv
from openai import OpenAI


# Configure basic logging. In production, you may send logs to a file or service.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s | %(levelname)s | %(message)s"
)
logger = logging.getLogger(__name__)


def load_config():
    """
    Load configuration from environment variables and validate required fields.
    """
    load_dotenv()

    config = {
        "api_key": os.getenv("OPENAI_API_KEY"),
        "model": os.getenv("OPENAI_MODEL", "gpt-5.4-mini"),
    }

    if not config["api_key"]:
        raise ValueError("OPENAI_API_KEY is not set.")

    return config


def build_client(api_key: str) -> OpenAI:
    """
    Create and return an OpenAI client instance.
    """
    return OpenAI(api_key=api_key)


def get_input_text() -> str:
    """
    Read text from command-line arguments or standard input.
    """
    if len(sys.argv) > 1:
        return " ".join(sys.argv[1:]).strip()

    logger.info("No command-line input detected; switching to interactive mode.")
    print("Enter text to summarize:")
    return input("> ").strip()


def generate_summary(client: OpenAI, model: str, text: str) -> str:
    """
    Generate a summary using the OpenAI Responses API.
    """
    response = client.responses.create(
        model=model,
        input=[
            {
                "role": "system",
                "content": [
                    {
                        "type": "input_text",
                        "text": (
                            "You are a helpful assistant. Summarize the provided text "
                            "clearly in 3 bullet points."
                        ),
                    }
                ],
            },
            {
                "role": "user",
                "content": [
                    {
                        "type": "input_text",
                        "text": text,
                    }
                ],
            },
        ],
    )

    return response.output_text


def main():
    """
    Main application workflow.
    """
    try:
        config = load_config()
        logger.info("Configuration loaded successfully.")

        text = get_input_text()
        if not text:
            logger.warning("Empty input received. Exiting.")
            print("No text provided.")
            return

        client = build_client(config["api_key"])
        logger.info("OpenAI client initialized.")

        result = generate_summary(client, config["model"], text)
        logger.info("Summary generated successfully.")

        print("\nGenerated Summary:\n")
        print(result)

    except ValueError as exc:
        logger.error("Configuration validation failed: %s", exc)
        print(f"Configuration error: {exc}")
    except Exception as exc:
        logger.exception("Unexpected application failure.")
        print(f"Unexpected error: {exc}")


if __name__ == "__main__":
    main()

Example Output

2026-03-22 10:00:00,000 | INFO | Configuration loaded successfully.
2026-03-22 10:00:00,100 | INFO | OpenAI client initialized.
2026-03-22 10:00:01,200 | INFO | Summary generated successfully.

Generated Summary:

- Environment variables help keep API secrets out of source code.
- Stable dependencies make deployments reproducible.
- Logging and error handling improve production reliability.

7. Packaging for Deployment

Freeze Dependencies

To ensure reproducibility:

pip freeze > requirements.txt

Be aware that freezing everything may include packages you do not need. In professional projects, keep dependencies intentional and minimal.

Add a .gitignore

.env
__pycache__/
*.pyc
venv/

Basic README Checklist

Your README.md should include:

  • Project purpose
  • Setup instructions
  • Required environment variables
  • Run commands
  • Example usage

8. Deployment Patterns

Pattern 1: CLI Deployment

Useful for:

  • internal automation
  • scheduled scripts
  • batch processing

You run the Python script directly on a server or in a scheduled job environment.

Pattern 2: Web API Deployment

Useful when you want:

  • frontend integration
  • external consumers
  • service-oriented architecture

Typical stack:

  • FastAPI or Flask
  • OpenAI SDK
  • hosted on a cloud platform

Pattern 3: Container Deployment

Useful for consistency across environments.

A simple Dockerfile might look like this:

FROM python:3.11-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

CMD ["python", "app.py"]

Notes

  • Do not bake secrets into the image
  • Pass environment variables at runtime
  • Use small base images where possible
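For example, credentials can be supplied when the container starts rather than at build time (the image tag genai-deploy-app is an assumption for this sketch):

```shell
# Build the image with no secrets baked in
docker build -t genai-deploy-app .

# Pass configuration at runtime; a bare -e forwards the variable from the host
docker run --rm \
  -e OPENAI_API_KEY \
  -e OPENAI_MODEL=gpt-5.4-mini \
  genai-deploy-app
```

`docker run --env-file .env` achieves the same thing when you already maintain a local .env file.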

9. Deployment Best Practices for GenAI Apps

1. Never Hard-Code Secrets

Bad:

client = OpenAI(api_key="my-secret-key")

Good:

import os
from openai import OpenAI

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

2. Validate Configuration Early

Fail fast when required settings are missing.
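One way to fail fast is to check every required setting in a single pass at startup, so the operator sees the full list of missing names at once instead of discovering them one by one (a sketch; the variable names are this project's):

```python
import os

REQUIRED_VARS = ["OPENAI_API_KEY"]


def validate_config() -> None:
    # Collect all missing variables so one run reports every problem.
    missing = [name for name in REQUIRED_VARS if not os.getenv(name)]
    if missing:
        raise RuntimeError(
            f"Missing required environment variables: {', '.join(missing)}"
        )
```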

3. Handle API Errors Gracefully

Your app should not fail silently. Log errors and provide useful feedback to the caller.
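A common pattern is a small retry wrapper with exponential backoff for transient failures. This sketch catches a broad Exception for brevity; in real code you would narrow it to the SDK's transient error types (for example its connection and rate-limit exceptions):

```python
import logging
import time

logger = logging.getLogger(__name__)


def call_with_retries(fn, attempts: int = 3, base_delay: float = 1.0):
    """Call fn(), retrying on errors with exponential backoff.

    With the OpenAI SDK, narrow the except clause to its transient
    error classes instead of bare Exception.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception as exc:
            if attempt == attempts:
                logger.error("Giving up after %d attempts: %s", attempts, exc)
                raise
            delay = base_delay * 2 ** (attempt - 1)  # 1s, 2s, 4s, ...
            logger.warning(
                "Attempt %d failed (%s); retrying in %.1fs", attempt, exc, delay
            )
            time.sleep(delay)
```

Usage: `summary = call_with_retries(lambda: summarize_text(client, model, text))`.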

4. Keep Prompts in Maintainable Code

As prompts become more important, consider storing them as constants or in dedicated modules.
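For example, prompt text can live in one dedicated module (a hypothetical prompts.py) so it can be reviewed and changed without touching call sites; a parameterized builder also keeps small variations out of application logic:

```python
# prompts.py (hypothetical module name): all prompt text in one place.

SUMMARIZE_SYSTEM_PROMPT = (
    "You are a concise assistant. Summarize the user's text "
    "in 3 bullet points."
)

EMAIL_SYSTEM_PROMPT = (
    "You are a professional email assistant. Write a clear email "
    "with a subject line, greeting, body, and closing."
)


def build_summary_instruction(bullet_count: int = 3) -> str:
    # Parameterizing the prompt avoids scattering near-duplicates across the app.
    return (
        "You are a concise assistant. Summarize the user's text "
        f"in {bullet_count} bullet points."
    )
```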

5. Start Small, Then Add Monitoring

At minimum, capture:

  • request time
  • failures
  • input type
  • model used

Do not log sensitive user content unless explicitly required and permitted.
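A minimal way to capture these metrics is a decorator that records duration, outcome, and model used. This sketch logs one structured line per call and deliberately records only the input length, never the text itself:

```python
import functools
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


def monitored(fn):
    """Log duration, outcome, and model name for each wrapped call.

    The input text itself is not logged, only its length.
    """
    @functools.wraps(fn)
    def wrapper(client, model, text, *args, **kwargs):
        start = time.perf_counter()
        try:
            result = fn(client, model, text, *args, **kwargs)
            status = "ok"
            return result
        except Exception:
            status = "error"
            raise
        finally:
            elapsed = time.perf_counter() - start
            logger.info(
                "model=%s status=%s duration=%.2fs input_chars=%d",
                model, status, elapsed, len(text),
            )
    return wrapper
```

Applying `@monitored` to generate_summary from the earlier example would instrument it without changing its body.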

6. Make the Runtime Reproducible

Use:

  • virtual environments
  • pinned dependencies
  • containers when appropriate

7. Design for Change

Models, prompts, and configuration will evolve. Keep them configurable.
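One lightweight way to keep these configurable is a single frozen dataclass built from the environment; a sketch, where the SUMMARY_BULLETS variable is a hypothetical prompt setting, not part of the earlier examples:

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AppConfig:
    api_key: str
    model: str = "gpt-5.4-mini"
    bullet_count: int = 3  # hypothetical prompt setting

    @classmethod
    def from_env(cls) -> "AppConfig":
        # Fail fast on required values; fall back to class defaults otherwise.
        api_key = os.getenv("OPENAI_API_KEY")
        if not api_key:
            raise RuntimeError("OPENAI_API_KEY is not set.")
        return cls(
            api_key=api_key,
            model=os.getenv("OPENAI_MODEL", cls.model),
            bullet_count=int(os.getenv("SUMMARY_BULLETS", cls.bullet_count)),
        )
```

Centralizing settings this way means a model swap or prompt tweak is a configuration change, not a code change.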


10. Mini Practice Challenge

Convert the summarizer into a deployment-ready "email drafting assistant."

Requirements

  • Accept a short prompt such as:
    "Write a polite follow-up email after a job interview."
  • Return:
      • subject line
      • greeting
      • body
      • closing

Starter Function

def draft_email(client, model, user_request: str) -> str:
    """
    Generate a professional email draft based on the user's request.
    """
    response = client.responses.create(
        model=model,
        input=[
            {
                "role": "system",
                "content": [
                    {
                        "type": "input_text",
                        "text": (
                            "You are a professional email assistant. "
                            "Write a clear email with a subject line, greeting, "
                            "body, and closing."
                        ),
                    }
                ],
            },
            {
                "role": "user",
                "content": [
                    {
                        "type": "input_text",
                        "text": user_request,
                    }
                ],
            },
        ],
    )
    return response.output_text

Example Usage

import os

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
result = draft_email(client, "gpt-5.4-mini", "Write a polite follow-up email after a job interview.")
print(result)

Example Output

Subject: Thank You for the Interview Opportunity

Dear [Interviewer Name],

Thank you for taking the time to meet with me during the interview. I appreciated the opportunity to learn more about the role and your team.

Our conversation reinforced my enthusiasm for the position, and I remain very interested in contributing to your organization.

Best regards,
[Your Name]

11. Recap

In this session, you learned how to:

  • Think about deployment as more than just running code
  • Structure a simple GenAI Python project
  • Use environment variables for secure configuration
  • Build a deployable app with the OpenAI Responses API
  • Improve reliability with logging and validation
  • Prepare an app for local, cloud, or container deployment

Useful Resources

  • OpenAI Responses API migration guide: https://developers.openai.com/api/docs/guides/migrate-to-responses
  • OpenAI API docs: https://platform.openai.com/docs
  • OpenAI Python SDK: https://github.com/openai/openai-python
  • python-dotenv documentation: https://pypi.org/project/python-dotenv/
  • Docker documentation: https://docs.docker.com/
  • FastAPI documentation: https://fastapi.tiangolo.com/

Suggested Instructor Flow

0–10 min

Introduce deployment concepts and architecture.

10–20 min

Walk through project structure, environment management, and configuration.

20–35 min

Build and run the summarizer app using the Responses API.

35–42 min

Discuss deployment patterns, Docker, and best practices.

42–45 min

Recap, Q&A, and mini challenge.


End-of-Session Checklist

  • [ ] I can explain what makes a GenAI app deployable
  • [ ] I can use environment variables for API credentials
  • [ ] I can call gpt-5.4-mini using the OpenAI Responses API in Python
  • [ ] I can create a basic deployment-ready project structure
  • [ ] I understand key deployment best practices for GenAI applications
