Session 1: Deploying GenAI Applications
Synopsis
Introduces common deployment options such as web services, internal tools, async workers, and event-driven pipelines. Learners understand how architecture choices affect scalability, maintainability, and user experience.
Session Content
Session Overview
Duration: ~45 minutes
Audience: Python developers with basic programming knowledge
Goal: Learn how to package, configure, and deploy a simple GenAI application using the OpenAI Python SDK and the Responses API.
Learning Objectives
By the end of this session, learners will be able to:
- Understand the core components of a deployable GenAI application
- Configure environment variables securely for API-based applications
- Build a simple Python GenAI app using the OpenAI Responses API
- Prepare the app for deployment with dependency management and configuration
- Run the app locally in a production-like way
- Apply basic deployment best practices for cloud or container environments
Agenda
- What deployment means for GenAI apps
- Core deployment architecture
- Project structure and environment management
- Building a deployable GenAI app with the Responses API
- Hands-on exercise: local deployment-ready app
- Deployment patterns and best practices
- Wrap-up and resources
1. What Deployment Means for GenAI Apps
Deployment is the process of taking an application from development on your machine to an environment where other users, systems, or services can reliably access it.
For GenAI applications, deployment includes more than just running Python code. It usually involves:
- Application code
- API credentials and configuration
- Dependency management
- Runtime environment
- Logging and error handling
- Scaling and monitoring considerations
Typical GenAI Deployment Flow
A deployed GenAI app often follows this path:
- A user or client sends a request
- Your Python application receives and validates the input
- Your app calls an LLM through an API
- The model returns a response
- Your app formats and returns the output
- Logs and metrics are captured for debugging and monitoring
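The flow above can be sketched as a minimal pipeline. Here `call_model` is a stand-in for the real API call built later in this session; the function names are illustrative, not part of any SDK.

```python
def validate_input(text: str) -> str:
    """Reject bad input before spending an API call."""
    text = text.strip()
    if not text:
        raise ValueError("Input must not be empty.")
    return text


def call_model(prompt: str) -> str:
    """Stand-in for the real LLM call (see section 4)."""
    return f"[model output for: {prompt}]"


def format_output(raw: str) -> str:
    """Normalize the model response before returning it to the caller."""
    return raw.strip()


def handle_request(text: str) -> str:
    """One request through the full flow: validate, call, format."""
    return format_output(call_model(validate_input(text)))
```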
Why Deployment Matters
A notebook demo is not the same as a deployable application. Real deployments require:
- Repeatable setup
- Secure key handling
- Stable dependencies
- Predictable input/output behavior
- Error handling for API/network issues
2. Core Deployment Architecture
A minimal deployable GenAI application usually has the following parts:
Application Layer
This is your Python code that:
- Accepts input
- Constructs prompts
- Calls the OpenAI API
- Returns responses
Configuration Layer
This includes:
- API keys
- Model selection
- Runtime settings
- Environment-specific values
These should be stored in environment variables, not hard-coded in source files.
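For example, a configuration module can read everything from the environment at import time (the variable names below are illustrative):

```python
import os

# Read configuration from the environment; never hard-code secrets.
# The model and timeout have safe defaults; the API key does not.
API_KEY = os.getenv("OPENAI_API_KEY")
MODEL = os.getenv("OPENAI_MODEL", "gpt-5.4-mini")
TIMEOUT_SECONDS = float(os.getenv("REQUEST_TIMEOUT", "30"))
```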
Dependency Layer
Your project should clearly define required packages, typically in:
- requirements.txt
- pyproject.toml
Runtime Layer
This is where your code runs:
- Local machine
- Virtual machine
- Container
- Serverless platform
- Managed app platform
3. Project Structure and Environment Management
A simple deployment-ready Python project can look like this:
genai-deploy-app/
├── app.py
├── requirements.txt
├── .env
├── .env.example
└── README.md
Recommended Files
app.py
Main application code.
requirements.txt
Dependency list for reproducible installs.
.env
Local environment variables. Do not commit this file to version control.
.env.example
A template showing required environment variables without secrets.
Example:
OPENAI_API_KEY=your_api_key_here
OPENAI_MODEL=gpt-5.4-mini
Installing Dependencies
pip install openai python-dotenv
Example requirements.txt:
openai>=1.0.0
python-dotenv>=1.0.0
Loading Environment Variables
Using python-dotenv is convenient for local development.
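A minimal loading pattern looks like this; the ImportError fallback is an optional defensive touch so the same code runs in production environments where python-dotenv is not installed and variables are injected by the platform:

```python
import os

try:
    # Convenient for local development: reads key=value pairs from .env
    from dotenv import load_dotenv
    load_dotenv()
except ImportError:
    # In production, environment variables are usually injected by the
    # platform, so the app still works without python-dotenv.
    pass

api_key = os.getenv("OPENAI_API_KEY")
model = os.getenv("OPENAI_MODEL", "gpt-5.4-mini")
```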
4. Building a Deployable GenAI App with the Responses API
In this section, we build a small command-line app that summarizes user-provided text.
Application Requirements
The app should:
- Load configuration from environment variables
- Accept user input from the terminal
- Call the OpenAI Responses API
- Print the generated result
- Handle common errors gracefully
5. Hands-On Exercise: Local Deployment-Ready App
Exercise Goal
Create and run a simple Python application that uses gpt-5.4-mini through the OpenAI Responses API.
Step 1: Create the Project
Create a folder named genai-deploy-app and add the following files.
requirements.txt
openai>=1.0.0
python-dotenv>=1.0.0
.env.example
OPENAI_API_KEY=your_api_key_here
OPENAI_MODEL=gpt-5.4-mini
.env
OPENAI_API_KEY=your_real_api_key_here
OPENAI_MODEL=gpt-5.4-mini
app.py
"""
A simple deployment-ready GenAI CLI application.
This script:
1. Loads configuration from environment variables
2. Accepts user input from the terminal
3. Calls the OpenAI Responses API
4. Prints a summarized result
Before running:
- Install dependencies: pip install -r requirements.txt
- Set your OPENAI_API_KEY in a .env file
"""
import os
import sys
from dotenv import load_dotenv
from openai import OpenAI
def load_config():
"""
Load and validate required configuration from environment variables.
Returns a tuple: (api_key, model)
"""
load_dotenv()
api_key = os.getenv("OPENAI_API_KEY")
model = os.getenv("OPENAI_MODEL", "gpt-5.4-mini")
if not api_key:
raise ValueError(
"Missing OPENAI_API_KEY. Add it to your environment or .env file."
)
return api_key, model
def get_user_text():
"""
Get input text either from command-line arguments or interactive input.
"""
if len(sys.argv) > 1:
return " ".join(sys.argv[1:]).strip()
print("Enter text to summarize:")
return input("> ").strip()
def summarize_text(client: OpenAI, model: str, text: str) -> str:
"""
Send text to the OpenAI Responses API and return the summary.
"""
response = client.responses.create(
model=model,
input=[
{
"role": "system",
"content": [
{
"type": "input_text",
"text": (
"You are a concise assistant. Summarize the user's text "
"in 3 bullet points."
),
}
],
},
{
"role": "user",
"content": [
{
"type": "input_text",
"text": text,
}
],
},
],
)
return response.output_text
def main():
"""
Main entry point for the application.
"""
try:
api_key, model = load_config()
text = get_user_text()
if not text:
print("No input provided. Exiting.")
return
client = OpenAI(api_key=api_key)
summary = summarize_text(client, model, text)
print("\nSummary:")
print(summary)
except ValueError as config_error:
print(f"Configuration error: {config_error}")
except Exception as exc:
print(f"Application error: {exc}")
if __name__ == "__main__":
main()
Run the App
python app.py "OpenAI provides APIs for building intelligent applications. Deployment requires secure configuration, dependency management, and reliable runtime environments."
Example Output
Summary:
- OpenAI APIs enable developers to build intelligent applications.
- Deploying GenAI apps requires secure configuration management.
- Reliable dependency and runtime setup are important for production use.
Exercise Tasks
Task A
Run the app with command-line input.
Task B
Run the app without arguments and provide interactive input.
Task C
Change the system instruction to summarize in:
- 1 sentence
- 5 bullet points
- a professional executive summary
Task D
Break the app intentionally by removing OPENAI_API_KEY from .env, then observe and explain the error.
6. Extending the App for Deployment
A deployable app should be easy to operate in different environments.
Improved Version with Structured Configuration and Logging
Below is a slightly more production-friendly version.
app.py
"""
Deployment-friendly GenAI application with:
- environment-based configuration
- basic logging
- clear error handling
- modular functions
Usage:
python app.py "Explain why environment variables matter in deployment."
"""
import logging
import os
import sys
from dotenv import load_dotenv
from openai import OpenAI
# Configure basic logging. In production, you may send logs to a file or service.
logging.basicConfig(
level=logging.INFO,
format="%(asctime)s | %(levelname)s | %(message)s"
)
logger = logging.getLogger(__name__)
def load_config():
"""
Load configuration from environment variables and validate required fields.
"""
load_dotenv()
config = {
"api_key": os.getenv("OPENAI_API_KEY"),
"model": os.getenv("OPENAI_MODEL", "gpt-5.4-mini"),
}
if not config["api_key"]:
raise ValueError("OPENAI_API_KEY is not set.")
return config
def build_client(api_key: str) -> OpenAI:
"""
Create and return an OpenAI client instance.
"""
return OpenAI(api_key=api_key)
def get_input_text() -> str:
"""
Read text from command-line arguments or standard input.
"""
if len(sys.argv) > 1:
return " ".join(sys.argv[1:]).strip()
logger.info("No command-line input detected; switching to interactive mode.")
print("Enter text to summarize:")
return input("> ").strip()
def generate_summary(client: OpenAI, model: str, text: str) -> str:
"""
Generate a summary using the OpenAI Responses API.
"""
response = client.responses.create(
model=model,
input=[
{
"role": "system",
"content": [
{
"type": "input_text",
"text": (
"You are a helpful assistant. Summarize the provided text "
"clearly in 3 bullet points."
),
}
],
},
{
"role": "user",
"content": [
{
"type": "input_text",
"text": text,
}
],
},
],
)
return response.output_text
def main():
"""
Main application workflow.
"""
try:
config = load_config()
logger.info("Configuration loaded successfully.")
text = get_input_text()
if not text:
logger.warning("Empty input received. Exiting.")
print("No text provided.")
return
client = build_client(config["api_key"])
logger.info("OpenAI client initialized.")
result = generate_summary(client, config["model"], text)
logger.info("Summary generated successfully.")
print("\nGenerated Summary:\n")
print(result)
except ValueError as exc:
logger.error("Configuration validation failed: %s", exc)
print(f"Configuration error: {exc}")
except Exception as exc:
logger.exception("Unexpected application failure.")
print(f"Unexpected error: {exc}")
if __name__ == "__main__":
main()
Example Output
2026-03-22 10:00:00,000 | INFO | Configuration loaded successfully.
2026-03-22 10:00:00,100 | INFO | OpenAI client initialized.
2026-03-22 10:00:01,200 | INFO | Summary generated successfully.
Generated Summary:
- Environment variables help keep API secrets out of source code.
- Stable dependencies make deployments reproducible.
- Logging and error handling improve production reliability.
7. Packaging for Deployment
Freeze Dependencies
To ensure reproducibility:
pip freeze > requirements.txt
Be aware that freezing everything may include packages you do not need. In professional projects, keep dependencies intentional and minimal.
Add a .gitignore
.env
__pycache__/
*.pyc
venv/
Basic README Checklist
Your README.md should include:
- Project purpose
- Setup instructions
- Required environment variables
- Run commands
- Example usage
8. Deployment Patterns
Pattern 1: CLI Deployment
Useful for:
- internal automation
- scheduled scripts
- batch processing
You run the Python script directly in a server or job environment.
Pattern 2: Web API Deployment
Useful when you want:
- frontend integration
- external consumers
- service-oriented architecture
Typical stack:
- FastAPI or Flask
- OpenAI SDK
- hosted on a cloud platform
Pattern 3: Container Deployment
Useful for consistency across environments.
A simple Dockerfile might look like this:
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "app.py"]
Notes
- Do not bake secrets into the image
- Pass environment variables at runtime
- Use small base images where possible
9. Deployment Best Practices for GenAI Apps
1. Never Hard-Code Secrets
Bad:
client = OpenAI(api_key="my-secret-key")
Good:
import os
from openai import OpenAI
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
2. Validate Configuration Early
Fail fast when required settings are missing.
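A minimal fail-fast check, run once at startup (the variable list is illustrative):

```python
import os

REQUIRED_VARS = ("OPENAI_API_KEY",)


def validate_config() -> None:
    """Raise at startup if any required setting is missing."""
    missing = [name for name in REQUIRED_VARS if not os.getenv(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
```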
3. Handle API Errors Gracefully
Your app should not crash silently. Log errors and provide useful feedback.
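One generic pattern (not specific to the OpenAI SDK) is to wrap the network call in bounded retries with exponential backoff, and let the final failure surface with a clear message:

```python
import time


def call_with_retries(fn, attempts: int = 3, base_delay: float = 1.0):
    """Retry a flaky network call with exponential backoff.

    `fn` is any zero-argument callable, e.g. a lambda wrapping the API call.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == attempts:
                raise  # out of retries: let the caller log and report it
            time.sleep(base_delay * 2 ** (attempt - 1))
```

In practice you would catch the SDK's specific exception types rather than bare Exception, once you know which failures are actually retryable.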
4. Keep Prompts in Maintainable Code
As prompts become more important, consider storing them as constants or in dedicated modules.
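For example, a hypothetical prompts module keeps instruction text in one place, so it can be reviewed and changed without touching request logic:

```python
# prompts.py (hypothetical module): prompt text lives here, not in call sites.

SUMMARIZE_3_BULLETS = (
    "You are a concise assistant. Summarize the user's text "
    "in 3 bullet points."
)

SUMMARIZE_EXECUTIVE = (
    "You are a concise assistant. Rewrite the user's text "
    "as a short professional executive summary."
)
```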
5. Start Small, Then Add Monitoring
At minimum, capture:
- request time
- failures
- input type
- model used
Do not log sensitive user content unless explicitly required and permitted.
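A minimal way to capture request time, outcome, and model used without logging user content is a small wrapper like this (a sketch, not a full monitoring setup):

```python
import logging
import time

logger = logging.getLogger(__name__)


def timed_call(fn, model: str):
    """Run a model call, logging duration, model name, and outcome only."""
    start = time.perf_counter()
    status = "error"
    try:
        result = fn()
        status = "ok"
        return result
    finally:
        # Metadata only: no prompt or response text reaches the logs.
        elapsed = time.perf_counter() - start
        logger.info("model=%s status=%s duration=%.2fs", model, status, elapsed)
```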
6. Make the Runtime Reproducible
Use:
- virtual environments
- pinned dependencies
- containers when appropriate
7. Design for Change
Models, prompts, and configuration will evolve. Keep them configurable.
10. Mini Practice Challenge
Convert the summarizer into a deployment-ready "email drafting assistant."
Requirements
- Accept a short prompt such as: "Write a polite follow-up email after a job interview."
- Return:
- subject line
- greeting
- body
- closing
Starter Function
def draft_email(client, model, user_request: str) -> str:
"""
Generate a professional email draft based on the user's request.
"""
response = client.responses.create(
model=model,
input=[
{
"role": "system",
"content": [
{
"type": "input_text",
"text": (
"You are a professional email assistant. "
"Write a clear email with a subject line, greeting, "
"body, and closing."
),
}
],
},
{
"role": "user",
"content": [
{
"type": "input_text",
"text": user_request,
}
],
},
],
)
return response.output_text
Example Usage
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))  # never hard-code the key
result = draft_email(client, "gpt-5.4-mini", "Write a polite follow-up email after a job interview.")
print(result)
Example Output
Subject: Thank You for the Interview Opportunity
Dear [Interviewer Name],
Thank you for taking the time to meet with me during the interview. I appreciated the opportunity to learn more about the role and your team.
Our conversation reinforced my enthusiasm for the position, and I remain very interested in contributing to your organization.
Best regards,
[Your Name]
11. Recap
In this session, you learned how to:
- Think about deployment as more than just running code
- Structure a simple GenAI Python project
- Use environment variables for secure configuration
- Build a deployable app with the OpenAI Responses API
- Improve reliability with logging and validation
- Prepare an app for local, cloud, or container deployment
Useful Resources
- OpenAI Responses API migration guide: https://developers.openai.com/api/docs/guides/migrate-to-responses
- OpenAI API docs: https://platform.openai.com/docs
- OpenAI Python SDK: https://github.com/openai/openai-python
- python-dotenv documentation: https://pypi.org/project/python-dotenv/
- Docker documentation: https://docs.docker.com/
- FastAPI documentation: https://fastapi.tiangolo.com/
Suggested Instructor Flow
0–10 min
Introduce deployment concepts and architecture.
10–20 min
Walk through project structure, environment management, and configuration.
20–35 min
Build and run the summarizer app using the Responses API.
35–42 min
Discuss deployment patterns, Docker, and best practices.
42–45 min
Recap, Q&A, and mini challenge.
End-of-Session Checklist
- [ ] I can explain what makes a GenAI app deployable
- [ ] I can use environment variables for API credentials
- [ ] I can call gpt-5.4-mini using the OpenAI Responses API in Python
- [ ] I can create a basic deployment-ready project structure
- [ ] I understand key deployment best practices for GenAI applications