Session 3: Conflict Resolution and Coordination Strategies
Synopsis
Introduces methods for reconciling competing outputs, selecting between proposals, and coordinating sequential or parallel work. Learners see how governance and arbitration become essential in multi-agent environments.
Session Content
Session 3: Conflict Resolution and Coordination Strategies
Session Overview
Duration: ~45 minutes
Audience: Python developers with basic programming knowledge, learning GenAI and agentic development
Focus: Understanding how multiple agents coordinate, how conflicts emerge, and how to implement practical conflict resolution strategies using the OpenAI Responses API and gpt-5.4-mini
Learning Objectives
By the end of this session, learners will be able to:
- Explain why conflicts occur in multi-agent and agentic systems
- Identify common coordination failures such as contradiction, duplication, deadlock, and goal misalignment
- Apply conflict resolution strategies including prioritization, voting, arbitration, and rule-based reconciliation
- Implement a simple Python-based coordinator that compares agent outputs and resolves disagreements
- Build a practical orchestration workflow using the OpenAI Responses API
1. Why Conflict Resolution Matters in Agentic Systems
As soon as multiple agents collaborate, disagreement becomes normal rather than exceptional. In an agentic workflow, one agent may generate a plan, another may review it, and a third may optimize for time, cost, or safety. These agents can produce incompatible recommendations.
Common Sources of Conflict
- Different objectives
  - One agent optimizes for speed
  - Another optimizes for quality
  - Another optimizes for compliance or safety
- Different context windows
  - Agents may see different subsets of information
  - One agent may miss a critical constraint
- Prompt framing differences
  - Slight differences in instructions can lead to contradictory outputs
- Ambiguity in task ownership
  - Multiple agents may solve the same subproblem in incompatible ways
- Stale state
  - An agent may reason over outdated information while others use updated state
Examples of Coordination Failures
- Contradiction: Agent A says “approve the refund,” Agent B says “deny the refund”
- Duplication: Two agents perform the same task unnecessarily
- Deadlock: Each agent waits for another to decide
- Priority inversion: A low-priority optimization overrides a critical safety rule
- Hallucinated consensus: A coordinator assumes agreement where none exists
2. Coordination Patterns in Multi-Agent Systems
Before resolving conflicts, it helps to understand common coordination patterns.
A. Centralized Coordinator
A single orchestrator collects outputs from agents and makes a final decision.
Pros - Easier to debug - Clear decision authority - Good for production workflows
Cons - Single point of failure - Coordinator quality strongly affects system performance
B. Peer-to-Peer Negotiation
Agents communicate directly and attempt to reconcile differences.
Pros - Flexible - Closer to distributed systems patterns
Cons - Harder to control - Can become expensive or unstable
C. Hierarchical Delegation
A parent agent delegates to specialized child agents, then integrates results.
Pros - Natural task decomposition - Clear responsibility boundaries
Cons - Requires good task design - Parent may still face conflicting recommendations
D. Voting or Ensemble Decision
Several agents independently solve a problem and the system chooses a majority or weighted result.
Pros - Useful for robustness - Reduces dependence on one output
Cons - Majority can still be wrong - Hard to apply when outputs are open-ended
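The voting pattern above can be sketched in a few lines of plain Python, assuming each agent has already produced a short candidate answer in the same format:

```python
from collections import Counter

def majority_vote(candidates: list[str]) -> str:
    """Return the most common candidate answer.

    Ties are broken by first occurrence, which is how Counter
    orders items with equal counts.
    """
    counts = Counter(candidates)
    winner, _ = counts.most_common(1)[0]
    return winner

# Three agents answer the same yes/no style question
votes = ["approve", "deny", "approve"]
print(majority_vote(votes))  # approve
```

Note how this only works because the outputs share a format; for open-ended text, voting needs a normalization step first, which is exactly the weakness listed above.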
3. Practical Conflict Resolution Strategies
3.1 Rule-Based Prioritization
Use explicit rules to determine which output wins.
Examples: - Safety overrides cost - Compliance overrides convenience - User instruction overrides stylistic preferences
This is often the most practical strategy in production.
3.2 Scoring and Ranking
Assign each proposal a score across dimensions such as: - correctness - cost - latency - safety - alignment with user intent
Then choose the highest total score or the best constrained option.
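A minimal scoring sketch, assuming each proposal has already been rated between 0 and 1 on each dimension; the weights here are illustrative, not a recommendation:

```python
def score_proposal(ratings: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted sum of per-dimension ratings (each assumed in 0..1)."""
    return sum(weights[d] * ratings.get(d, 0.0) for d in weights)

# Illustrative weights: correctness and safety dominate
weights = {"correctness": 0.4, "safety": 0.3, "cost": 0.2, "latency": 0.1}

proposals = {
    "speed":   {"correctness": 0.7, "safety": 0.6, "cost": 0.9, "latency": 0.9},
    "quality": {"correctness": 0.9, "safety": 0.9, "cost": 0.5, "latency": 0.4},
}

# Pick the proposal with the highest weighted score
best = max(proposals, key=lambda name: score_proposal(proposals[name], weights))
print(best)  # quality
```

Where the ratings come from is the hard part in practice: they can be produced by heuristics, validators, or a separate scoring agent.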
3.3 Arbitration Agent
Use a separate “arbiter” agent to compare competing proposals and choose one.
This is useful when the conflict requires reasoning rather than static rules.
3.4 Voting
Applicable when multiple agents produce candidate answers in similar formats.
Common methods: - simple majority - weighted majority - confidence-based voting
3.5 Merge-and-Rewrite
Instead of choosing one output, synthesize a unified solution.
Best when: - each proposal is partially correct - tradeoffs can be balanced - a final polished result is needed
3.6 Escalation
If conflict cannot be resolved confidently: - ask a human - request more information - re-run with tighter prompts - trigger a fallback policy
4. Designing a Coordinator
A practical coordinator usually performs these steps:
- Collect outputs from specialized agents
- Normalize them into a consistent structure
- Detect disagreement
- Apply decision policy
- Produce final output and decision rationale
- Log the process for observability
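The steps above can be condensed into a coordinator skeleton. This is a sketch with placeholder inputs: the agent outputs are assumed to arrive already normalized as a name-to-recommendation mapping, and the policy is passed in as a function:

```python
def coordinate(outputs: dict[str, str], policy) -> dict:
    """Minimal coordinator: detect disagreement, apply a policy, record the result."""
    # Steps 1-2: outputs arrive already normalized as {agent_name: recommendation}
    # Step 3: detect disagreement (here: any two recommendations differ)
    distinct = set(outputs.values())
    conflict = len(distinct) > 1
    # Step 4: apply the decision policy only when a conflict exists
    decision = policy(outputs) if conflict else next(iter(distinct))
    # Steps 5-6: return the final output plus a rationale record for logging
    return {
        "conflict": conflict,
        "decision": decision,
        "rationale": f"{len(outputs)} agents, {len(distinct)} distinct recommendations",
    }

# Toy policy: prefer the quality agent whenever there is disagreement
result = coordinate(
    {"speed": "ship now", "quality": "validate first"},
    policy=lambda outs: outs["quality"],
)
print(result["decision"])  # validate first
```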
Recommended Output Structure for Agents
To coordinate effectively, ask each agent to return structured fields such as:
- recommendation
- reasoning_summary
- priority
- risks
- confidence
This makes comparison easier than using free-form text.
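One way to represent those fields on the coordinator side is a small dataclass; the field names simply mirror the list above:

```python
from dataclasses import dataclass

@dataclass
class AgentProposal:
    recommendation: str
    reasoning_summary: str
    priority: str     # e.g. "speed" or "quality"
    risks: str
    confidence: str   # e.g. "Low", "Medium", "High"

p = AgentProposal(
    recommendation="Validate data before reporting",
    reasoning_summary="Executives will rely on the numbers",
    priority="quality",
    risks="Possible delay",
    confidence="High",
)
print(p.priority)  # quality
```

Comparing two `AgentProposal` objects field by field is far easier than diffing free-form paragraphs.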
5. Hands-On Exercise 1: Compare Two Agents with Different Priorities
Goal
Create two specialized agents: - a speed-focused agent - a quality-focused agent
Then compare their recommendations for the same task.
What You Will Learn
- How different prompts create divergent recommendations
- How to call the OpenAI Responses API using Python
- How to inspect outputs before reconciliation
Setup
Install dependencies:
pip install openai python-dotenv
Create a .env file:
OPENAI_API_KEY=your_api_key_here
Python Code
"""
Exercise 1: Comparing two specialized agent recommendations.
This script uses the OpenAI Responses API with gpt-5.4-mini to generate
two different task execution recommendations:
1. A speed-focused recommendation
2. A quality-focused recommendation
The goal is to observe how agent specialization creates conflict.
"""
from openai import OpenAI
from dotenv import load_dotenv
import os
# Load environment variables from .env
load_dotenv()
# Initialize the OpenAI client
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
# Shared task for both agents
TASK = """
A customer asks for a summary report of sales performance by region.
The report is needed today, but executives will use it for an important decision.
How should the task be handled?
"""
def run_agent(system_prompt: str, user_prompt: str) -> str:
    """
    Calls the OpenAI Responses API and returns the model's text output.

    Args:
        system_prompt: Instructions defining the agent's behavior
        user_prompt: The actual task input

    Returns:
        The model's response as text
    """
    response = client.responses.create(
        model="gpt-5.4-mini",
        input=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    )
    return response.output_text
# Define specialized agents
speed_agent_prompt = """
You are a speed-optimized operations agent.
Prioritize fast delivery, minimal process overhead, and immediate action.
Return your answer in this format:
Recommendation: ...
Reasoning Summary: ...
Priority: speed
Risks: ...
Confidence: ...
"""
quality_agent_prompt = """
You are a quality-optimized operations agent.
Prioritize accuracy, validation, and decision-grade output quality.
Return your answer in this format:
Recommendation: ...
Reasoning Summary: ...
Priority: quality
Risks: ...
Confidence: ...
"""
# Run both agents
speed_result = run_agent(speed_agent_prompt, TASK)
quality_result = run_agent(quality_agent_prompt, TASK)
# Print results for comparison
print("=" * 80)
print("SPEED AGENT OUTPUT")
print("=" * 80)
print(speed_result)
print("\n" + "=" * 80)
print("QUALITY AGENT OUTPUT")
print("=" * 80)
print(quality_result)
Example Output
================================================================================
SPEED AGENT OUTPUT
================================================================================
Recommendation: Produce a same-day summary using available sales data and clearly mark it as a preliminary report.
Reasoning Summary: Executives need the report today, so immediate delivery is more valuable than waiting for full validation.
Priority: speed
Risks: Some regional figures may contain unverified discrepancies.
Confidence: High
================================================================================
QUALITY AGENT OUTPUT
================================================================================
Recommendation: Validate regional data sources before producing the report, and deliver a decision-grade summary with a short explanation of methodology.
Reasoning Summary: Because executives will make an important decision using the report, accuracy is critical.
Priority: quality
Risks: Delivery may be delayed if validation reveals data inconsistencies.
Confidence: High
Discussion Prompts
- What is the conflict between the two outputs?
- Which recommendation should be preferred in a high-stakes business setting?
- Can both be partially right?
6. Detecting Conflict Programmatically
To resolve disagreements, we first need to detect them.
Signs of Conflict
- Opposite actions: “ship now” vs “validate first”
- Different priorities: speed vs quality
- Different risk tolerances
- Incompatible next steps
In practice, conflict detection can be: - simple keyword/rule-based logic - structured field comparison - LLM-based semantic arbitration
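Structured field comparison is the simplest of the three. Assuming both agents follow the `Priority: ...` format from Exercise 1, a conflict check can be written like this:

```python
import re
from typing import Optional

def extract_priority(output: str) -> Optional[str]:
    """Pull the value of the 'Priority:' line from an agent's text output."""
    match = re.search(r"^Priority:\s*(.+)$", output, flags=re.MULTILINE)
    return match.group(1).strip() if match else None

def priorities_conflict(output_a: str, output_b: str) -> bool:
    """Flag a conflict when both agents declare different priorities."""
    pa, pb = extract_priority(output_a), extract_priority(output_b)
    return pa is not None and pb is not None and pa != pb

a = "Recommendation: ship now\nPriority: speed\nConfidence: High"
b = "Recommendation: validate first\nPriority: quality\nConfidence: High"
print(priorities_conflict(a, b))  # True
```

This only catches conflicts the format can express; semantically opposite recommendations with the same declared priority would need keyword rules or LLM-based arbitration.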
7. Hands-On Exercise 2: Build a Rule-Based Conflict Resolver
Goal
Build a Python coordinator that: - gets agent outputs - checks for priority conflict - applies a simple rule: - if the task is high-stakes, quality wins - otherwise, speed wins
What You Will Learn
- How to orchestrate multiple model calls
- How to implement a deterministic resolution policy
- Why explicit rules are valuable in production systems
Python Code
"""
Exercise 2: Rule-based conflict resolution.
This script:
1. Queries two specialized agents
2. Uses a simple coordinator policy
3. Resolves the disagreement deterministically
The policy is:
- If the task is high-stakes, prefer the quality-focused output
- Otherwise, prefer the speed-focused output
"""
from openai import OpenAI
from dotenv import load_dotenv
import os
load_dotenv()
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
TASK = """
A customer asks for a summary report of sales performance by region.
The report is needed today, but executives will use it for an important decision.
How should the task be handled?
"""
def run_agent(system_prompt: str, user_prompt: str) -> str:
    """
    Execute one agent prompt and return the text response.
    """
    response = client.responses.create(
        model="gpt-5.4-mini",
        input=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    )
    return response.output_text
def is_high_stakes(task_text: str) -> bool:
    """
    Very simple rule-based classifier for task criticality.

    Args:
        task_text: The task description

    Returns:
        True if high-stakes language is detected, else False
    """
    high_stakes_keywords = [
        "important decision",
        "executives",
        "compliance",
        "legal",
        "safety",
        "financial",
        "medical",
    ]
    lowered = task_text.lower()
    return any(keyword in lowered for keyword in high_stakes_keywords)
def resolve_conflict(speed_output: str, quality_output: str, task_text: str) -> str:
    """
    Resolve conflict using a deterministic policy.

    Args:
        speed_output: Output from the speed-focused agent
        quality_output: Output from the quality-focused agent
        task_text: Original task description

    Returns:
        The selected final recommendation
    """
    if is_high_stakes(task_text):
        return quality_output
    return speed_output
speed_agent_prompt = """
You are a speed-optimized operations agent.
Prioritize fast delivery, minimal process overhead, and immediate action.
Return your answer in this format:
Recommendation: ...
Reasoning Summary: ...
Priority: speed
Risks: ...
Confidence: ...
"""
quality_agent_prompt = """
You are a quality-optimized operations agent.
Prioritize accuracy, validation, and decision-grade output quality.
Return your answer in this format:
Recommendation: ...
Reasoning Summary: ...
Priority: quality
Risks: ...
Confidence: ...
"""
# Run both agents
speed_output = run_agent(speed_agent_prompt, TASK)
quality_output = run_agent(quality_agent_prompt, TASK)
# Resolve with rule-based coordinator
final_decision = resolve_conflict(speed_output, quality_output, TASK)
print("=" * 80)
print("TASK")
print("=" * 80)
print(TASK.strip())
print("\n" + "=" * 80)
print("FINAL DECISION")
print("=" * 80)
print(final_decision)
Example Output
================================================================================
TASK
================================================================================
A customer asks for a summary report of sales performance by region.
The report is needed today, but executives will use it for an important decision.
How should the task be handled?
================================================================================
FINAL DECISION
================================================================================
Recommendation: Validate regional data sources before producing the report, and deliver a decision-grade summary with a short explanation of methodology.
Reasoning Summary: Because executives will make an important decision using the report, accuracy is critical.
Priority: quality
Risks: Delivery may be delayed if validation reveals data inconsistencies.
Confidence: High
Extension Ideas
- Add more rules for compliance, cost, and urgency
- Parse agent outputs into dictionaries for easier comparison
- Store decisions in logs for auditing
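The second extension idea, parsing agent outputs into dictionaries, could look like the sketch below. It assumes agents follow the `Field: value` line format used in the exercises:

```python
def parse_agent_output(text: str) -> dict[str, str]:
    """Parse 'Field: value' lines into a dictionary with lowercase keys."""
    fields: dict[str, str] = {}
    for line in text.strip().splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            fields[key.strip().lower()] = value.strip()
    return fields

sample = """Recommendation: Validate data first
Priority: quality
Confidence: High"""

parsed = parse_agent_output(sample)
print(parsed["priority"])  # quality
```

With parsed dictionaries, the coordinator can compare `priority` or `confidence` fields directly instead of matching raw text.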
8. Arbitration with an LLM
Static rules are great, but some disagreements require nuanced reasoning.
An arbiter agent can: - read all candidate proposals - compare tradeoffs - choose the best option - explain why
This is especially useful when: - outputs are semantically different - there are multiple competing goals - rigid rules are too simplistic
9. Hands-On Exercise 3: Add an Arbiter Agent
Goal
Use a third LLM call as an arbitration agent that reviews two competing proposals and produces a final coordinated recommendation.
What You Will Learn
- How to build a three-agent pattern
- How to use an LLM as a decision-maker
- How to request a structured final resolution
Python Code
"""
Exercise 3: LLM-based arbitration.
This script:
1. Generates two competing recommendations
2. Sends both to an arbitration agent
3. Produces a final coordinated recommendation
This pattern is useful when conflict resolution requires judgment
rather than simple deterministic rules.
"""
from openai import OpenAI
from dotenv import load_dotenv
import os
load_dotenv()
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
TASK = """
A customer asks for a summary report of sales performance by region.
The report is needed today, but executives will use it for an important decision.
How should the task be handled?
"""
def run_agent(system_prompt: str, user_prompt: str) -> str:
    """
    Execute an agent with a system prompt and user prompt.
    """
    response = client.responses.create(
        model="gpt-5.4-mini",
        input=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    )
    return response.output_text
speed_agent_prompt = """
You are a speed-optimized operations agent.
Prioritize fast delivery, minimal process overhead, and immediate action.
Return your answer in this format:
Recommendation: ...
Reasoning Summary: ...
Priority: speed
Risks: ...
Confidence: ...
"""
quality_agent_prompt = """
You are a quality-optimized operations agent.
Prioritize accuracy, validation, and decision-grade output quality.
Return your answer in this format:
Recommendation: ...
Reasoning Summary: ...
Priority: quality
Risks: ...
Confidence: ...
"""
arbiter_prompt = """
You are an arbitration agent for a multi-agent system.
Your job:
- Compare the two proposals
- Identify the central conflict
- Choose the best option or merge them
- Prefer safer and higher-quality outcomes when stakes are high
- Return a concise final decision
Return your answer in this format:
Conflict Detected: ...
Decision: ...
Why: ...
Final Recommendation: ...
"""
# Step 1: Generate candidate outputs
speed_output = run_agent(speed_agent_prompt, TASK)
quality_output = run_agent(quality_agent_prompt, TASK)
# Step 2: Ask the arbiter to resolve the conflict
arbiter_input = f"""
Task:
{TASK}
Proposal A:
{speed_output}
Proposal B:
{quality_output}
"""
final_resolution = run_agent(arbiter_prompt, arbiter_input)
print("=" * 80)
print("PROPOSAL A: SPEED AGENT")
print("=" * 80)
print(speed_output)
print("\n" + "=" * 80)
print("PROPOSAL B: QUALITY AGENT")
print("=" * 80)
print(quality_output)
print("\n" + "=" * 80)
print("ARBITER DECISION")
print("=" * 80)
print(final_resolution)
Example Output
================================================================================
PROPOSAL A: SPEED AGENT
================================================================================
Recommendation: Produce a same-day summary using available sales data and clearly mark it as a preliminary report.
Reasoning Summary: Executives need the report today, so immediate delivery is more valuable than waiting for full validation.
Priority: speed
Risks: Some regional figures may contain unverified discrepancies.
Confidence: High
================================================================================
PROPOSAL B: QUALITY AGENT
================================================================================
Recommendation: Validate regional data sources before producing the report, and deliver a decision-grade summary with a short explanation of methodology.
Reasoning Summary: Because executives will make an important decision using the report, accuracy is critical.
Priority: quality
Risks: Delivery may be delayed if validation reveals data inconsistencies.
Confidence: High
================================================================================
ARBITER DECISION
================================================================================
Conflict Detected: The speed-focused proposal prioritizes immediacy, while the quality-focused proposal prioritizes accuracy for a high-stakes executive decision.
Decision: Merge both approaches with a staged delivery plan.
Why: Executives need timely visibility, but the decision context requires validated numbers before final use.
Final Recommendation: Deliver a clearly labeled preliminary summary today, then follow up with a validated decision-grade report as soon as checks are complete.
Reflection Questions
- When is an arbiter better than a fixed rule?
- What are the risks of relying on another LLM for conflict resolution?
- How could you verify the arbiter’s decision?
10. Best Practices for Conflict Resolution in Agentic Systems
A. Make Agent Roles Explicit
Poorly defined roles create overlap and contradiction.
Instead of: - “Help solve the problem”
Use: - “Optimize for cost” - “Review for compliance” - “Verify data quality”
B. Require Structured Outputs
Structured outputs reduce ambiguity and simplify downstream logic.
Useful fields: - recommendation - assumptions - risks - confidence - unresolved questions
C. Define Resolution Policies Early
Do not wait until production incidents to decide: - what overrides what - when human escalation is required - what constitutes acceptable disagreement
D. Log All Decisions
For debugging and trust, store: - task input - each agent’s output - chosen policy - final decision - reason for resolution
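A minimal logging sketch using only the standard library; the record fields mirror the list above, and the file name is illustrative:

```python
import json
import time

def log_decision(path: str, record: dict) -> None:
    """Append one decision record as a JSON line for later auditing."""
    record = {"timestamp": time.time(), **record}
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

log_decision("decisions.jsonl", {
    "task": "regional sales report",
    "agent_outputs": {"speed": "ship now", "quality": "validate first"},
    "policy": "quality-wins-when-high-stakes",
    "final_decision": "validate first",
    "rationale": "executive decision marked high-stakes",
})
```

The JSON Lines format (one object per line) keeps appends cheap and lets audit tooling stream the log without loading it all at once.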
E. Use LLM Arbitration Carefully
LLM arbiters are flexible but not infallible. Consider combining: - deterministic guards - validation checks - arbitration - human review for sensitive cases
11. Mini Design Activity
Scenario
You are building a multi-agent customer support workflow with these agents:
- Policy Agent: checks refund rules
- Empathy Agent: drafts a supportive response
- Fraud Agent: identifies suspicious behavior
- Resolution Agent: decides what to do
Task
Design a conflict resolution policy for the following situation:
- Policy Agent says the refund is allowed
- Fraud Agent says the request is suspicious
- Empathy Agent drafts a message promising a refund
Questions to Answer
- Which agent should have the highest authority?
- Should the system approve, deny, or escalate?
- What message should be sent to the customer?
- What logs should be stored for auditability?
Suggested Answer Outline
- Fraud concerns should trigger escalation or manual review
- Policy eligibility alone should not override fraud risk
- Customer messaging should avoid promising a refund prematurely
- Decision logs should include evidence, rationale, and next steps
12. Session Summary
In this session, you learned that conflict is a natural part of multi-agent systems. Rather than avoiding it, good agentic design plans for it explicitly.
Key Takeaways
- Multiple agents often disagree because they optimize for different goals
- Central coordinators simplify control and observability
- Rule-based policies are practical, stable, and easy to audit
- Arbitration agents help when conflict requires nuanced judgment
- Structured outputs and clear authority rules are essential for reliable coordination
13. Useful Resources
- OpenAI Responses API migration guide: https://developers.openai.com/api/docs/guides/migrate-to-responses
- OpenAI API docs: https://platform.openai.com/docs
- OpenAI Python SDK: https://github.com/openai/openai-python
- Prompt engineering guide: https://platform.openai.com/docs/guides/prompt-engineering
- Python dotenv: https://pypi.org/project/python-dotenv/
14. Optional Homework
Homework Task
Build a small multi-agent coordinator for a content publishing workflow with these agents:
- SEO Agent
- Editorial Quality Agent
- Brand Voice Agent
- Coordinator
Requirements
- Use gpt-5.4-mini
- Use the OpenAI Responses API
- Have each agent return structured text
- Detect at least one conflict
- Resolve it using either:
- a rule-based method, or
- an arbiter agent
- Print the final publishing recommendation
Stretch Goal
Log the agent outputs and final decision to a JSON file for review.