Documentation
fsds/transport-duplicate-delivery-id-fix.md
Duplicate Delivery ID Error - Diagnosis and Fix Plan
Problem Summary
PostgreSQL error 23505: duplicate key value violates unique constraint "IX_deliveries_shipment_id_delivery_number_tenant_id" occurs during BBU component's Oracle/OBLPN processing.
Root Cause Analysis
Finding 1: Dead Code Identified
There are two methods in OracleProcessor.cs that upsert shipments:
| Method | Lines | Status | Duplicate Handling |
|---|---|---|---|
UpsertShipmentWithContextAsync |
1809-1962 | ACTIVE (called at line 1202) | Partial - has try-catch but retry is unprotected |
UpsertShipmentAsync |
2787-2960 | DEAD CODE (never called) | None |
Finding 2: Unprotected Retry in Active Code Path
In UpsertShipmentWithContextAsync, the duplicate handling has a vulnerability:
// Line 1909-1927
try {
deliveriesByOrder = await EnsureShipmentDeliveriesAsync(...);
await transportDb.SaveChangesAsync(cancellationToken); // ← First attempt
} catch(DbUpdateException ex) when (...) {
// Clear tracker, re-fetch shipment
deliveriesByOrder = await EnsureShipmentDeliveriesAsync(...);
await transportDb.SaveChangesAsync(cancellationToken); // ← RETRY HAS NO CATCH!
}
The Bug: Line 1926 (retry SaveChangesAsync) has no try-catch. If another racing process creates a new delivery between the re-fetch and retry save, the exception propagates unhandled.
Finding 3: Race Condition Scenario
- Process A: First attempt fails on delivery D1 (23505)
- Process A: Catches error, clears tracker, re-fetches shipment (now includes D1)
- Process C: Meanwhile creates delivery D2 (committed to DB)
- Process A:
EnsureShipmentDeliveriesAsyncdoesn't see D2 (not yet visible during re-fetch) - Process A: Retry creates D2, calls SaveChanges
- Boom: 23505 on D2 with no catch - exception propagates!
Finding 4: Database Evidence
Same delivery numbers across multiple shipments created within milliseconds:
delivery_number | shipment_id | status_updated_at_utc
120920253151003 | 019afae1-8213-7537-959a-ed54b57f3f03 | 2025-12-07 22:14:34.228112
120920253151003 | 019afae1-8210-71dd-bf5c-eff512ca48fe | 2025-12-07 22:14:34.220680
Fix Plan
Step 1: Remove Dead Code
File: engines/bbu/src/Acsis.Dynaplex.Engines.Bbu/Services/OracleProcessor.cs
Delete the entire UpsertShipmentAsync method (lines ~2782-2960). It's never called and just adds confusion and maintenance burden.
Step 2: Add Retry Loop to Handle Cascading Race Conditions
File: engines/bbu/src/Acsis.Dynaplex.Engines.Bbu/Services/OracleProcessor.cs
Replace the single try-catch with a retry loop that handles multiple consecutive race conditions:
Current code (lines 1905-1927):
Dictionary<string, Delivery> deliveriesByOrder;
// Handle race condition: multiple parallel processors may try to create the same delivery
// If we hit a duplicate key error, clear the tracker and retry once
try {
deliveriesByOrder = await EnsureShipmentDeliveriesAsync(transportDb, shipment!, deliveries, prismDb, cancellationToken);
await transportDb.SaveChangesAsync(cancellationToken);
} catch(DbUpdateException ex) when (ex.InnerException is PostgresException { SqlState: "23505" } pgEx
&& pgEx.ConstraintName?.Contains("deliveries") == true) {
_logger.LogWarning("Duplicate delivery detected for shipment {Oblpn}, retrying with fresh data", oblpn);
// Clear the change tracker to remove conflicting entities
transportDb.ChangeTracker.Clear();
// Re-fetch the shipment with deliveries from database
shipment = await transportDb.Shipments
.Include(s => s.Deliveries)
.FirstAsync(s => s.Id == shipment!.Id, cancellationToken);
// Retry - this time EnsureShipmentDeliveriesAsync will find the existing deliveries
deliveriesByOrder = await EnsureShipmentDeliveriesAsync(transportDb, shipment, deliveries, prismDb, cancellationToken);
await transportDb.SaveChangesAsync(cancellationToken);
}
Updated code with retry loop:
Dictionary<string, Delivery> deliveriesByOrder;
// Handle race conditions: multiple parallel processors may try to create the same delivery
// Use a retry loop to handle cascading race conditions under high parallelism
const int maxRetries = 3;
for(var attempt = 1; attempt <= maxRetries; attempt++) {
try {
deliveriesByOrder = await EnsureShipmentDeliveriesAsync(transportDb, shipment!, deliveries, prismDb, cancellationToken);
await transportDb.SaveChangesAsync(cancellationToken);
break; // Success - exit loop
} catch(DbUpdateException ex) when (
attempt < maxRetries &&
ex.InnerException is PostgresException { SqlState: "23505" } pgEx &&
pgEx.ConstraintName?.Contains("deliveries") == true) {
_logger.LogWarning(
"Duplicate delivery detected for shipment {Oblpn} (attempt {Attempt}/{MaxRetries}), retrying with fresh data",
oblpn, attempt, maxRetries);
// Clear the change tracker to remove conflicting entities
transportDb.ChangeTracker.Clear();
// Re-fetch the shipment with deliveries from database
shipment = await transportDb.Shipments
.Include(s => s.Deliveries)
.FirstAsync(s => s.Id == shipment!.Id, cancellationToken);
}
}
Files to Modify
engines/bbu/src/Acsis.Dynaplex.Engines.Bbu/Services/OracleProcessor.cs- Delete
UpsertShipmentAsyncmethod (lines ~2782-2960) - dead code cleanup - Replace try-catch with retry loop (lines ~1905-1927)
- Delete
Testing
- Run the BBU component with parallel OBLPN message processing
- Verify no more 23505 errors on deliveries
- Confirm the log shows retry attempts under high load
- Verify deliveries are created correctly without duplicates
Summary
- Root cause: The single retry in the catch block had no protection against cascading race conditions
- Dead code:
UpsertShipmentAsyncis never called and should be removed - Fix: Replace single try-catch with a retry loop (max 3 attempts) to handle multiple racing processes