how-to/troubleshoot-dynaplex.md
title: Troubleshooting Guide
Overview
This guide provides diagnostic and troubleshooting tips for the Dynaplex system. It covers:
- Service Health - Check if services are running correctly
- Log Analysis - Parse and analyze service logs
- Connection Issues - Debug database, MQTT, and HTTP connections
- Performance - Identify slow operations and bottlenecks
- Error Tracking - Find and analyze errors in production
Common Issues Quick Reference
Issue 1: Service Won't Start
Symptoms:
- Service fails to start in Aspire
- Container app shows "Provisioning" state
- Health checks failing
Common causes:
- Missing configuration: Check appsettings.json and environment variables
- Database not ready: Ensure db-manager completed migrations
- Port conflict: Check if port is already in use
- Dependency not started: Check service dependencies in Aspire
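A quick way to rule out the "missing configuration" cause is to check that the environment variables a service expects are actually set before it starts. The sketch below is only an illustration: `check_required_env` is a hypothetical helper, and the variable names you pass to it depend on your own services.

```shell
# Print each named environment variable that is unset or empty.
# Returns 0 when every variable is set, 1 otherwise.
# The body runs in a subshell, so its variables do not leak into the caller.
check_required_env() (
  missing=0
  for name in "$@"; do
    # printenv exits non-zero when the variable is not in the environment
    value=$(printenv "$name") || value=""
    if [ -z "$value" ]; then
      echo "MISSING: $name"
      missing=1
    fi
  done
  return "$missing"
)
```

For example, `check_required_env ConnectionStrings__Default Mqtt__Host` would report which of those two (hypothetical) settings is absent. The double underscore is how .NET maps environment variables onto nested configuration keys.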
Issue 2: Database Connection Errors
Symptoms:
- "Unable to connect to database" errors
- Timeout exceptions
- NullReferenceException in data access
Common causes:
- Wrong connection string format (SQL Server vs PostgreSQL)
- Database server not running
- Firewall blocking connection
- SSL/TLS certificate issues
- Service lifetime error (scoped service in singleton)
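To check the first two causes, it helps to pull individual fields out of the connection string. PostgreSQL (Npgsql) strings typically use keys such as Host, Port and Username, while SQL Server strings typically use Server and User Id, so a missing Host is a hint that the format is wrong. The helper below is a minimal sketch; `conn_field` is a hypothetical name, and it assumes values do not contain semicolons.

```shell
# Print the value of one key from a semicolon-separated connection string.
# $1 = key (matched case-insensitively), $2 = connection string.
# Prints nothing if the key is absent. Values containing ';' are not supported.
conn_field() {
  printf '%s\n' "$2" | tr ';' '\n' | awk -F= -v key="$1" '
    { k = $1; gsub(/^[ \t]+|[ \t]+$/, "", k) }          # trim spaces around the key
    tolower(k) == tolower(key) { sub(/^[^=]*=/, ""); print; exit }'
}
```

With the host and port extracted, the PostgreSQL client tool `pg_isready -h <host> -p <port>` can confirm whether the server is accepting connections, assuming the PostgreSQL client utilities are installed.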
Issue 3: Service Lifetime Error
Error: Cannot consume scoped service 'DbContext' from singleton 'IHostedService'
Root Cause: A hosted service is registered as a singleton, so it cannot take a scoped service such as DbContext as a constructor dependency
Fix:
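// Change from (scoped DbContext injected into a singleton hosted service):
public MyProcessor(ILogger<MyProcessor> logger, DbContext db) { }
// To (inject IServiceProvider and keep it in a field):
private readonly IServiceProvider _serviceProvider;
public MyProcessor(ILogger<MyProcessor> logger, IServiceProvider serviceProvider) {
    _serviceProvider = serviceProvider;
}
// Inside the processing method, create a scope per unit of work
// and resolve the scoped service from it:
await using var scope = _serviceProvider.CreateAsyncScope();
var db = scope.ServiceProvider.GetRequiredService<DbContext>();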
Issue 4: PostgreSQL DateTime Error
Error: Cannot write DateTimeOffset with Offset=-04:00:00 to PostgreSQL
Root Cause: Writing a DateTimeOffset with a non-UTC offset to a PostgreSQL timestamp column. Npgsql (6.0 and later) accepts only offset 0 (UTC)
Fix:
// Change property type:
public DateTime Timestamp { get; set; } // Not DateTimeOffset
// Convert when setting:
entity.Timestamp = source.UtcDateTime;
entity.CreatedAt = DateTime.UtcNow;
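Before applying the fix, it can be useful to list every property that is still declared as DateTimeOffset. The following is a rough sketch using a plain-text search; `find_datetimeoffset_props` is a hypothetical helper, and it only matches public properties written on a single line.

```shell
# List C# property declarations typed as DateTimeOffset (or DateTimeOffset?).
# $1 = directory to search; defaults to the current directory.
# Output format is grep's usual path:line:text.
find_datetimeoffset_props() {
  grep -rnE --include='*.cs' \
    'public[[:space:]]+DateTimeOffset\??[[:space:]]+[A-Za-z_]' "${1:-.}"
}
```

Each hit is a candidate for the property type change described above. Properties that are never persisted to PostgreSQL can be left alone.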
Issue 5: MQTT Session Takeover
Symptoms: Repeated connect/disconnect cycle
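Root Cause: Multiple instances connect with the same client ID. An MQTT broker allows only one connection per client ID, so each new connection disconnects the previous one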
Fix:
{
"Mqtt": {
"AppendUniqueIdToClientId": true
}
}
Issue 6: Duplicate Message Processing
Symptoms: Same message processed by multiple instances
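Root Cause: Shared subscriptions are not in use, so each instance holds its own subscription and the broker delivers a copy of every message to each one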
Fix:
{
"Mqtt": {
"SharedSubscriptionGroup": "my-processors"
}
}
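For background, MQTT 5 shared subscriptions use a topic filter of the form `$share/<group>/<topic filter>`, and the broker delivers each message to only one member of the group. The sketch below shows how such a filter is composed. It is an assumption that the SharedSubscriptionGroup setting produces this format internally, and `shared_topic` is a hypothetical helper.

```shell
# Build an MQTT 5 shared-subscription topic filter.
# $1 = share group name, $2 = ordinary topic filter.
shared_topic() {
  # Single quotes keep the shell from expanding $share
  printf '$share/%s/%s\n' "$1" "$2"
}
```

For instance, `shared_topic my-processors 'sensors/+/data'` yields a filter you can compare against the subscriptions shown in your broker's admin tooling. The topic `sensors/+/data` is only an example.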
Issue 7: Messages Lost During Disconnect
Symptoms: Messages not delivered after reconnection
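Root Cause: Using CleanSession=true, which tells the broker to discard the session when the client disconnects. With CleanSession=false the broker keeps the subscriptions and queues QoS 1 and QoS 2 messages while the client is offline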
Fix:
{
"Mqtt": {
"CleanSession": false
}
}
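The three MQTT settings from Issues 5 to 7 can be checked together in a deployed settings file. The sketch below uses a plain-text match rather than a JSON parser, so it assumes one key per line and does not verify that the keys sit inside the "Mqtt" section. Both function names are hypothetical.

```shell
# Print the raw JSON value of the first line that sets a key.
# $1 = settings file, $2 = key name.
json_raw_value() {
  sed -n "s/.*\"$2\"[[:space:]]*:[[:space:]]*\([^,}]*\).*/\1/p" "$1" \
    | head -n 1 | sed 's/[[:space:]]*$//'
}

# Report the three MQTT settings discussed above, one per line.
mqtt_config_report() {
  for key in AppendUniqueIdToClientId SharedSubscriptionGroup CleanSession; do
    value=$(json_raw_value "$1" "$key")
    echo "$key=${value:-<not set>}"
  done
}
```

Running `mqtt_config_report appsettings.json` prints each key with its current value, or `<not set>` when the key does not appear in the file.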
Issue 8: Port Binding Conflicts
Error: listen tcp [::1]:5000: bind: address already in use
Root Cause: Hardcoded port in launchSettings.json or AppHost
Fix:
- Remove applicationUrl from launchSettings.json
- Remove targetPort from AppHost registration
- Let Aspire assign ports dynamically
Common Port Conflicts:
- Port 5000: macOS AirPlay Receiver
- Port 8080: Common development servers
- Port 3000: Node.js development servers
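To find out what is holding a port, you can extract the port number from the bind error and ask the operating system which process is listening on it. This is a minimal sketch: both function names are hypothetical, and `port_owner` assumes `lsof` is installed (it ships with macOS and is available on most Linux distributions).

```shell
# Extract the port number from a "bind: address already in use" error line.
port_from_bind_error() {
  printf '%s\n' "$1" \
    | sed -n 's/.*:\([0-9][0-9]*\): bind: address already in use.*/\1/p'
}

# Show the process listening on a TCP port.
# -n and -P skip host and port name lookups; -sTCP:LISTEN limits output to listeners.
port_owner() {
  lsof -nP -iTCP:"$1" -sTCP:LISTEN
}
```

For the error shown above, `port_from_bind_error` prints `5000`, and `port_owner 5000` then lists the owning process, if any.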
Issue 9: Slow Performance
Symptoms:
- API requests taking too long
- High CPU/memory usage
- Database query timeouts
Common causes:
- Missing database indexes
- N+1 query problems
- Large result sets loaded into memory
- Synchronous I/O blocking threads
- Inefficient LINQ queries
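An N+1 problem usually shows up as the same statement repeated many times with different parameter values. Assuming you have captured the executed SQL into a text file with one statement per line (how you capture it depends on your logging setup), the sketch below groups statements that differ only in numeric literals. `repeated_queries` is a hypothetical helper.

```shell
# Count SQL statements that differ only by numeric literals.
# $1 = file with one statement per line.
# Prints "count<TAB>normalized statement" for statements seen more than once,
# most frequent first. Digits inside identifiers are normalized too.
repeated_queries() {
  sed 's/[0-9][0-9]*/?/g' "$1" | sort | uniq -c | sort -rn \
    | awk '$1 > 1 { n = $1; sub(/^ *[0-9]+ /, ""); print n "\t" $0 }'
}
```

A statement that appears once per row of a parent query is a strong candidate for eager loading or a single batched query.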
Issue 10: Build Errors After Refactoring
Common Causes:
- Missing using statements
- Namespace changes not updated
- Project references broken
- Missing NuGet packages
Systematic Fix:
- Check using statements
- Verify namespaces match folders
- Validate project references
- Restore NuGet packages
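The "namespaces match folders" step can be partly automated. The sketch below flags files whose first namespace declaration does not end with the name of the containing folder. It is a heuristic that assumes the common convention of one namespace segment per folder, and `namespace_mismatches` is a hypothetical helper.

```shell
# Report .cs files whose namespace does not end with their folder name.
# $1 = root directory; defaults to the current directory.
# Build output under bin/ and obj/ is skipped.
namespace_mismatches() (
  root=${1:-.}
  find "$root" -type f -name '*.cs' ! -path '*/obj/*' ! -path '*/bin/*' \
    | while IFS= read -r file; do
        # First "namespace X.Y" line, with or without a trailing ';' or '{'
        ns=$(sed -n 's/^namespace[[:space:]]\{1,\}\([A-Za-z0-9_.]*\).*/\1/p' "$file" | head -n 1)
        [ -n "$ns" ] || continue
        dir=$(basename "$(dirname "$file")")
        case "$ns" in
          "$dir"|*."$dir") ;;   # exact match or last segment matches
          *) echo "$file: namespace $ns does not end with folder $dir" ;;
        esac
      done
)
```

Files that sit directly in a project root may be reported when the project folder name differs from the root namespace, so treat the output as a list to review, not as errors.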
Troubleshooting Workflow
Step 1: Identify the Problem
- What is the symptom?
- When did it start?
- Is it consistent or intermittent?
- Which service(s) are affected?
Step 2: Gather Information
- Check service health status
- Review recent logs
- Check distributed traces
- Look for related errors
Step 3: Form Hypothesis
- What could cause this symptom?
- Have there been recent changes?
- Are there any patterns?
Step 4: Test Hypothesis
- Check configuration
- Test connections
- Review code changes
- Run diagnostic scripts
Step 5: Apply Fix
- Make minimal changes
- Test in development first
- Deploy to staging
- Monitor after deployment
Step 6: Verify Resolution
- Check service health
- Monitor logs for errors
- Run smoke tests
- Document the fix
Quick Diagnostic Commands
# Show build errors only
dotnet build --no-restore 2>&1 | grep "error"
# Find project references (search from the current directory)
grep -r "ProjectReference" --include="*.csproj" .
# List namespace declarations to check consistency
grep -r "^namespace" --include="*.cs" . | sort
# Show the Mqtt section of the config
grep -A 20 '"Mqtt"' appsettings.json
Tips for Effective Diagnostics
- Start with service health: Check if all services are running
- Use distributed tracing: Follow request path through system
- Correlate logs: Use operation IDs to connect related logs
- Check recent changes: What changed before the issue started?
- Reproduce locally: Try to reproduce in development
- Narrow the scope: Isolate which service has the problem
- Read error messages: They often contain the root cause
- Check configuration: Many issues are config-related
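The "correlate logs" tip can be applied with a small helper when logs are available as files. The sketch below assumes each log line starts with an ISO-8601 timestamp and contains the operation ID as plain text. Both the log layout and the `correlate_logs` name are assumptions for illustration.

```shell
# Print every line mentioning an operation ID across the given log files.
# $1 = operation ID, remaining arguments = log files.
# Lines are sorted as text, which is chronological when each line
# starts with an ISO-8601 timestamp.
correlate_logs() {
  id=$1
  shift
  grep -hF -- "$id" "$@" | sort
}
```

For example, `correlate_logs abc123 api.log orders.log` merges the matching lines from both services into one timeline for that (hypothetical) operation ID.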