fsds/bbu-rfid-azure-deployment-auth-fix-2.md

Azure Deployment Authentication Fixes Plan

Problem Summary

Two components are failing in the Azure deployment because of authentication errors:

  1. IoT Component: Blob Storage authorization failure (403)

    • Error: "This request is not authorized to perform this operation"
    • Location: DatalogicBlobProcessor.CheckForNewBlobsAsync()
    • Attempting to call CreateIfNotExistsAsync on blob container
  2. BBU Component: PostgreSQL password authentication failure

    • Error: "password authentication failed for user 'bbu_identity-aszxig73dxx3m'"
    • Location: BbuMqttProcessor / ZebraRfidProcessor when saving to database
    • Connection to: psql-elbbudev-db.postgres.database.azure.com:5432

Investigation Findings

IoT Component - DatalogicBlobProcessor

Current Configuration:

  • Location: engines/iot/src/Acsis.Dynaplex.Engines.Iot.Abstractions/Services/DatalogicBlobProcessor.cs
  • Authentication: Uses hardcoded connection string in appsettings.json
  • Storage Account: saacsiscorpbbuatdev
  • Container: bbu-image
  • Issue: 403 Forbidden when calling CreateIfNotExistsAsync()

Root Cause:
The component has a connection string with an embedded storage account key in appsettings.json, and the 403 points to one of the following:

  1. The storage account key has been rotated/invalidated in Azure
  2. The storage account doesn't exist or has different permissions
  3. The container doesn't exist and the identity lacks permissions to create it

Architecture Details:

  • Dual authentication strategy: prioritizes connection string, falls back to injected BlobServiceClient
  • Aspire integration available but commented out: builder.AddAzureBlobServiceClient("blobs")
  • Security concern: Storage account key hardcoded in appsettings.json

BBU Component - PostgreSQL Authentication

Current Configuration:

  • Database: Azure PostgreSQL Flexible Server (PostgreSQL 18 Preview)
  • Authentication: Password-based (Entra ID disabled due to PG 18 preview limitations)
  • Connection string stored in: Key Vault secret connectionstrings--acsis
  • User: Should be admin user (e.g., pgadmin), NOT the managed identity name

Root Cause:
The error message shows: password authentication failed for user "bbu_identity-aszxig73dxx3m"

This suggests the application is trying to connect using the managed identity name as the database username, when it should be using the admin username from the Key Vault connection string.

Possible causes:

  1. Connection string in Key Vault is malformed or missing username/password
  2. Connection string is using the wrong username
  3. PostgreSQL user doesn't exist or password is incorrect
  4. Environment variable mapping issue between Key Vault secret and application
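
Causes 1 and 2 can be checked without exposing the password by inspecting only the keys of the stored connection string. The following is a minimal sketch, assuming the secret is a semicolon-delimited Npgsql-style string. The helper names conn_field and check_conn are hypothetical, the sample string (including Database=acsis) is made up, and <your-vault> is a placeholder because the vault name is not given here.

```shell
#!/bin/sh
# conn_field CONNECTION_STRING KEY
# Prints the value of KEY (matched case-insensitively) from a
# semicolon-delimited "Key=Value" connection string.
conn_field() {
  printf '%s\n' "$1" | tr ';' '\n' | awk -F= -v key="$2" '
    {
      k = $1
      gsub(/^[ \t]+|[ \t]+$/, "", k)
      if (tolower(k) == tolower(key)) {
        # Print everything after the first "=" so values containing "=" survive.
        print substr($0, index($0, "=") + 1)
        exit
      }
    }'
}

# check_conn CONNECTION_STRING
# Reports the username and whether a password is present, without printing it.
check_conn() {
  user=$(conn_field "$1" "Username")
  pass=$(conn_field "$1" "Password")
  echo "username: ${user:-<missing>}"
  if [ -n "$pass" ]; then echo "password: present"; else echo "password: <missing>"; fi
}

# Intended use against the real secret (vault name is a placeholder):
#   check_conn "$(az keyvault secret show --vault-name <your-vault> \
#     --name connectionstrings--acsis --query value -o tsv)"

# Offline example with a made-up connection string (not the real secret):
check_conn "Host=psql-elbbudev-db.postgres.database.azure.com;Port=5432;Username=pgadmin;Password=example;Database=acsis"
```

The sketch only looks for the Username key. Npgsql also accepts aliases for it, and a password containing a semicolon would be split incorrectly, so treat a "<missing>" result as a prompt to inspect the secret by hand rather than as proof.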

Architecture Details:

  • Connection string retrieved via managed identity from Key Vault
  • Managed identity bbu_identity-aszxig73dxx3m has KeyVault Secrets User role
  • Container App environment variable: ConnectionStrings__acsis
  • Uses .NET Aspire's AddAzureNpgsqlDbContext for configuration

Based on the investigation and user requirements, the fixes are scoped as follows:

  • Priority: Get services working with password auth (PG 18 preview limitation)
  • IoT: Remove hardcoded secrets, migrate to managed identity (security requirement)
  • BBU: Use standard password authentication (no Entra ID until PG 18 GA)

Problem 1: IoT Component - Blob Storage Authentication

Root Cause Confirmed:

  1. Storage account has allowSharedKeyAccess: false (connection string auth disabled)
  2. IoT managed identity has no RBAC role assignments (missing iot-roles-storage module)
  3. Hardcoded storage account key in appsettings.json (security issue)

Solution:

  1. Create iot-roles-storage.module.bicep with RBAC assignments
  2. Update main.bicep to deploy the new module
  3. Uncomment Aspire blob storage integration in IoT Program.cs
  4. Add storage reference in AppHost.cs for IoT component
  5. Remove hardcoded connection string from appsettings.json
  6. Update iot.module.bicep to pass blob endpoint as environment variable

Problem 2: BBU Component - PostgreSQL Authentication

Root Cause Confirmed:

  • Code uses AddAzureNpgsqlDbContext (line 166 in Extensions.cs)
  • This configures Npgsql for Azure Entra ID authentication with managed identity
  • Overrides connection string username with managed identity name
  • PostgreSQL has activeDirectoryAuth: 'Disabled' (password-only, PG 18 preview limitation)
  • Result: Tries to authenticate as bbu_identity-aszxig73dxx3m instead of admin user

Solution:

  1. Change AddAzureNpgsqlDbContext to AddNpgsqlDbContext in Extensions.cs
  2. This switches to standard password authentication using connection string from Key Vault
  3. Remove Azure-specific configuration that injects managed identity
  4. Keep existing Key Vault secret retrieval (already working)
  5. Once PG 18 reaches GA with Entra ID support, migrate back to the Azure variant

Implementation Plan

Step 1: Fix BBU PostgreSQL Authentication (15 minutes)

File: strata/service-defaults/src/Acsis.Dynaplex.Strata.ServiceDefaults/Extensions.cs

Change (Line 166):

// BEFORE (Azure Entra ID auth):
builder.AddAzureNpgsqlDbContext<TContext>(connectionName,
    configureDbContextOptions: options => {
        options.UseNpgsql(opts => { opts.MigrationsHistoryTable("ef_migration_history", schemaName); });
    }
);

// AFTER (Standard password auth):
builder.AddNpgsqlDbContext<TContext>(connectionName,
    configureDbContextOptions: options => {
        options.UseNpgsql(opts => { opts.MigrationsHistoryTable("ef_migration_history", schemaName); });
    }
);

Rationale:

  • AddNpgsqlDbContext uses connection string credentials directly (username/password from Key Vault)
  • AddAzureNpgsqlDbContext adds managed identity token provider that overrides username
  • PostgreSQL 18 preview doesn't support Entra ID, so password auth is required

Impact: Affects all components using AddAcsisDbContext (BBU, Catalog, CoreData, etc.)

Step 2: Fix IoT Blob Storage Authentication (45 minutes)

2.1: Create IoT Storage RBAC Module

File: projects/bbu-rfid/src/Acsis.Dynaplex.Projects.BbuRfid/infra/iot-roles-storage/iot-roles-storage.module.bicep (NEW)

Copy from bbu-roles-storage/bbu-roles-storage.module.bicep and modify:

  • Change parameter bbu_identity_outputs_principalId to iot_identity_outputs_principalId
  • Change all bbu references to iot
  • Assign the same three roles: Storage Blob Data Contributor, Storage Table Data Contributor, Storage Queue Data Contributor
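
Once the module has been deployed, the assignments can be compared against the three expected roles. This is a sketch under assumptions: missing_roles is a hypothetical helper, PRINCIPAL_ID stands in for the IoT identity's principal ID, and the role names use Azure's built-in spelling (the queue role is Storage Queue Data Contributor).

```shell
#!/bin/sh
# missing_roles: reads assigned role names on stdin (one per line) and prints
# each required role that is absent. Prints nothing when all are present.
missing_roles() {
  assigned=$(cat)
  for role in \
    "Storage Blob Data Contributor" \
    "Storage Table Data Contributor" \
    "Storage Queue Data Contributor"
  do
    printf '%s\n' "$assigned" | grep -Fxq "$role" || echo "$role"
  done
}

# Intended use after deployment (PRINCIPAL_ID is a placeholder):
#   az role assignment list --assignee "$PRINCIPAL_ID" --all \
#     --query "[].roleDefinitionName" -o tsv | missing_roles

# Offline example: only the blob role has been assigned so far,
# so the table and queue roles are reported as missing.
printf 'Storage Blob Data Contributor\n' | missing_roles
```

Empty output means all three roles are present. Role assignments can take a few minutes to propagate, so a 403 immediately after deployment does not necessarily mean the module failed.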

2.2: Update Main Infrastructure

File: projects/bbu-rfid/src/Acsis.Dynaplex.Projects.BbuRfid/infra/main.bicep

Add after line where bbu_roles_storage module is defined:

module iot_roles_storage './iot-roles-storage/iot-roles-storage.module.bicep' = {
  name: 'iot-roles-storage'
  scope: rg
  params: {
    storage_outputs_name: storage.outputs.name
    iot_identity_outputs_principalId: iot_identity.outputs.principalId
  }
}

Update IoT module call to pass blob endpoint:

module iot './iot/iot.module.bicep' = {
  // ... existing name and scope ...
  params: {
    // ... existing params ...
    storage_outputs_blobEndpoint: storage.outputs.blobEndpoint
  }
}

2.3: Update IoT Module

File: projects/bbu-rfid/src/Acsis.Dynaplex.Projects.BbuRfid/infra/iot/iot.module.bicep

Add parameter:

param storage_outputs_blobEndpoint string

Add to container environment variables:

{
  name: 'ConnectionStrings__blobs'
  value: storage_outputs_blobEndpoint
}
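
.NET configuration maps the double underscore in ConnectionStrings__blobs to a colon, so the value surfaces as ConnectionStrings:blobs, which is the "blobs" name passed to AddAzureBlobServiceClient. For managed identity to be used, the value should be the bare blob endpoint URL rather than a key-based connection string. A minimal sketch for sanity-checking the value follows; classify_blob_setting is a hypothetical helper and the sample values are made up.

```shell
#!/bin/sh
# classify_blob_setting VALUE
# Prints "shared-key" if VALUE embeds an account key, "endpoint" if it is a
# bare https:// service URL, and "unknown" otherwise.
classify_blob_setting() {
  case "$1" in
    *AccountKey=*) echo "shared-key" ;;
    https://*)     echo "endpoint" ;;
    *)             echo "unknown" ;;
  esac
}

# Made-up sample values, not the real settings:
classify_blob_setting "https://example.blob.core.windows.net/"
classify_blob_setting "DefaultEndpointsProtocol=https;AccountName=example;AccountKey=REDACTED"
```

Only "endpoint" is expected to work against a storage account with allowSharedKeyAccess: false.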

2.4: Update IoT Program.cs

File: engines/iot/src/Acsis.Dynaplex.Engines.Iot/Program.cs

Uncomment lines ~30-32 (Azure Blob Storage client):

if(!isOpenApiBuild) {
    builder.AddAzureBlobServiceClient("blobs");
}

2.5: Remove Hardcoded Connection String

File: engines/iot/src/Acsis.Dynaplex.Engines.Iot/appsettings.json

Remove line 98 or set to null:

"BlobStorageConnectionString": null,

Security Note: The exposed storage account key should be rotated in Azure after this fix is deployed.
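
A quick scan can confirm that no storage account key remains in the settings file after the edit. This is a sketch only: find_storage_keys is a hypothetical helper, the pattern covers the AccountKey=... form alone, and the demonstration runs on a throwaway file rather than the real appsettings.json.

```shell
#!/bin/sh
# find_storage_keys FILE...
# Lists lines containing an embedded storage account key (AccountKey=<value>).
# Exit status 0 means at least one match was found; 1 means none.
find_storage_keys() {
  grep -n -E 'AccountKey=[A-Za-z0-9+/=]+' "$@"
}

# Demonstration on a temporary file that mimics the cleaned-up setting.
tmp=$(mktemp)
printf '{\n  "BlobStorageConnectionString": null\n}\n' > "$tmp"
if find_storage_keys "$tmp"; then
  echo "hardcoded key found"
else
  echo "no hardcoded key"
fi
rm -f "$tmp"
```

To check the real file, pass engines/iot/src/Acsis.Dynaplex.Engines.Iot/appsettings.json as the argument. A clean result covers only the current file contents. The key remains in source control history, which is why rotation is still required.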

2.6: Update AppHost (if needed)

File: projects/bbu-rfid/src/Acsis.Dynaplex.Projects.BbuRfid/AppHost.cs

Verify that the IoT component has a storage reference (UseDynaplex and AddDynaplexStorage should add it). Add one explicitly only if it is missing.

Step 3: Deploy and Test (15 minutes)

  1. Build solution to ensure no compilation errors
  2. Run cleanup script: ./scripts/cleanup-bicep-outputs.sh
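  3. Provision and deploy: azd up (runs azd provision, then azd deploy; azd deploy alone pushes application code and may leave the new role-assignment module in main.bicep unapplied)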
  4. Monitor logs:
    • BBU: az containerapp logs show --name ca-elbbudev-bbu-com --resource-group rg-elbbudev --tail 50
    • IoT: az containerapp logs show --name ca-elbbudev-iot-com --resource-group rg-elbbudev --tail 50
  5. Verify success:
    • BBU: Should connect to PostgreSQL successfully, no more "bbu_identity-aszxig73dxx3m" errors
    • IoT: Should access blob storage with managed identity, no more 403 errors
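
The log check in steps 4 and 5 can be made repeatable by piping the log output through a small filter that looks for the two error signatures quoted in the Problem Summary. A minimal sketch, assuming the error text is unchanged; scan_logs is a hypothetical helper and the sample log line is made up.

```shell
#!/bin/sh
# scan_logs: reads log text on stdin and reports whether either known failure
# signature appears. Returns 0 when the logs are clean, 1 otherwise.
scan_logs() {
  logs=$(cat)
  rc=0
  if printf '%s\n' "$logs" | grep -Fq "password authentication failed"; then
    echo "FAIL: PostgreSQL password authentication error still present"
    rc=1
  fi
  if printf '%s\n' "$logs" | grep -Fq "not authorized to perform this operation"; then
    echo "FAIL: Blob Storage authorization error still present"
    rc=1
  fi
  [ "$rc" -eq 0 ] && echo "OK: no known authentication errors"
  return "$rc"
}

# Intended use (command taken from step 4):
#   az containerapp logs show --name ca-elbbudev-bbu-com \
#     --resource-group rg-elbbudev --tail 50 | scan_logs

# Offline example with a made-up log line:
printf 'Saved 12 tag reads to database\n' | scan_logs
```

A clean result over the last 50 lines only shows that the errors have not recurred recently. It is worth re-running after the processors have handled at least one message or blob.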

Files to Modify

Critical Files (Must Change):

  1. strata/service-defaults/src/Acsis.Dynaplex.Strata.ServiceDefaults/Extensions.cs (BBU fix, line 166)
  2. projects/bbu-rfid/src/Acsis.Dynaplex.Projects.BbuRfid/infra/iot-roles-storage/iot-roles-storage.module.bicep (NEW - IoT RBAC)
  3. projects/bbu-rfid/src/Acsis.Dynaplex.Projects.BbuRfid/infra/main.bicep (Add IoT storage module)
  4. projects/bbu-rfid/src/Acsis.Dynaplex.Projects.BbuRfid/infra/iot/iot.module.bicep (Add blob endpoint env var)
  5. engines/iot/src/Acsis.Dynaplex.Engines.Iot/Program.cs (Uncomment Aspire blob client)
  6. engines/iot/src/Acsis.Dynaplex.Engines.Iot/appsettings.json (Remove hardcoded connection string)

Reference Files (For Pattern Guidance):

  • projects/bbu-rfid/src/Acsis.Dynaplex.Projects.BbuRfid/infra/bbu-roles-storage/bbu-roles-storage.module.bicep (Copy for IoT)
  • projects/bbu-rfid/src/Acsis.Dynaplex.Projects.BbuRfid/infra/bbu/bbu.module.bicep (Reference for env var pattern)

Risk Assessment

Low Risk:

  • BBU PostgreSQL fix: Simple method name change, standard pattern
  • IoT RBAC module: Copying existing working pattern

Medium Risk:

  • Removing hardcoded connection string: Ensure Aspire integration is properly configured
  • Extensions.cs is shared: the change affects every service that calls AddAcsisDbContext (acceptable because all of them currently use password auth)

Mitigation:

  • Test locally first if possible
  • Deploy during maintenance window
  • Monitor logs immediately after deployment
  • Have rollback plan (revert Extensions.cs change if needed)

Success Criteria

  • BBU component connects to PostgreSQL successfully
  • No more errors about "bbu_identity-aszxig73dxx3m" in BBU logs
  • IoT component accesses blob storage successfully
  • No more 403 errors in IoT logs
  • No hardcoded secrets in appsettings.json
  • All RBAC permissions properly assigned
  • Build completes successfully
  • All services start without errors

Future Improvements

  1. When PostgreSQL 18 reaches GA with Entra ID support:

    • Revert Extensions.cs to use AddAzureNpgsqlDbContext
    • Update Bicep: activeDirectoryAuth: 'Enabled'
    • Migrate from password to managed identity authentication
    • Remove password from Key Vault
  2. Storage Account Key Rotation:

    • After deploying this fix, rotate the exposed storage account key
    • Verify no services use connection string authentication
    • Consider Azure Policy to enforce allowSharedKeyAccess: false