adrs/057-host-build-docker-compose.md

ADR 057: Host-Build Docker Compose Architecture

Status

Accepted

Context

The Docker Compose deployment builds each of the 17 .NET engine services independently inside Docker. Each docker compose build <engine> invocation:

  1. Transfers the entire repository (~1GB+) as build context
  2. Runs dotnet restore independently, resolving the same NuGet packages 17 times
  3. Compiles all shared Strata dependencies (ServiceDefaults, Core, Http, Abstractions, ApiClients) from scratch
  4. Runs dotnet publish for a single project

As a result, the shared libraries are compiled 17 times instead of once, and builds exhaust system memory and take 10+ minutes even with the batching strategy introduced in compose-build.sh. Each Docker build also requires the full .NET SDK image (2GB+), and independent Docker builds cannot share incremental build state.

Additionally, Docker Compose's depends_on directive, by default, only waits for a container to start, not for the service inside it to be ready. As a result, db-manager attempted to connect to PostgreSQL before PostgreSQL was accepting connections, and had to be restarted manually.

Decision

Host-Build Strategy

Build all .NET projects once on the host, then use a single minimal Dockerfile that only COPYs the pre-built output into a runtime image. No custom build script is needed -- standard dotnet publish and docker compose build commands work directly.

| Aspect | Before | After |
| --- | --- | --- |
| Compilation | dotnet publish inside Docker, 17 times | dotnet publish on host, once per project (shared compilation) |
| Shared deps | Each build compiles Strata from scratch | MSBuild incremental build -- shared deps compile once |
| Docker image | .NET SDK image (2GB+) | ASP.NET runtime image (~200MB) |
| Build time | 10+ minutes, memory exhaustion | Host publish ~2 min, Docker builds sub-second |
| Batching | Required (3 at a time) | Not needed -- Docker builds are trivial COPY |
| Script required | compose-build.sh with batching | None -- standard dotnet/docker commands |

New workflow:

# 1. Publish all .NET projects on the host (MSBuild shares compiled deps)
dotnet publish acsis-core.slnx -c Release /p:UseAppHost=false

# 2. Docker builds just COPY the pre-built output (~50MB per engine)
docker compose build

# 3. Start
docker compose up -d

New Dockerfile.runtime:

FROM mcr.microsoft.com/dotnet/aspnet:10.0
WORKDIR /app
EXPOSE 8080
ARG ASSEMBLY_NAME
COPY . .
ENV DOTNET_ASSEMBLY=${ASSEMBLY_NAME}
ENTRYPOINT ["sh", "-c", "exec dotnet /app/${DOTNET_ASSEMBLY}"]
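To make the role of ASSEMBLY_NAME concrete, the following is a minimal sketch of building one runtime image by hand, outside of docker compose. The function name build_runtime_image, the DOCKER override variable, and the example project path and tag are hypothetical; the sketch also assumes that ASSEMBLY_NAME is the published .dll file name and that the command runs from the repository root.

```shell
# build_runtime_image PROJECT_DIR ASSEMBLY_DLL IMAGE_TAG
# Hypothetical helper: builds one runtime image from a project's publish output.
build_runtime_image() {
  project_dir=$1
  assembly=$2
  tag=$3
  # DOCKER may be overridden (for example DOCKER=echo) to preview the command
  # instead of running it.
  "${DOCKER:-docker}" build \
    --file docker/Dockerfile.runtime \
    --build-arg "ASSEMBLY_NAME=$assembly" \
    --tag "$tag" \
    "$project_dir/bin/Release/net10.0/publish"
}
```

For example, DOCKER=echo build_runtime_image src/Foo Foo.dll foo:dev prints the docker build arguments without invoking Docker. The build context is the publish directory itself, which is why COPY . . in the Dockerfile picks up only the pre-built output.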

The orchestration code (ConfigureDockerComposeBuild) now sets:

  • Context = each project's default publish output (bin/Release/net10.0/publish/)
  • Dockerfile = docker/Dockerfile.runtime (minimal runtime image)

This means docker compose build reads directly from each project's standard publish output directory. No custom output paths, no mapping script.
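Because the build context is the publish directory, docker compose build can only succeed for projects that have already been published. A pre-flight check along these lines could catch a forgotten dotnet publish; it is a sketch only, and the function name missing_publish_dirs and the "at least one .dll" criterion are assumptions rather than part of the actual tooling.

```shell
# missing_publish_dirs PROJECT_DIR...
# Hypothetical check: prints each project directory whose default publish
# output (bin/Release/net10.0/publish) contains no .dll files.
missing_publish_dirs() {
  for project_dir in "$@"; do
    publish_dir="$project_dir/bin/Release/net10.0/publish"
    found=no
    # If the glob matches nothing it stays literal, so test for existence.
    for dll in "$publish_dir"/*.dll; do
      if [ -e "$dll" ]; then
        found=yes
        break
      fi
    done
    if [ "$found" = no ]; then
      printf '%s\n' "$project_dir"
    fi
  done
}
```

An empty result means every listed project has publish output to copy; any printed path points at a project that still needs dotnet publish.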

PostgreSQL Healthcheck

The PostgreSQL service now has a Docker healthcheck based on pg_isready. This enables depends_on with condition: service_healthy, so db-manager starts only once PostgreSQL is actually accepting connections. The db-manager service also gets restart: on-failure as a safety net.
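Conceptually, a healthcheck is a probe command that is retried until it succeeds. The sketch below reproduces that polling behaviour in plain shell, which can be handy when waiting for PostgreSQL from a script outside Compose. The function name wait_until_ready and its argument order are hypothetical, and the retry count and interval are illustrative rather than the values used in the actual healthcheck.

```shell
# wait_until_ready RETRIES INTERVAL_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it exits 0, at most RETRIES times.
# Returns 0 as soon as the probe succeeds, 1 if every attempt fails.
wait_until_ready() {
  retries=$1
  interval=$2
  shift 2
  attempt=1
  while :; do
    if "$@" > /dev/null 2>&1; then
      return 0
    fi
    if [ "$attempt" -ge "$retries" ]; then
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$interval"
  done
}
```

A call such as wait_until_ready 30 1 pg_isready -h localhost -p 5432 would poll once per second for up to 30 attempts, assuming PostgreSQL is reachable on that host and port.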

Consequences

Positive

  • Docker builds complete in seconds instead of 10+ minutes
  • No memory exhaustion during Docker builds -- COMPOSE_PARALLEL_LIMIT is no longer needed
  • No custom build script needed -- standard dotnet/docker commands
  • Runtime images are ~200MB instead of 2GB+ (no SDK layer)
  • MSBuild incremental compilation means rebuilding a single engine after a code change is fast
  • PostgreSQL healthcheck prevents db-manager startup failures
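The fast single-engine rebuild mentioned above could look roughly like the following sketch. The function name rebuild_engine, the DOTNET and DOCKER override variables, and the example project path and service name are hypothetical; the publish flags are the ones used in the workflow for the full solution.

```shell
# rebuild_engine PROJECT_FILE COMPOSE_SERVICE
# Hypothetical helper: republishes one project on the host, rebuilds its
# image, and restarts its container. Stops at the first failing step.
rebuild_engine() {
  project=$1
  service=$2
  # DOTNET and DOCKER may be overridden (for example with echo) for a dry run.
  "${DOTNET:-dotnet}" publish "$project" -c Release /p:UseAppHost=false &&
    "${DOCKER:-docker}" compose build "$service" &&
    "${DOCKER:-docker}" compose up -d "$service"
}
```

Running DOTNET=echo DOCKER=echo rebuild_engine src/Foo/Foo.csproj foo prints the three commands without executing them, which is a cheap way to confirm the project path and service name before a real run.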

Negative

  • Requires .NET SDK installed on the host machine (already the case for development)
  • dotnet publish must be run before docker compose build (two commands instead of one)
  • Dockerfile.engine is retained as a fallback for CI/CD environments where host-build isn't practical

Neutral

  • UI Dockerfile unchanged (Node.js build stays in Docker -- single build, not affected)
  • Superset Dockerfile unchanged
  • All orchestration for Azure publish mode unchanged
  • All orchestration for Aspire run mode unchanged

Files Changed

  • Created: docker/Dockerfile.runtime -- minimal runtime-only Dockerfile
  • Modified: DynaplexEngineExtensions.cs -- ConfigureDockerComposeBuild uses default publish output context
  • Modified: DynaplexInfrastructureExtensions.cs -- same change for db-manager, plus postgres healthcheck and db-manager restart policy
  • Removed: docker/compose-build.sh -- no longer needed (standard commands suffice)
  • Modified: docker/compose-init.sh -- removed COMPOSE_PARALLEL_LIMIT (no longer needed)
  • Retained: docker/Dockerfile.engine -- kept as fallback for CI/CD