Deployment Strategy

Deployment Guide

Deployment strategy, environments, CI/CD pipelines, rollback procedures
For: .NET 8 microservices + React 18 frontend + Legacy monolith coexistence


1. Environment Strategy

┌──────────┐     ┌──────────┐     ┌──────────┐     ┌──────────┐
│   Local  │────►│   Dev    │────►│ Staging  │────►│Production│
│          │     │          │     │          │     │          │
│ Docker   │     │ Shared   │     │ Prod     │     │ Live     │
│ Compose  │     │ testing  │     │ mirror   │     │ 40K users│
│ per eng  │     │ auto-    │     │ manual   │     │ canary   │
│          │     │ deploy   │     │ approval │     │ release  │
└──────────┘     └──────────┘     └──────────┘     └──────────┘

Environment | Purpose                          | Deploy Trigger               | Data                     | Who Uses
------------+----------------------------------+------------------------------+--------------------------+--------------------
Local       | Development + debugging          | Manual (docker-compose up)   | Seed data / local DB     | Engineers
Dev         | Integration testing, auto-deploy | Every merge to main          | Synthetic test data      | Engineers
Staging     | Pre-production validation        | Manual promotion from Dev    | Anonymized prod snapshot | Engineers + QA + PO
Production  | Live users                       | Manual approval from Staging | Real data, 40K users     | End users

2. CI/CD Pipeline

2.1 Pipeline Overview

┌─────────────────────────────────────────────────────────────────────┐
│                    GITHUB ACTIONS PIPELINE                           │
│                                                                     │
│  PR Created / Push to feature branch:                               │
│  ┌─────┐ ┌──────┐ ┌─────────┐ ┌──────┐ ┌──────────┐              │
│  │Build│►│ Unit │►│Contract │►│ SAST │►│ Docker   │              │
│  │     │ │ Test │ │  Test   │ │ Scan │ │ Build    │              │
│  └─────┘ └──────┘ └─────────┘ └──────┘ └──────────┘              │
│                                                                     │
│  Merge to main:                                                     │
│  ┌─────┐ ┌──────┐ ┌─────────┐ ┌──────┐ ┌──────┐ ┌─────────────┐ │
│  │Build│►│ All  │►│ Docker  │►│ Push │►│Deploy│►│ Integration │ │
│  │     │ │Tests │ │ Build   │ │ ACR  │ │ Dev  │ │ Tests (Dev) │ │
│  └─────┘ └──────┘ └─────────┘ └──────┘ └──────┘ └─────────────┘ │
│                                                                     │
│  Promote to Staging:                                                │
│  ┌───────────┐ ┌──────────┐ ┌──────────────┐ ┌──────────────────┐ │
│  │ Manual    │►│ Deploy   │►│ Smoke Tests  │►│ Load Test        │ │
│  │ Trigger   │ │ Staging  │ │ (automated)  │ │ (optional, phase)│ │
│  └───────────┘ └──────────┘ └──────────────┘ └──────────────────┘ │
│                                                                     │
│  Promote to Production:                                             │
│  ┌───────────┐ ┌──────────┐ ┌──────────────┐ ┌──────────────────┐ │
│  │ Approval  │►│ Canary   │►│ Monitor      │►│ Full Rollout     │ │
│  │ (TL/Sr)   │ │ 5%       │ │ 30 min       │ │ or Rollback      │ │
│  └───────────┘ └──────────┘ └──────────────┘ └──────────────────┘ │
└─────────────────────────────────────────────────────────────────────┘

2.2 Pipeline Definition (GitHub Actions)

# .github/workflows/service-ci.yml (per service)
name: Service CI/CD

on:
  push:
    branches: [main]
    paths: ['src/services/<service-name>/**']
  pull_request:
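    paths: ['src/services/<service-name>/**']

# Shared by every job below; the registry host matches the Bicep module in 4.2
env:
  ACR: acr.azurecr.io
  SHA: ${{ github.sha }}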

jobs:
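  build-and-test:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write  # lets CodeQL upload its results
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-dotnet@v4
        with: { dotnet-version: '8.0.x' }

      # CodeQL has to be initialised before the build so it can observe compilation
      - name: SAST Init
        uses: github/codeql-action/init@v3
        with: { languages: csharp }

      - name: Restore & Build
        run: dotnet build src/services/<service-name>

      - name: Unit Tests
        run: dotnet test src/services/<service-name>/tests/UnitTests

      - name: Contract Tests (Pact)
        run: dotnet test src/services/<service-name>/tests/ContractTests

      - name: SAST Scan
        uses: github/codeql-action/analyze@v3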

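  docker:
    needs: build-and-test
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # AZURE_CREDENTIALS is an assumed GitHub secret name (see section 8)
      - uses: azure/login@v2
        with: { creds: '${{ secrets.AZURE_CREDENTIALS }}' }

      - name: Build & Push Docker Image
        run: |
          # Registry name is the login server without its domain suffix
          az acr login --name "${ACR%%.*}"
          docker build -t "$ACR/<service-name>:$SHA" src/services/<service-name>
          docker push "$ACR/<service-name>:$SHA"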

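  deploy-dev:
    needs: docker
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4   # integration tests run from the repo
      - uses: azure/login@v2        # AZURE_CREDENTIALS: assumed secret name
        with: { creds: '${{ secrets.AZURE_CREDENTIALS }}' }

      - name: Deploy to Dev
        run: >
          az containerapp update --name <service-name>
          --resource-group rg-product-dev --image $ACR/<service-name>:$SHA

      - name: Integration Tests
        run: >
          dotnet test tests/IntegrationTests
          --filter "Environment=Dev"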

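  deploy-staging:
    needs: deploy-dev
    environment: staging  # requires manual approval
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4   # provides ./scripts/smoke-test.sh
      - uses: azure/login@v2        # AZURE_CREDENTIALS: assumed secret name
        with: { creds: '${{ secrets.AZURE_CREDENTIALS }}' }

      - name: Deploy to Staging
        run: >
          az containerapp update --name <service-name>
          --resource-group rg-product-staging --image $ACR/<service-name>:$SHA

      - name: Smoke Tests
        run: ./scripts/smoke-test.sh staging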

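  deploy-production:
    needs: deploy-staging
    environment: production  # requires TL/Senior approval
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4   # provides ./scripts/monitor-canary.sh
      - uses: azure/login@v2        # AZURE_CREDENTIALS: assumed secret name
        with: { creds: '${{ secrets.AZURE_CREDENTIALS }}' }

      # Assumes multiple-revision mode with traffic pinned to the stable
      # revision, so the new revision starts without live traffic
      - name: Create Canary Revision
        run: >
          az containerapp update --name <service-name>
          --resource-group rg-product-prod --image $ACR/<service-name>:$SHA

      # "stable" stands for the name of the revision currently serving traffic
      - name: Canary Deploy (5%)
        run: >
          az containerapp revision set-traffic
          --name <service-name> --resource-group rg-product-prod
          --revision-weight latest=5 stable=95

      - name: Monitor (wait 30 min)
        run: ./scripts/monitor-canary.sh 30

      - name: Full Rollout
        run: >
          az containerapp revision set-traffic
          --name <service-name> --resource-group rg-product-prod
          --revision-weight latest=100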
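
The contents of monitor-canary.sh are not shown in this guide. As a rough sketch of the decision such a gate could make, the Python below aggregates request and error counts collected during the 30-minute window and returns a verdict. The 5% limit is the critical error-rate threshold from section 7.2; the Sample type, the canary_verdict name, and the minimum-traffic guard are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Sample:
    """Request and error counts for one polling interval (hypothetical shape)."""
    requests: int
    errors: int


def canary_verdict(samples: List[Sample],
                   max_error_rate: float = 0.05,
                   min_requests: int = 100) -> str:
    """Return 'promote', 'rollback', or 'inconclusive' for a canary window."""
    total = sum(s.requests for s in samples)
    errors = sum(s.errors for s in samples)
    # Too little traffic says nothing about the canary either way.
    if total == 0 or total < min_requests:
        return "inconclusive"
    if errors / total > max_error_rate:
        return "rollback"
    return "promote"
```

A wrapper script could map "promote" to exit code 0 and anything else to a non-zero code, which would stop the workflow before the Full Rollout step.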

2.3 React Frontend Pipeline

# .github/workflows/frontend-ci.yml
name: Frontend CI/CD

on:
  push:
    branches: [main]
    paths: ['src/frontend/**']

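jobs:
  build:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: src/frontend  # assumes package.json lives here
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: '20' }

      - run: npm ci
      - run: npm run lint
      - run: npm run test
      - run: npm run build

      # AZURE_CREDENTIALS and <storage-account> are placeholders
      - uses: azure/login@v2
        with: { creds: '${{ secrets.AZURE_CREDENTIALS }}' }

      - name: Deploy to Azure CDN (Staging)
        run: >
          az storage blob upload-batch
          --account-name <storage-account>
          --destination '$web' --source dist/ --overwrite

      - name: Playwright E2E (critical paths)
        run: |
          npx playwright install --with-deps chromium
          npx playwright test --project=chromium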

3. Container Strategy

3.1 Dockerfile Template (per service)

# Multi-stage build
FROM mcr.microsoft.com/dotnet/sdk:8.0 AS build
WORKDIR /src
COPY ["ServiceName.API/ServiceName.API.csproj", "ServiceName.API/"]
COPY ["ServiceName.Application/ServiceName.Application.csproj", "ServiceName.Application/"]
COPY ["ServiceName.Domain/ServiceName.Domain.csproj", "ServiceName.Domain/"]
COPY ["ServiceName.Infrastructure/ServiceName.Infrastructure.csproj", "ServiceName.Infrastructure/"]
RUN dotnet restore "ServiceName.API/ServiceName.API.csproj"
COPY . .
RUN dotnet publish "ServiceName.API/ServiceName.API.csproj" \
    -c Release -o /app/publish --no-restore

FROM mcr.microsoft.com/dotnet/aspnet:8.0 AS runtime
WORKDIR /app
COPY --from=build /app/publish .

# Security: non-root user (the .NET 8 images ship a built-in "app" user)
USER $APP_UID

EXPOSE 8080
ENV ASPNETCORE_URLS=http://+:8080
ENTRYPOINT ["dotnet", "ServiceName.API.dll"]

3.2 Local Docker Compose

# docker-compose.yml (local development)
services:
  travel-service:
    build: ./src/services/travel-booking
    ports: ["5001:8080"]
    environment:
      - ConnectionStrings__Default=Server=sqlserver;Database=TravelDb;...
      - ServiceBus__ConnectionString=...
    depends_on: [sqlserver, servicebus]

  event-service:
    build: ./src/services/event-management
    ports: ["5002:8080"]
    depends_on: [sqlserver, servicebus]

  comms-service:
    build: ./src/services/communications
    ports: ["5003:8080"]
    depends_on: [sqlserver, servicebus]

  workforce-service:
    build: ./src/services/workforce
    ports: ["5004:8080"]
    depends_on: [sqlserver, servicebus]

  reporting-service:
    build: ./src/services/reporting
    ports: ["5005:8080"]
    depends_on: [sqlserver, servicebus]

  api-gateway:
    build: ./src/gateway
    ports: ["5000:8080"]
    depends_on:
      - travel-service
      - event-service
      - comms-service
      - workforce-service
      - reporting-service

  sqlserver:
    image: mcr.microsoft.com/mssql/server:2022-latest
    ports: ["1433:1433"]
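    environment:
      - ACCEPT_EULA=Y
      # Read from the gitignored .env file (see section 8) rather than hard-coded;
      # MSSQL_SA_PASSWORD replaces the deprecated SA_PASSWORD variable
      - MSSQL_SA_PASSWORD=${MSSQL_SA_PASSWORD}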

  servicebus:  # name must match the depends_on entries above
    image: mcr.microsoft.com/azure-messaging/servicebus-emulator:latest
    ports: ["5672:5672"]

  seq:
    image: datalust/seq:latest
    ports: ["5341:80"]
    environment:
      - ACCEPT_EULA=Y

4. Azure Infrastructure (IaC)

4.1 Resource Topology

Azure Subscription
├── Resource Group: rg-product-dev
│   ├── Container Apps Environment (dev)
│   │   ├── travel-service
│   │   ├── event-service
│   │   ├── workforce-service
│   │   ├── comms-service
│   │   ├── reporting-service
│   │   └── api-gateway
│   ├── Azure SQL Server (dev)
│   │   ├── travel-db
│   │   ├── event-db
│   │   ├── workforce-db
│   │   ├── comms-db
│   │   └── reporting-db
│   ├── Service Bus Namespace (dev)
│   └── Key Vault (dev)
│
├── Resource Group: rg-product-staging
│   └── (same structure, prod-like config)
│
├── Resource Group: rg-product-prod
│   └── (same structure, production scale)
│
└── Shared Resources
    ├── Azure Container Registry (ACR)
    ├── Azure Monitor / App Insights
    ├── Azure CDN (React frontend)
    └── Azure Front Door (global load balancing)

4.2 Bicep Example (Service Deploy)

// infra/modules/container-app.bicep
param serviceName string
param imageTag string
param environmentId string

resource containerApp 'Microsoft.App/containerApps@2023-05-01' = {
  name: serviceName
  location: resourceGroup().location
  // Managed identity used to resolve the Key Vault references below;
  // it needs read access to the vault's secrets
  identity: {
    type: 'SystemAssigned'
  }
  properties: {
    environmentId: environmentId
    configuration: {
      ingress: {
        external: false  // internal only, gateway handles external
        targetPort: 8080
        transport: 'http'
      }
      secrets: [
        { name: 'db-connection', keyVaultUrl: '...', identity: 'system' }
        { name: 'sb-connection', keyVaultUrl: '...', identity: 'system' }
      ]
    }
    template: {
      containers: [
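        {
          name: serviceName
          image: 'acr.azurecr.io/${serviceName}:${imageTag}'
          // Expose the secrets above under the setting names used in docker-compose
          env: [
            { name: 'ConnectionStrings__Default', secretRef: 'db-connection' }
            { name: 'ServiceBus__ConnectionString', secretRef: 'sb-connection' }
          ]
          resources: {
            cpu: json('0.5')
            memory: '1Gi'
          }
        }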
      ]
      scale: {
        minReplicas: 1
        maxReplicas: 10
        rules: [
          {
            name: 'http-scaling'
            http: { metadata: { concurrentRequests: '100' } }
          }
        ]
      }
    }
  }
}
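
To make the scale block concrete, the arithmetic behind the HTTP rule can be approximated as: one replica per 100 concurrent requests, clamped to the 1 to 10 range. The Python below is a simplified model of that calculation, not the platform's actual scaler, which also applies polling intervals and cool-down periods. The function name is hypothetical.

```python
import math


def desired_replicas(concurrent_requests: int,
                     per_replica: int = 100,
                     min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """Approximate replica count for an HTTP concurrency scale rule."""
    wanted = math.ceil(concurrent_requests / per_replica)
    # Clamp to the minReplicas/maxReplicas bounds from the Bicep module.
    return max(min_replicas, min(max_replicas, wanted))
```

Under this model 250 concurrent requests ask for 3 replicas, and anything above 1,000 concurrent requests is capped at 10, which is where per-replica load starts to grow.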

5. Deployment Types

5.1 Regular Service Deployment (2-3x per week)

Trigger: PR merged to main
Flow:    Build → Test → Docker → Dev (auto) → Staging (manual) → Prod (approval)
Risk:    Low (single service, rolling update)
Rollback: Point to previous container image (< 2 minutes)

5.2 Module Go-Live (Strangler Fig cutover)

Trigger: New service ready to replace legacy module
Flow:    
  Day 1:  Feature flag ON → 5% traffic to new service
  Day 2:  Monitor error rates, latency, business metrics
  Day 3:  25% traffic (if clean)
  Day 4:  50% traffic
  Day 5:  100% traffic
  Day 12: Legacy module decommission (after 1 week soak)

Risk:    Medium-High (full module cutover)
Rollback: YARP route change → back to legacy (< 5 minutes)
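
The ramp above can be sketched as a day-to-percentage lookup plus a routing decision. The guide does not say how the traffic split is implemented; the Python below assumes sticky, hash-based bucketing per user so that a user who reaches the new service at 5% stays on it at 25%. The function names and the bucketing scheme are illustrative assumptions.

```python
import hashlib

# Day of cutover -> percent of traffic on the new service (from the flow above).
# Day 2 is a monitoring day, so it keeps the Day 1 percentage.
RAMP = {1: 5, 3: 25, 4: 50, 5: 100}


def rollout_percent(day: int) -> int:
    """Percent of traffic sent to the new service on a given cutover day."""
    reached = [pct for start_day, pct in RAMP.items() if start_day <= day]
    return max(reached) if reached else 0


def routes_to_new_service(user_id: str, percent: int) -> bool:
    """Deterministically place a user in one of 100 buckets and compare."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # 0..99
    return bucket < percent
```

Because the bucket depends only on the user ID, raising the percentage only ever adds users to the new service, and setting it to 0 sends everyone back to legacy.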

5.3 Database Migration

Trigger: Service moving from shared DB to per-service DB
Flow:
  Week 1:  CDC running → new DB syncing from legacy
  Week 2:  Verify data integrity (automated scripts)
  Week 3:  Switch service to read from new DB (writes still to both)
  Week 4:  Switch writes to new DB only
  Week 5:  Decommission legacy DB tables (after soak)

Risk:    High (data migration)
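Rollback: Switch connection string back to legacy DB (safe through Week 3, while
          the legacy DB still receives every write; from Week 4 on, writes made
          only to the new DB must be back-filled first)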
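
The Week 2 integrity scripts are not included in this guide. One common approach is to compare a row count and a content hash per table between the two databases. The Python below sketches that idea using sqlite3 as a stand-in for the real SQL Server connections; the function names are hypothetical, and a production script would hash inside the database or in batches instead of loading whole tables into memory.

```python
import hashlib
import sqlite3
from typing import Tuple


def table_fingerprint(conn: sqlite3.Connection, table: str, key: str) -> Tuple[int, str]:
    """Return (row count, SHA-256 over all rows ordered by the key column)."""
    # table and key come from trusted migration config, never from user input.
    rows = conn.execute(f'SELECT * FROM "{table}" ORDER BY "{key}"').fetchall()
    digest = hashlib.sha256()
    for row in rows:
        digest.update(repr(row).encode("utf-8"))
    return len(rows), digest.hexdigest()


def tables_match(legacy: sqlite3.Connection, new: sqlite3.Connection,
                 table: str, key: str) -> bool:
    """True when both databases hold identical rows for the table."""
    return table_fingerprint(legacy, table, key) == table_fingerprint(new, table, key)
```

Ordering by the key makes the comparison independent of insertion order, so only real differences in content cause a mismatch.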

6. Rollback Procedures

6.1 Quick Reference

Scenario            | Rollback Method                           | Time     | Who
--------------------+-------------------------------------------+----------+---------------------
Bad service deploy  | Revert to previous container revision     | < 2 min  | Any engineer
Bad module cutover  | YARP route change → legacy                | < 5 min  | Tech Lead / Senior
Bad DB migration    | Switch connection string → old DB         | < 10 min | Tech Lead + DevOps
Bad frontend deploy | Revert Azure CDN to previous build        | < 5 min  | Any engineer
Cascading failure   | Kill feature flag → all traffic to legacy | < 1 min  | Anyone (kill switch)

6.2 Rollback Commands

# Rollback service to previous revision
# ("stable" stands for the name of the last known-good revision)
az containerapp revision set-traffic \
  --name travel-service \
  --resource-group rg-product-prod \
  --revision-weight stable=100 latest=0

# Rollback YARP route to legacy (module cutover)
# Update YARP config → redeploy gateway
az containerapp update \
  --name api-gateway \
  --resource-group rg-product-prod \
  --set-env-vars "ROUTE_TRAVEL=legacy"

# Rollback frontend (<storage-account> is a placeholder)
az storage blob upload-batch \
  --account-name <storage-account> \
  --destination '$web' \
  --source ./previous-build/ \
  --overwrite

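# Emergency: kill all feature flags → everything to legacy
# (--name is the App Configuration store; <appconfig-store> is a placeholder)
az appconfig feature enable --name <appconfig-store> --feature kill-switch --yes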

7. Monitoring After Deploy

7.1 Post-Deploy Checklist

□ Health endpoints responding (/health, /ready)
□ Error rate < baseline (check Azure Monitor)
□ Latency p95 < baseline
□ No new exceptions in logs (Seq/App Insights)
□ Service Bus: no messages stuck in DLQ
□ Business metrics: bookings/events still flowing
□ AI anomaly detection: no alerts triggered
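
The first checklist item lends itself to automation. The Python below is a minimal sketch that probes /health and /ready on a service and reports which endpoints did not return HTTP 200. The function names are hypothetical, and the fetch parameter exists so the check can be exercised without a live service.

```python
from typing import Callable, List, Sequence
from urllib.error import HTTPError
from urllib.request import urlopen


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for a GET, or 0 if the service is unreachable."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status
    except HTTPError as err:   # 4xx/5xx responses still carry a status code
        return err.code
    except OSError:            # connection refused, DNS failure, timeout
        return 0


def failing_endpoints(base_url: str,
                      fetch: Callable[[str], int] = http_status,
                      paths: Sequence[str] = ("/health", "/ready")) -> List[str]:
    """List the probe paths that did not answer with HTTP 200."""
    base = base_url.rstrip("/")
    return [path for path in paths if fetch(base + path) != 200]
```

For a local Docker Compose stack, failing_endpoints("http://localhost:5001") would check the travel service on the port mapped in section 3.2; an empty list means both probes passed.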

7.2 Alerting Rules

Metric       | Warning   | Critical      | Action
-------------+-----------+---------------+--------------------------------
Error rate   | > 1%      | > 5%          | Auto-rollback at critical
Latency p95  | > 500 ms  | > 2 s         | Page on-call
Health check | 1 failure | 3 consecutive | Auto-restart container
DLQ messages | > 10      | > 100         | Page on-call
CPU usage    | > 70%     | > 90%         | Auto-scale (already configured)
Memory usage | > 75%     | > 90%         | Page on-call
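
The numeric thresholds above can be expressed as data and evaluated with one function. The Python below is a sketch under that assumption; the metric keys and the classify name are hypothetical, and the health-check rule is left out because it counts consecutive failures and has no numeric threshold to compare.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    warning: float   # value above this is a warning
    critical: float  # value above this is critical
    action: str      # action listed in the table


RULES = {
    "error_rate_pct": Rule(1, 5, "auto-rollback at critical"),
    "latency_p95_ms": Rule(500, 2000, "page on-call"),
    "dlq_messages": Rule(10, 100, "page on-call"),
    "cpu_pct": Rule(70, 90, "auto-scale"),
    "memory_pct": Rule(75, 90, "page on-call"),
}


def classify(metric: str, value: float) -> str:
    """Return 'ok', 'warning', or 'critical' for a metric reading."""
    rule = RULES[metric]  # raises KeyError for metrics without a rule
    if value > rule.critical:
        return "critical"
    if value > rule.warning:
        return "warning"
    return "ok"
```

The comparisons are strict, matching the ">" in the table, so an error rate of exactly 1% is still "ok" and exactly 5% is still "warning".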

8. Secrets Management

Rule: ZERO secrets in code, config files, or environment variables in plain text.

┌──────────────────────────────────────────────────────────┐
│                  SECRETS FLOW                             │
│                                                          │
│  Developer needs secret:                                 │
│    → Azure Key Vault (via az keyvault secret set)       │
│                                                          │
│  Service needs secret at runtime:                        │
│    → Container App → Managed Identity → Key Vault       │
│    → Automatic injection as environment variable         │
│                                                          │
│  CI/CD needs secret:                                     │
│    → GitHub Secrets (encrypted, per-environment)         │
│    → Injected into workflow at runtime                   │
│                                                          │
│  Local development:                                      │
│    → dotnet user-secrets (per developer, not in repo)   │
│    → Or .env file (gitignored)                          │
└──────────────────────────────────────────────────────────┘
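
On the service side, injected secrets arrive as environment variables, so a service can fail fast at startup when one is missing instead of failing later on the first database call. The Python below illustrates the idea only; the services themselves are .NET, and require_secret and MissingSecretError are hypothetical names.

```python
import os
from typing import Mapping


class MissingSecretError(RuntimeError):
    """Raised when a required secret was not injected into the environment."""


def require_secret(name: str, env: Mapping[str, str] = os.environ) -> str:
    """Return the secret's value, or raise if it is absent or blank."""
    value = env.get(name, "").strip()
    if not value:
        # Report only the name; never include secret values in messages or logs.
        raise MissingSecretError(f"required secret {name!r} is not set")
    return value
```

Calling require_secret("ConnectionStrings__Default") during startup would surface a missing Key Vault reference as a clear error at deploy time.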