Architecture Overview — High Level

Target architecture for the legacy .NET monolith → .NET 8 microservices migration
40K users | Zero downtime | 5 engineers | 9 months


1. Before & After

Before:

┌─────────────────────────────────────────┐
│          Legacy .NET Monolith           │
│                                         │
│  Travel │ Event │ Payment │ Workforce   │
│  Comms  │ Reporting                     │
│                                         │
│  ┌───────────────────────────────────┐  │
│  │   Single Database (200+ tables)   │  │
│  └───────────────────────────────────┘  │
└─────────────────────────────────────────┘
  1 deploy = deploy everything
  1 failure = everything down
  1 team = bottleneck

After:

┌────────────────────────────────────────────────────────────────┐
│  Clients (React 18 + Legacy UI)                                │
└──────────────────────┬─────────────────────────────────────────┘
                       ▼
┌──────────────────────────────────────────────────────────────┐
│                    API Gateway (YARP)                          │
│         Strangler Fig routing: new services + legacy           │
└───┬──────┬──────┬──────┬──────┬──────┬────────────────────────┘
    ▼      ▼      ▼      ▼      ▼      ▼
┌──────┐┌──────┐┌──────┐┌──────┐┌──────┐┌─────────────────────┐
│Travel││Event ││Work- ││Comms ││Report││ Legacy Monolith     │
│Svc   ││Svc   ││force ││Svc   ││Svc   ││ (Payment only)      │
│.NET 8││.NET 8││Svc   ││.NET 8││.NET 8││                     │
│      ││      ││.NET 8││      ││(CQRS)││ Accessed via ACL    │
└──┬───┘└──┬───┘└──┬───┘└──┬───┘└──┬───┘└──────────┬──────────┘
   │       │       │       │       │               │
┌──┴──┐ ┌──┴──┐ ┌──┴───┐ ┌┴────┐  │          ┌────┴─────┐
│ DB  │ │ DB  │ │  DB  │ │ DB  │  │          │Legacy DB │
└─────┘ └─────┘ └──────┘ └─────┘  │          └──────────┘
                                   │               │
                              ┌────┴───────────────┘
                              │ Reporting DB
                              │ (CDC read replicas)
                              └────────────────────
┌──────────────────────────────────────────────────────────────┐
│              Azure Service Bus (Event-Driven)                 │
│  booking.created │ event.updated │ staff.assigned │ notify   │
└──────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────┐
│              Observability (OpenTelemetry + Serilog)           │
│              AI Anomaly Detection                             │
└──────────────────────────────────────────────────────────────┘

2. Service Boundaries

Service           Owns                              Key APIs         Talks To
───────           ────                              ────────         ────────
Travel Booking    bookings, itineraries, suppliers  /api/travel/*    Payment (via ACL), Workforce (event)
Event Management  events, venues, attendees         /api/events/*    Payment (via ACL), Comms (event)
Workforce         staff, allocations, shifts        /api/staff/*     Travel & Event (event)
Communications    notifications, templates          /api/comms/*     Subscribes to all domain events
Reporting (CQRS)  read models, dashboards           /api/reports/*   Reads from all (via CDC + events)
Payment (legacy)  payments, invoices                /api/payments/*  Stays in monolith. ACL bridge

3. Communication Model

                    SYNC (REST)                    ASYNC (Events)
                    ──────────                     ──────────────
Use when:           User needs immediate response  State changes, notifications
                    Payment processing             Cross-service data sync
                    Real-time queries              Fire-and-forget operations

Examples:           Client → API Gateway → Service Service → Event Bus → Subscribers
                    Travel → ACL → Legacy Payment  BookingCreated → Comms → Send email
                    Any Service → Workforce query  EventUpdated → Reporting → Update view

3 simple rules:

  1. Client calls = always sync (REST via API Gateway)
  2. State changes = publish event (async via Service Bus)
  3. Cross-service write = never direct DB access, always API or event
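The three rules can be illustrated with a small in-memory sketch. This is not Azure Service Bus code: `InMemoryBus`, the handler functions, and the payload fields are hypothetical names chosen for illustration, and delivery here is synchronous and in-process, whereas a real broker delivers asynchronously. Only the topic name `booking.created` is taken from the diagram in section 1.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class InMemoryBus:
    """Hypothetical stand-in for a message broker: topic -> subscribers."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: dict) -> None:
        # The publisher does not know who is listening (loose coupling).
        for handler in self._subscribers[topic]:
            handler(payload)


# Each subscriber writes only to state it owns (rule 3).
sent_emails: List[str] = []      # owned by Comms
report_rows: List[dict] = []     # owned by Reporting


def comms_on_booking_created(event: dict) -> None:
    sent_emails.append(f"Booking {event['booking_id']} confirmed")


def reporting_on_booking_created(event: dict) -> None:
    report_rows.append({"booking_id": event["booking_id"]})


def create_booking(bus: InMemoryBus, booking_id: str) -> dict:
    """Handle a client call in the Travel service.

    Rule 1: the caller gets a direct (sync) response.
    Rule 2: the state change is announced as an event.
    """
    booking = {"booking_id": booking_id, "status": "created"}
    bus.publish("booking.created", booking)
    return booking


bus = InMemoryBus()
bus.subscribe("booking.created", comms_on_booking_created)
bus.subscribe("booking.created", reporting_on_booking_created)
result = create_booking(bus, "B-1001")
```

The Travel code never touches the Comms or Reporting state directly; adding another subscriber requires no change to `create_booking`.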

4. Data Strategy

Rule: Each service owns its database. No sharing.

┌──────────────────────────────────────────────────────┐
│  How services get data they don't own:               │
│                                                      │
│  Need real-time data? → Call the owning service API  │
│  Need historical data? → Subscribe to events         │
│  Need reports across all? → Reporting service (CDC)  │
│  Need payment? → Call legacy via ACL                 │
└──────────────────────────────────────────────────────┘
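The "call legacy via ACL" path can be sketched as a thin translation layer. This is a minimal illustration under stated assumptions: the legacy field names (`PAY_ID`, `AMT_CENTS`, `STAT_CD`), the `PaymentAcl` class, and the fake legacy function are all hypothetical, since the real legacy Payment interface is not described in this document.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Callable


@dataclass(frozen=True)
class PaymentResult:
    """Clean model that the new services see."""
    payment_id: str
    amount: Decimal
    succeeded: bool


class PaymentAcl:
    """Anti-corruption layer: hides the legacy shape from new services."""

    def __init__(self, legacy_charge: Callable[[str, int], dict]) -> None:
        # legacy_charge is a hypothetical adapter around the legacy endpoint.
        self._legacy_charge = legacy_charge

    def charge(self, booking_id: str, amount: Decimal) -> PaymentResult:
        cents = int(amount * 100)  # assumed legacy unit: integer cents
        raw = self._legacy_charge(booking_id, cents)
        # Translate legacy field names and codes into the new model.
        return PaymentResult(
            payment_id=str(raw["PAY_ID"]),
            amount=Decimal(raw["AMT_CENTS"]) / 100,
            succeeded=raw["STAT_CD"] == "OK",
        )


def fake_legacy_charge(ref: str, amount_cents: int) -> dict:
    """Test double standing in for the legacy Payment module."""
    return {"PAY_ID": 42, "AMT_CENTS": amount_cents, "STAT_CD": "OK"}


acl = PaymentAcl(fake_legacy_charge)
result = acl.charge("B-1001", Decimal("129.50"))
```

Because callers depend only on `PaymentResult`, a later change to the legacy payload would be absorbed inside the ACL rather than in every service.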

5. Migration Pattern — Strangler Fig

Phase 0: API Gateway set up, all traffic → Legacy
         ┌──────────┐
         │ 100%     │──► Legacy Monolith
         │ traffic  │
         └──────────┘

Phase 1: Core services go-live (Month 2–4)
         ┌──────────┐
         │ /travel  │──► Travel Service (NEW)
         │ /events  │──► Event Service (NEW)
         │ /*       │──► Legacy (everything else)
         └──────────┘

Phase 2: Scale out (Month 5–7)
         ┌──────────┐
         │ /travel  │──► Travel Service
         │ /events  │──► Event Service
         │ /staff   │──► Workforce Service (NEW)
         │ /comms   │──► Comms Service (NEW)
         │ /reports │──► Reporting Service (NEW)
         │ /payment │──► Legacy (ACL)
         └──────────┘

Phase 3: Hardening (Month 8–9), only Payment remains in legacy
         Everything else = new .NET 8 services
         Rollback = change YARP route (< 5 minutes)
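The phased routing above can be modeled as a prefix table with a legacy fallback. The sketch below is not YARP configuration; it only mirrors the routing logic. The backend names (`travel-service` and so on) and the `resolve` and `rollback` helpers are hypothetical, while the path prefixes follow the phase diagrams.

```python
from typing import Dict

LEGACY = "legacy-monolith"

# One route table per phase: path prefix -> backend (names are illustrative).
PHASE_ROUTES: Dict[int, Dict[str, str]] = {
    0: {},  # Phase 0: gateway in place, everything falls through to legacy
    1: {"/travel": "travel-service", "/events": "event-service"},
    2: {
        "/travel": "travel-service",
        "/events": "event-service",
        "/staff": "workforce-service",
        "/comms": "comms-service",
        "/reports": "reporting-service",
        # /payment is deliberately absent: it stays on legacy
    },
}


def resolve(routes: Dict[str, str], path: str) -> str:
    """Return the backend for a path; unmatched paths go to legacy."""
    for prefix, backend in routes.items():
        # Match whole path segments so "/travelers" does not hit "/travel".
        if path == prefix or path.startswith(prefix + "/"):
            return backend
    return LEGACY


def rollback(routes: Dict[str, str], prefix: str) -> Dict[str, str]:
    """Return a new table with one prefix sent back to legacy."""
    return {p: b for p, b in routes.items() if p != prefix}
```

Rolling back a service is a route-table change only: `rollback(PHASE_ROUTES[2], "/travel")` sends `/travel` traffic to the monolith again without redeploying anything.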

6. Tech Stack Summary

Layer          Choice                             Why
─────          ──────                             ───
Backend        .NET 8 microservices               Team expertise, performance, LTS
Frontend       React 18 (incremental)             Modern, component-based, AI-assisted generation
API Gateway    YARP                               .NET-native, Strangler Fig routing, lightweight
Messaging      Azure Service Bus                  Enterprise-grade, dead letter support, Azure-native
Database       Azure SQL (per-service)            Managed, familiar, per-service isolation
Data Sync      CDC (Change Data Capture)          Non-invasive legacy sync, real-time
CI/CD          GitHub Actions                     Integrated with repo, good Azure support
IaC            Bicep                              Azure-native, simpler than ARM templates
Containers     Azure Container Apps               Serverless containers, auto-scale, managed
Observability  OpenTelemetry + Serilog            Distributed tracing + structured logging
Testing        Pact (contract tests)              Verify service boundaries without full E2E
AI Tooling     Cursor + Claude Code + CodeRabbit  2x engineering multiplier

7. Key Architecture Decisions

Decision            Chose                 Over               Because
────────            ─────                 ────               ───────
Migration pattern   Strangler Fig         Big bang rewrite   Zero downtime required
API Gateway         YARP                  Azure API Mgmt     .NET team, lighter, cheaper
Payment approach    ACL to legacy         Rewrite Payment    Constraint: frozen Phase 1
Database            Per-service (phased)  Shared DB          Service autonomy, independent deploy
Messaging           Azure Service Bus     RabbitMQ           Managed, enterprise SLA, Azure-native
Frontend            Incremental React     Full rewrite       5 engineers can't rewrite all UI
Reporting           CQRS read models      Direct DB queries  Cross-module reports need aggregated views
Service-to-service  Events (async)        Direct REST calls  Loose coupling, resilience
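The Reporting decision (CQRS read models instead of direct DB queries) can be sketched as a projection that folds domain events into a pre-aggregated view. This is an assumption-laden illustration: the `date` payload field, the counter names, and `project_daily_summary` are hypothetical; only the event names `booking.created` and `staff.assigned` appear in the diagram in section 1.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Event = Tuple[str, dict]  # (event name, payload)

# Which counter each event type feeds (illustrative mapping).
COUNTERS = {
    "booking.created": "bookings",
    "staff.assigned": "staff_assigned",
}


def project_daily_summary(events: Iterable[Event]) -> Dict[str, Dict[str, int]]:
    """Build a per-day read model from a stream of domain events."""
    summary: Dict[str, Dict[str, int]] = defaultdict(
        lambda: {"bookings": 0, "staff_assigned": 0}
    )
    for name, payload in events:
        counter = COUNTERS.get(name)
        if counter is None:
            continue  # events this view does not care about are skipped
        summary[payload["date"]][counter] += 1
    return dict(summary)


events = [
    ("booking.created", {"date": "2025-01-10"}),
    ("booking.created", {"date": "2025-01-10"}),
    ("staff.assigned", {"date": "2025-01-10"}),
    ("event.updated", {"id": 7}),
    ("booking.created", {"date": "2025-01-11"}),
]
view = project_daily_summary(events)
```

A dashboard query then reads `view` directly, so a cross-module report does not need to join across the Travel and Workforce databases at request time.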

8. Security (High Level)

Edge:        TLS + JWT auth + Rate limiting (at API Gateway)
Service:     mTLS between services + Managed identities
Data:        Encryption at rest + Secrets in Key Vault
Pipeline:    SAST scan + Container scan + Dependency audit
Payment:     Stays in legacy (proven security) + ACL isolates

9. What This Architecture Enables

✅ Zero downtime migration     — Strangler Fig, rollback in minutes
✅ Independent deployment      — Deploy Travel without touching Event
✅ Independent scaling         — Scale Travel during peak, not everything
✅ Team parallelization        — 2 engineers on 2 services simultaneously
✅ Payment safety              — Frozen, untouched, accessed only via ACL
✅ AI-ready data foundation    — Events captured → future ML/analytics
✅ Incremental value delivery  — Each service go-live = immediate value