Submission (≤6 Pages)
Technical Assessment — Legacy Platform Modernization
Role: Technical Lead — PhoenixDX Vietnam Hub
Product A: Enterprise travel, event & operations platform | ~40,000 global users
Scope: Legacy .NET monolith → .NET 8 microservices | 5 engineers | 9 months | Zero downtime
Supporting docs: Full analysis repository available at the companion website
1. Target Architecture Overview
Clients (React 18 + Legacy UI)
│
┌───────────────┴───────────────┐
│ API Gateway (YARP) │
│ Strangler Fig weighted routing │
│ Auth │ Rate Limit │ Trace ID │
└──┬──────┬──────┬──────┬──────┬─┘
│ │ │ │ │
┌────┘ ┌──┘ ┌──┘ ┌──┘ ┌──┘
▼ ▼ ▼ ▼ ▼
Travel Event Work- Comms Reporting
Svc Svc force Svc Svc(CQRS)
.NET8 .NET8 Svc .NET8 .NET8
│ │ .NET8 │ │
[DB] [DB] │ [DB] │
[DB] ┌──────┘
│
┌────────────────────────┐ ┌────┴──────────┐
│ Legacy Monolith │ │ Reporting DB │
│ Payment (frozen Ph.1) │ │ (CDC replicas)│
│ ◄── ACL bridge ────── │ └───────────────┘
│ Monolith DB ──CDC──────┼──────────┘
└────────────────────────┘
┌────────────────────────────────────────────┐
│ Azure Service Bus (Event-Driven Backbone) │
└────────────────────────────────────────────┘
┌────────────────────────────────────────────┐
│ OpenTelemetry + Serilog │ Bicep (IaC) │
│ Azure Container Apps │ GitHub Actions CI │
└────────────────────────────────────────────┘
Service Boundaries — 6 bounded contexts, 1:1 mapping to DDD domains:
| Service | Owns | Communication |
|---|---|---|
| Travel Booking | bookings, itineraries, suppliers | Sync REST + Payment ACL + async events |
| Event Management | events, venues, attendees | Sync REST + Payment ACL + async events |
| Workforce | staff, allocations, shifts | Subscribes to travel/event events |
| Communications | notifications, templates | Subscribes to ALL domain events |
| Reporting (CQRS) | read models, dashboards | CDC from all DBs, event projections |
| Payment (legacy) | payments, invoices | Stays in monolith — ACL bridge only |
Communication rules: (1) Client → sync REST via YARP Gateway. (2) State changes → async event to Service Bus. (3) No cross-service direct DB access — ever.
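Rule (2) above — every state change emits an async event to Service Bus — can be sketched with the `Azure.Messaging.ServiceBus` SDK. This is a minimal sketch, not the project's actual code: the event type, topic name ("travel-events"), and field names are illustrative assumptions.

```csharp
using System.Text.Json;
using Azure.Messaging.ServiceBus;

// Hypothetical v1 domain event; SchemaVersion supports the "event schemas
// versioned (v1.0+)" rule from the backward-compatibility section.
public sealed record BookingConfirmedV1(
    Guid BookingId, string CustomerId, DateTimeOffset ConfirmedAt)
{
    public string SchemaVersion => "1.0";
}

public sealed class TravelEventPublisher(ServiceBusClient client)
{
    // "travel-events" is an assumed topic name, not from the source document.
    private readonly ServiceBusSender _sender = client.CreateSender("travel-events");

    public async Task PublishAsync(BookingConfirmedV1 evt, CancellationToken ct = default)
    {
        var msg = new ServiceBusMessage(JsonSerializer.Serialize(evt))
        {
            Subject = nameof(BookingConfirmedV1), // lets Workforce/Comms subscribers filter by event type
            ContentType = "application/json",
        };
        await _sender.SendMessageAsync(msg, ct);
    }
}
```

Subscribing services (Workforce, Communications, Reporting) would filter on `Subject`, which keeps the "no cross-service direct DB access" rule enforceable: the event payload is the only data that crosses a boundary.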
2. Migration Strategy
4-Phase Timeline
M1 M2 M3 M4 M5 M6 M7 M8 M9
├────────┼────────┼────────┼────────┼────────┼────────┼────────┼────────┤
Phase 0 ██ AI Foundation + Infra, Comms pilot (staging)
Phase 1 ████████████████████ Travel(M3) + Event(M4) go-live
Phase 2 ████████████████████ Workforce(M6)
Comms+Report(M7)
Phase 3 ████████
Harden: perf, DR
| Phase | Duration | Key Deliverables | Go-Live |
|---|---|---|---|
| 0: AI Foundation | M1 | AI toolchain, CI/CD, IaC (Bicep), YARP Gateway, Comms pilot | — |
| 1: Core | M2–4 | Travel + Event extracted, Payment ACL, per-service DBs, CDC | Travel (M3), Event (M4) |
| 2: Scale | M5–7 | Workforce, Comms (prod), Reporting (CQRS), React 18 | Workforce (M6), Comms+Report (M7) |
| 3: Harden | M8–9 | Load testing (40K users), security audit, DR validation | All 5 hardened |
Capacity: 5 eng × 9 mo = 45 raw MM. Phase-by-phase effective output with a variable AI multiplier: P0 (2.0 MM, ×1.0) + P1 (18.0 MM, ×2.0) + P2 (19.0 MM, ×2.0) + P3 (6.5 MM, ×1.0) ≈ 45.5 effective MM.
Zero-Downtime Strategy
Strangler Fig + YARP: Route traffic by URL path — migrate one module at a time, legacy serves unmigrated routes. Per-module cutover: Shadow mode (compare responses) → Canary (5%→25%→50%→100% over 7–11 days) → Full cutover. Auto-rollback if error rate > 0.5%. Rollback = YARP weight change (< 30 seconds).
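The path-based strangler routing can be sketched in YARP's `appsettings.json`. Route, cluster, and host names below are illustrative; note that the canary percentages themselves are not a built-in YARP setting — they would live in a custom load-balancing policy or the gateway's cutover tooling.

```json
{
  "ReverseProxy": {
    "Routes": {
      "travel-route": {
        "ClusterId": "travel-cluster",
        "Match": { "Path": "/api/travel/{**catch-all}" }
      },
      "legacy-fallback": {
        "ClusterId": "legacy-cluster",
        "Match": { "Path": "{**catch-all}" },
        "Order": 100
      }
    },
    "Clusters": {
      "travel-cluster": {
        "Destinations": {
          "travel-svc": { "Address": "https://travel-svc.internal/" }
        }
      },
      "legacy-cluster": {
        "Destinations": {
          "monolith": { "Address": "https://legacy-monolith.internal/" }
        }
      }
    }
  }
}
```

Rollback is a config change: point `travel-route` back at `legacy-cluster` (or remove it so the catch-all takes over), which is why the rollback window is seconds rather than a redeploy.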
Backward Compatibility
Both systems run simultaneously. New services call legacy Payment via ACL (adapter pattern). CDC keeps data in sync — no dual-write. Event schemas versioned (v1.0+). React shell loads new modules alongside legacy UI. When Payment is eventually modernized → swap ACL target, zero changes to consuming services.
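The ACL described above is an adapter: new services depend on a narrow port expressed in their own domain language, and only the adapter knows the legacy Payment contract. All interface, endpoint, and DTO names below are hypothetical sketch material.

```csharp
using System.Net.Http.Json;

// Port owned by the new services — no legacy vocabulary leaks through it.
public interface IPaymentGateway
{
    Task<PaymentResult> ChargeAsync(Guid bookingId, decimal amount, CancellationToken ct);
}

public sealed record PaymentResult(bool Succeeded, string? LegacyReference);

// Anti-corruption layer: translates to/from the monolith's assumed contract.
public sealed class LegacyPaymentAcl(HttpClient http) : IPaymentGateway
{
    public async Task<PaymentResult> ChargeAsync(Guid bookingId, decimal amount, CancellationToken ct)
    {
        // Endpoint and payload shape are illustrative assumptions.
        var response = await http.PostAsJsonAsync("/legacy/payments",
            new { OrderRef = bookingId.ToString("N"), Amount = amount }, ct);
        response.EnsureSuccessStatusCode();
        var body = await response.Content.ReadFromJsonAsync<LegacyChargeDto>(ct);
        return new PaymentResult(body!.Status == "OK", body.Ref);
    }

    private sealed record LegacyChargeDto(string Status, string? Ref);
}
```

When Payment is eventually modernized, only `LegacyPaymentAcl` is replaced; every consumer of `IPaymentGateway` is untouched, which is the "zero changes to consuming services" claim made concrete.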
3. Failure Modeling
| # | Scenario | Likelihood / Impact | Mitigation |
|---|---|---|---|
| F1 | CDC data inconsistency — new DB diverges from legacy during cutover | M / H | Checksum verification every 6h. Dual-read validation. Auto-pause on mismatch. 7-day parallel soak |
| F2 | Legacy Payment outage blocks new services via ACL | M / H | Circuit breaker (Polly): fail fast after 3 retries. Queue in Service Bus. Graceful degradation: "pending payment" |
| F3 | AI-generated code has business logic errors (e.g., pricing) | H / H | Human review mandatory for business logic. Contract tests (Pact) vs legacy. Shadow+Compare before traffic switch. Payment: zero AI-only merge |
| F4 | Cascading failure during canary — timeout causes retry storm | L / H | Auto-rollback (error > 0.5%). Bulkhead isolation. Rate limiting at Gateway. Kill switch → legacy in < 30s |
| F5 | Key engineer leaves mid-migration (bus factor) | M / M | Primary + secondary per service. AI-generated docs (Phase 0). Weekly walkthroughs. All decisions in ADRs |
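The F2 mitigation (retry, then fail fast) maps directly onto Polly's v8 resilience pipeline. This is a sketch: the retry count mirrors the table, but the breaker thresholds and delays are illustrative defaults, not tuned values from the source.

```csharp
using Polly;
using Polly.CircuitBreaker;
using Polly.Retry;

public static class PaymentResilience
{
    // Fail fast after 3 retries (F2); breaker thresholds here are illustrative.
    public static ResiliencePipeline Build() => new ResiliencePipelineBuilder()
        .AddRetry(new RetryStrategyOptions
        {
            MaxRetryAttempts = 3,
            Delay = TimeSpan.FromMilliseconds(200),
            BackoffType = DelayBackoffType.Exponential,
        })
        .AddCircuitBreaker(new CircuitBreakerStrategyOptions
        {
            FailureRatio = 0.5,       // open when half the calls fail...
            MinimumThroughput = 10,   // ...across at least 10 sampled calls
            BreakDuration = TimeSpan.FromSeconds(30),
        })
        .Build();
}

// While the breaker is open, BrokenCircuitException is thrown immediately —
// the caller catches it, queues the charge on Service Bus, and surfaces
// the "pending payment" state instead of blocking the booking:
//   await PaymentResilience.Build().ExecuteAsync(
//       async ct => await paymentAcl.ChargeAsync(bookingId, amount, ct), ct);
```

The same pipeline also caps the retry storm in F4: an open breaker stops new-service calls from hammering a struggling legacy Payment endpoint.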
4. Trade-Off Log
Intentionally not optimizing:
| Decision | Trade-Off | Revisit |
|---|---|---|
| Payment stays in monolith | ACL maintenance overhead | Post-9-months, when other services stable |
| Incremental React (3–4 modules) | Payment UI in iframe, UX gap | Month 10+, or hire frontend engineer |
| Container Apps over AKS | Less networking control | If services > 15 or team > 10 |
| Single region (active-passive) | ~120ms for EU/US users | If user growth justifies multi-region |
| Contract tests over heavy E2E | Some gaps found only in staging | Phase 3+ expand E2E |
Technical debt accepted: Legacy Payment iframe (low) · Hardcoded Comms templates (low) · Manual staging IaC (very low) · Limited load testing pre-Phase 3 (medium — mitigated by canary) · Event schema governance deferred to Phase 2 (medium — mitigated by Pact)
Revisit in 6 months: Payment migration · AI 2x multiplier accuracy · DB decomposition completion · React coverage · Multi-region evaluation · Event sourcing for high-value domains
5. Assumptions
| # | Assumption | Impact If Wrong |
|---|---|---|
| A1 | Team has senior .NET experience | Phase 0 extends 2–4 weeks |
| A2 | Legacy has some docs / discoverable APIs | AI analysis takes longer, missed business rules |
| A3 | Azure is the approved cloud provider | Architecture rework |
| A4 | "Payment frozen" = code in monolith, API still callable via ACL | If API frozen → bookings blocked |
| A5 | AI tools (Cursor Pro, Claude Code) purchasable | Multiplier drops 2x→1.2x, capacity ~32 MM |
| A6 | Legacy runs during full 9-month migration | If forced shutdown → scope shrinks dramatically |
| A7 | 40K users across timezones — no maintenance window | Strangler Fig mandatory |
| A8 | Team co-located or same timezone | Add async overhead (~10% capacity loss) |
| A9 | No regulatory requirements beyond standard security | Add 2–4 weeks compliance per service |
| A10 | Legacy database is SQL Server (CDC compatible) | Different CDC tooling needed |
Validation: A3–A6 validated Week 1 (stakeholders). A1–A2 validated Week 2 (pair programming + AI scan).
6. AI Usage Declaration
Tool: GitHub Copilot — VS Code Agent Mode (Claude Opus 4.6). Sole AI tool used throughout the entire assessment.
Workflow: Conversational agent collaboration — human provides direction and constraints, AI generates content, human reviews and corrects. All 33 documents, cross-referencing, consistency fixes, gap analysis, and website built through sequential Copilot conversations.
| Section | AI/Human | Human Contribution |
|---|---|---|
| Architecture (4.1) | 85/15 | Directed architecture decisions (Strangler Fig, YARP, per-service DB). Reviewed, corrected event flows |
| Migration timeline (4.2) | 80/20 | Set constraints (5 eng, 9 mo, 4 phases). Validated capacity math. Caught cross-doc inconsistency |
| Failure modeling (4.3) | 85/15 | Reviewed 8 scenarios, approved final 5, adjusted likelihood ratings |
| Trade-offs (4.4) | 75/25 | Engineering judgment on each decision (YARP over Ocelot, Container Apps over AKS) |
| Assumptions (4.5) | 80/20 | Identified key gaps from brief (payment frozen, tool procurement) |
| This declaration (4.6) | 90/10 | Directed "be honest about how we actually worked" |
Overall: ~85% AI (generation, code, analysis, formatting) / ~15% Human (direction, decisions, constraints, review, correction). That 15% is what matters — the Tech Lead decides what to build and when the AI is wrong.
Preventing Blind AI Usage
┌──────────┐ Payment: 2 human reviewers
│ SECURITY │ Zero AI-only merge
│ GATE │
└────┬─────┘
┌──────┴──────┐ Business logic: human
│ HUMAN │ validates requirement trace
│ REVIEW │
└──────┬──────┘
┌─────────┴─────────┐ CodeRabbit auto-review
│ AI AUTO-REVIEW │ flags issues for human
└─────────┬─────────┘
┌────────────┴────────────┐ Lint + tests + SAST +
│ CI PIPELINE GATE │ contract tests. Must pass.
│ No --no-verify │ No exceptions.
└─────────────────────────┘
Every AI-generated line passes ALL 4 gates. Weekly "explain this code" sessions — team must understand what AI wrote. Prompt library versioned in git.
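The bottom gate (CI pipeline) can be sketched as a GitHub Actions job. Only the gate contents (lint, tests, SAST, contract tests) come from the diagram above; the workflow name, action versions, and CodeQL as the SAST tool are illustrative assumptions.

```yaml
name: ci-gate
on: [pull_request]

jobs:
  gate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-dotnet@v4
        with:
          dotnet-version: '8.0.x'
      - name: Lint (formatting must be clean)
        run: dotnet format --verify-no-changes
      - name: Unit + contract tests (Pact suites included)
        run: dotnet test --configuration Release
      # SAST via CodeQL — one assumed choice of scanner.
      - uses: github/codeql-action/init@v3
        with:
          languages: csharp
      - uses: github/codeql-action/autobuild@v3
      - uses: github/codeql-action/analyze@v3
```

Because the gate runs on `pull_request` and branch protection requires it, there is no local `--no-verify` equivalent: an AI-generated change cannot reach main without passing every step.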