Analysis v1
Analysis — Legacy Platform Modernization
0. AI-First Strategy — The Multiplier
0.1 Why AI-First Is Not Optional Here
PhoenixDX defines itself as an "AI-first engineering hub" with explicit goals:
- Pioneer AI-augmented engineering practices
- Embed AI deeply into operational workflows
- AI-ready architecture for the next decade
→ If the solution doesn't demonstrate an AI-first approach in both process and product, it misses the core signal of the brief.
AI-first here has 2 dimensions:
- AI for Building — Using AI to accelerate the migration process (process)
- AI in Product — The new system architecture must be AI-ready (product)
0.2 Phase 0 (Month 1): AI Engineering Foundation
Dedicate Month 1 (running in parallel with infra setup) to establishing AI engineering practices for the team:
Week 1-2: AI Toolchain Setup
├── Coding: GitHub Copilot / Cursor for the entire team
├── Code Review: AI-assisted review (CodeRabbit / Copilot PR Review)
├── Testing: AI-generated test cases (Copilot + custom prompts)
├── Documentation: AI-generated ADRs, API docs from code
└── Legacy Analysis: AI-powered codebase understanding
Week 3-4: AI Workflow Integration
├── Prompt Library: Create shared prompt templates for the team
│ ├── "Analyze this legacy module and identify bounded context"
│ ├── "Generate .NET 8 service from this legacy code"
│ ├── "Write contract tests for this API migration"
│ └── "Generate CDC migration script for this table"
├── AI Code Review Gates: Set up rules for AI-assisted PR review
├── Knowledge Base: Feed legacy codebase into AI context
└── Metrics: Track AI adoption rate, time saved per task
0.3 AI Multiplier Effect — Capacity Recalculation
With AI tooling, engineering capacity changes significantly:
Without AI (Traditional):
Total capacity: 5 engineers × 9 months = 45 engineer-months
Subtract overhead: = -18 engineer-months
Available for feature work: = 27 engineer-months
With AI-First (Adjusted):
Total capacity: 5 engineers × 9 months = 45 engineer-months
Subtract overhead: = -18 engineer-months
Base available: = 27 engineer-months
AI Setup investment (Month 1): = -3 engineer-months
AI Productivity multiplier (1.4x on remaining): = +9.6 engineer-months
─────────────────────────────────────────────────────────────────────────
Effective capacity: ≈ 33.6 engineer-months
Multiplier 1.4x explained:
- Boilerplate/CRUD generation: ~3x faster → but only accounts for 30% of work
- Test writing: ~2x faster → accounts for 20% of work
- Code review + bug finding: ~1.5x faster → accounts for 15% of work
- Complex logic/architecture: ~1.1x (AI provides limited help) → accounts for 35% of work
- Weighted average: a time-weighted (harmonic) average of these speedups lands near 1.6x; we round down to a conservative ~1.4x to absorb review and rework of AI output
→ +6.6 effective engineer-months compared to a non-AI approach. Enough to add one more module or provide buffer for stabilization.
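The capacity arithmetic above can be sanity-checked with a short script. This is a sketch: the work-mix shares and per-category speedups are the assumptions listed above, not measured data.

```python
# Sketch: verify the AI-adjusted capacity numbers from Section 0.3.
# Work-mix shares and speedups are the stated assumptions, not measurements.

TOTAL = 5 * 9            # engineer-months
OVERHEAD = 18            # ramp-up, CI/CD, testing, meetings (see Section 2.2)
AI_SETUP = 3             # Month 1 AI foundation investment

base = TOTAL - OVERHEAD          # 27
remaining = base - AI_SETUP      # 24

# Time-weighted (harmonic) speedup: total time = sum(share / speedup)
work_mix = {                     # category: (share of work, assumed speedup)
    "boilerplate/CRUD": (0.30, 3.0),
    "test writing":     (0.20, 2.0),
    "review/bugs":      (0.15, 1.5),
    "complex logic":    (0.35, 1.1),
}
time_fraction = sum(share / speedup for share, speedup in work_mix.values())
harmonic_speedup = 1 / time_fraction   # ~1.6x; the plan rounds down to 1.4x

MULTIPLIER = 1.4                 # conservative figure used throughout the plan
effective = remaining * MULTIPLIER
print(f"harmonic speedup  ~ {harmonic_speedup:.2f}x")
print(f"effective capacity = {effective:.1f} engineer-months")
print(f"gain vs non-AI     = {effective - base:+.1f} engineer-months")
```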
0.4 AI Application Per Migration Phase
| Phase | AI Application | Expected Impact |
|---|---|---|
| Phase 1: Foundation | AI analyzes the legacy codebase → auto-maps dependencies, identifies bounded contexts. AI generates IaC templates and CI/CD pipelines | Saves ~2 weeks of manual analysis |
| Phase 2: Extraction | AI translates legacy .NET code → .NET 8. AI generates contract tests. AI writes data migration scripts | 30-40% faster per service extraction |
| Phase 3: Event/Report | AI generates event schemas from legacy workflows. AI builds CQRS read models from existing SQL queries | Saves ~3 weeks of boilerplate |
| Phase 4: Stabilize | AI-powered monitoring anomaly detection. AI generates load-test scenarios from production patterns | Faster issue detection |
0.5 AI in Product Architecture (AI-Ready Foundation)
The new architecture must be ready for AI features in the future:
┌─────────────────────────────────────────────────┐
│ AI-Ready Data Layer │
├─────────────┬──────────────┬────────────────────┤
│ Event Store │ Feature Store│ Vector Store │
│ (all domain │ (ML-ready │ (future: semantic │
│ events) │ aggregates) │ search, RAG) │
├─────────────┴──────────────┴────────────────────┤
│ Unified Event Bus (Azure SB) │
│ Every domain event is captured → AI trainable │
└─────────────────────────────────────────────────┘
Specifically:
- Event-driven architecture → Every business event is captured → data for AI/ML later
- Per-service databases → Clean data boundaries → easy to build feature stores
- API Gateway → Central point to inject AI (rate limiting, anomaly detection, smart routing)
- Structured logging + observability → AI-powered monitoring from day 1
This doesn't add significant effort since event-driven and observability are already in the plan. We just need to design event schemas with AI consumption in mind.
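As one illustration of "AI consumption in mind": a domain event can carry a stable type name, an explicit schema version, and flat typed fields, so it can later be replayed into a feature store or embedded for semantic search without parsing free text. This is a sketch; all field names here are hypothetical.

```python
# Sketch of an AI-ready domain event envelope (all field names hypothetical).
# Flat, typed, versioned payloads are cheap to replay into a feature store
# or to embed for semantic search later.
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from uuid import uuid4

@dataclass(frozen=True)
class DomainEvent:
    event_type: str                  # stable name, e.g. "travel.booking.confirmed"
    schema_version: int              # bump on breaking payload changes
    payload: dict                    # flat key -> scalar values only
    event_id: str = field(default_factory=lambda: str(uuid4()))
    occurred_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

    def to_message(self) -> dict:
        """Serializable form for the event bus (and later, ML pipelines)."""
        return asdict(self)

evt = DomainEvent(
    event_type="travel.booking.confirmed",
    schema_version=1,
    payload={"booking_id": "B-1001", "amount": 450.0, "currency": "EUR"},
)
msg = evt.to_message()
```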
0.6 Preventing Blind AI Usage (Team Governance)
This is also an explicit question in the deliverable. Strategy:
| Layer | Practice |
|---|---|
| Code Generation | AI output must pass CI pipeline (lint, test, security scan) — no exceptions |
| Architecture Decisions | AI can draft ADRs, but must have human review + sign-off from Tech Lead |
| Code Review | AI review is first pass, human review is the final gate |
| Testing | AI-generated tests must cover business requirements (traced to user stories), not just code coverage |
| Security | AI-generated code runs through SAST/DAST. Payment-related code requires mandatory manual review |
| Knowledge | Team must understand AI-written code — weekly random "explain this code" sessions |
1. Domain Decomposition Analysis
1.1 Identified Bounded Contexts
From the legacy monolith, we identify 6 potential bounded contexts:
| # | Bounded Context | Core Responsibility | Domain Complexity |
|---|---|---|---|
| 1 | Travel Booking | Search, booking, itinerary, supplier integration | High |
| 2 | Event Management | Event creation, scheduling, venue, attendee mgmt | High |
| 3 | Payment & Billing | Payment processing, invoicing, reconciliation | Critical |
| 4 | Workforce Management | Staff allocation, scheduling, availability | Medium |
| 5 | Communications | Notifications, emails, in-app messaging | Low |
| 6 | Reporting & Analytics | Operational reports, dashboards, data export | Medium |
1.2 Domain Relationship Map
Travel Booking ──────► Payment & Billing ◄────── Event Management
│ ▲ │
│ │ │
▼ │ ▼
Workforce Mgmt ───────────────┘ Communications
│ ▲
└─────────────► Reporting & Analytics ────────────┘
1.3 Coupling Analysis
| Relationship | Coupling Level | Note |
|---|---|---|
| Travel → Payment | Tight | Every booking triggers payment. Phase 1 payment freeze → must use Anti-Corruption Layer |
| Event → Payment | Tight | Event registration also requires payment |
| Travel → Workforce | Medium | Staff allocation for travel operations |
| Event → Communications | Medium | Event notifications, reminders |
| All → Reporting | Loose | Reporting reads data, doesn't write. Easiest to extract |
| Travel ↔ Event | Ambiguous | Potentially shared concepts (venue, date, attendees). Boundary needs clarification |
Key Insight: Payment is the central coupling point. Freezing payment in Phase 1 is actually an advantage — we can extract other modules without touching the highest-risk component.
2. Constraint Deep-Dive
2.1 "Zero Downtime" — What It Actually Means
Not just "don't turn off the server". It encompasses:
- No service interruption for 40K users across multiple time zones (= basically 24/7)
- No data loss during the migration process
- No feature regression — users must retain all existing functionality
- No breaking changes — API consumers and integrations must continue working
Implication: Must use Strangler Fig Pattern — run legacy + new in parallel, gradually route traffic. A "big bang" cutover is not an option.
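The Strangler Fig routing implied here amounts to a gateway rule that sends a configurable share of traffic per route to the new service, with instant rollback to legacy. A minimal sketch follows; in practice this logic lives in the API gateway (e.g. YARP), not in application code, and the route names are illustrative.

```python
# Minimal sketch of Strangler Fig routing: per-route percentage rollout with
# instant rollback. In practice this lives in the API gateway (YARP), not in
# application code; routes and percentages here are illustrative.
import hashlib

ROLLOUT = {            # route prefix -> % of users sent to the new service
    "/travel":  25,    # canary: 25% of users on the new Travel service
    "/comms":  100,    # fully migrated
    "/payment":  0,    # frozen in Phase 1 -> always legacy
}

def backend_for(route: str, user_id: str) -> str:
    pct = ROLLOUT.get(route, 0)        # unknown routes default to legacy
    # Hash the user id so each user consistently hits the same backend.
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return "new" if bucket < pct else "legacy"

# Rollback is a config change: setting ROLLOUT["/travel"] = 0 re-routes
# everyone back to the monolith without a redeploy.
```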
2.2 "5 Engineers, 9 Months" — Capacity Analysis (AI-Adjusted)
Traditional calculation:
Total capacity: 5 engineers × 9 months = 45 engineer-months
Subtract: Ramp-up/onboarding ~3 engineer-months
CI/CD + IaC foundation ~4 engineer-months
Testing + stabilization ~6 engineer-months
Meetings/overhead (15%) ~5 engineer-months
─────────────────────────────────────────────────────────
Available for feature work: ~27 engineer-months
With AI-first multiplier (see Section 0.3):
Base available: 27 engineer-months
AI setup investment: -3 engineer-months
AI productivity gain (1.4x): +9.6 engineer-months
─────────────────────────────────────────────────────────
Effective capacity: ~33.6 engineer-months
→ +6.6 engineer-months = roughly enough for 1 additional service extraction or buffer for quality + stabilization.
Implication:
- AI-first investment in Month 1 is an upfront cost with a compounding payoff — every subsequent month the team moves faster
- Can modernize 3-4 modules instead of only 2-3
- Still can't do everything → prioritization still needed, but there's more room
- AI is especially effective for repetitive work: CRUD services, test generation, data migration scripts
2.3 "Payment Flow Cannot Change in Phase 1"
There are 2 possible interpretations:
- Interpretation A: Payment code stays as-is in the monolith, no refactoring → new services call into the monolith for payment
- Interpretation B: Payment API/UX stays the same, but internals can be refactored → higher risk
Recommendation: Go with Interpretation A (safer). Payment module lives in the monolith throughout Phase 1. Extract in Phase 2+ once confidence is established.
3. Risk & Feasibility Matrix
3.1 Feasibility Assessment
| Deliverable | Feasibility with 5 eng / 9 months | Reasoning |
|---|---|---|
| Extract Travel Booking | ✅ Feasible | Core domain, high value, well-defined boundary |
| Extract Event Management | ✅ Feasible | But must come after Travel or in parallel toward the end |
| Extract Payment | ⚠️ Risky | Frozen Phase 1 + complexity → defer to Phase 3+ |
| Extract Workforce | ⚠️ Partial | Can extract logic, keep DB shared temporarily |
| Extract Communications | ✅ Easy | Low coupling, can do early as a quick win |
| Extract Reporting | ✅ Easy | Read-only, use CQRS pattern, separate read DB |
| React 18 Frontend | ⚠️ Partial | Not enough capacity to rewrite entire UI in 9 months |
| CI/CD + IaC | ✅ Must-have | Foundation, must complete in Phase 1 |
| Event-driven (Service Bus) | ✅ Feasible | Incremental adoption, doesn't need all-at-once |
3.2 What's Realistically Achievable in 9 Months (AI-Adjusted)
✅ CAN DO (with AI multiplier):
- AI engineering foundation (Month 1)
- CI/CD + IaC foundation
- API Gateway + Strangler Fig routing
- 3-4 services extracted (Communications, Travel, Event, Reporting-read)
- React 18 for 2-3 key modules (AI-assisted component generation)
- Event-driven messaging for new services
- Observability foundation + AI-powered monitoring
- AI-ready event schema design
❌ CANNOT DO (defer):
- Full payment modernization
- Complete database decomposition for all services
- Full React 18 rewrite of ALL modules
- ML/AI features in product (foundation only)
- Performance optimization at scale
Difference vs non-AI approach: +1 service extraction, +1 React module, AI-ready data foundation laid
4. Migration Pattern Analysis
4.1 Pattern Comparison
| Pattern | Fit? | Reasoning |
|---|---|---|
| Strangler Fig | ✅ Best fit | Incremental, zero-downtime compatible, proven for monolith→microservices |
| Big Bang Rewrite | ❌ No | Zero downtime requirement eliminates this |
| Branch by Abstraction | ⚠️ Partial | Good for internal refactoring, but insufficient for full extraction |
| Parallel Run | ✅ Complement | Use in combination with Strangler Fig for high-risk modules |
| Blue-Green Deployment | ✅ Complement | For deployment strategy, not migration strategy |
4.2 Recommended: Strangler Fig + Anti-Corruption Layer
┌──────────────┐
Users ────────►│ API Gateway │
│ (Route Layer) │
└──────┬───────┘
│
┌────────────┼────────────┐
▼ ▼ ▼
┌──────────┐ ┌──────────┐ ┌──────────┐
│ New │ │ New │ │ Legacy │
│ Travel │ │ Comms │ │ Monolith │
│ Service │ │ Service │ │ (Payment,│
│ (.NET 8) │ │ (.NET 8) │ │ Event, │
└────┬─────┘ └────┬─────┘ │ Report) │
│ │ └─────┬─────┘
▼ ▼ │
┌──────────┐ ┌──────────┐ │
│ Travel │ │ Comms │ │
│ DB │ │ DB │ │
└──────────┘ └──────────┘ ▼
┌──────────┐
│ Monolith │
│ DB │
└──────────┘
Anti-Corruption Layer: Placed between new services and the legacy monolith. When the new Travel Service needs payment → calls through ACL → ACL translates to the legacy payment API. When payment is modernized later → only the ACL changes, not the Travel Service.
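The ACL described above can be sketched as a thin adapter that owns the translation between the new service's clean model and the legacy payment contract. All names below are hypothetical; the design point is that modernizing payment later only ever touches this one class.

```python
# Sketch of an Anti-Corruption Layer between the new Travel service and the
# legacy payment module. All names are hypothetical; the design point is that
# swapping out the legacy backend only ever touches this adapter.
from dataclasses import dataclass

@dataclass
class PaymentRequest:          # the new service's clean domain model
    booking_id: str
    amount: float
    currency: str

class LegacyPaymentClient:
    """Stand-in for the monolith's payment endpoint (legacy field names)."""
    def submit(self, legacy_payload: dict) -> dict:
        return {"Status": "OK", "TxnRef": "LEG-" + legacy_payload["BookingNo"]}

class PaymentAcl:
    """Translates domain model <-> legacy contract. This is the only change
    surface when payment is modernized in Phase 3+."""
    def __init__(self, client: LegacyPaymentClient):
        self._client = client

    def charge(self, req: PaymentRequest) -> str:
        legacy_payload = {               # translate to legacy field names
            "BookingNo": req.booking_id,
            "AmountCents": int(round(req.amount * 100)),
            "Curr": req.currency,
        }
        result = self._client.submit(legacy_payload)
        if result["Status"] != "OK":
            raise RuntimeError("legacy payment failed")
        return result["TxnRef"]          # translate back to a clean reference

acl = PaymentAcl(LegacyPaymentClient())
ref = acl.charge(PaymentRequest("B-1001", 450.0, "EUR"))
```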
5. Phase Prioritization Analysis
5.1 Scoring Matrix
| Module | Business Value | Extraction Difficulty | Risk if Delayed | Coupling to Payment | Priority Score |
|---|---|---|---|---|---|
| CI/CD + Infra | — | Medium | Blocks everything | None | P0 |
| Communications | Medium | Low | Low | None | P1 (quick win) |
| Travel Booking | High | High | High | High (via ACL) | P1 |
| Event Mgmt | High | High | Medium | High (via ACL) | P2 |
| Reporting | Medium | Low | Low | None | P2 |
| Workforce | Medium | Medium | Low | Low | P3 |
| Payment | Critical | Critical | Frozen Phase 1 | N/A | P3+ |
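One way to make a matrix like this explicit is a weighted score per module. The weights and the 0-3 numeric scales below are hypothetical illustrations — the priorities in the table remain judgment calls, not outputs of a formula.

```python
# Illustrative weighted scoring behind a priority matrix. The weights and the
# 0-3 scales are hypothetical; the table's P0-P3 calls are judgment, not math.
WEIGHTS = {"value": 0.4, "ease": 0.3, "urgency": 0.3}

modules = {
    # name: (business value, extraction ease, risk if delayed), each 0-3
    "Communications": (2, 3, 1),
    "Travel Booking": (3, 1, 3),
    "Event Mgmt":     (3, 1, 2),
    "Reporting":      (2, 3, 1),
    "Workforce":      (2, 2, 1),
}

def score(value: int, ease: int, urgency: int) -> float:
    return (WEIGHTS["value"] * value
            + WEIGHTS["ease"] * ease
            + WEIGHTS["urgency"] * urgency)

ranked = sorted(modules, key=lambda m: score(*modules[m]), reverse=True)
```

With these particular weights, Travel Booking scores highest and Workforce lowest, which broadly matches the table; tie-breaks between quick wins (Communications, Reporting) and high-value extractions stay a human call.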
5.2 Recommended Phase Sequence (AI-First)
Month: 1 2 3 4 5 6 7 8 9
├─────┼─────┼─────┼─────┼─────┼─────┼─────┼─────┤
Phase 0 │█████│ AI Foundation
(M1) │ AI toolchain setup, prompt library, (parallel with
│ legacy codebase AI analysis, Phase 1 infra)
│ team AI workflow onboarding
│ │
Phase 1 │█████████████│ Infra Foundation
(M1-2)│ CI/CD, IaC, API Gateway, + Strangler Fig
│ Observability, Strangler Fig + AI monitoring
│ AI-powered monitoring setup
│ │
Phase 2 │ │██████████████████│ First Extractions
(M2-4)│ │ Communications (AI quick win) Travel Booking
│ │ Travel Booking service + React pages
│ │ React 18 for Travel pages
│ │ AI-generated contract tests
│ │
Phase 3 │ │ │██████████████████│ Domain Expansion
(M4-7)│ │ │ Event Management │ + Reporting CQRS
│ │ │ Reporting (CQRS) │ + AI-ready events
│ │ │ AI-ready event schema│
│ │
Phase 4 │ │██████████████████│ Harden + Plan
(M7-9)│ │ Stabilization │ Payment strategy
│ │ Performance │ AI feature backlog
│ │ Payment planning │
│ │ AI feature roadmap│
Phase 0 (AI Foundation) runs in parallel with Phase 1 infra setup — same Month 1, no additional calendar time. But every subsequent phase moves faster thanks to AI tooling already being in place.
6. Key Technical Decisions Needed
| # | Decision | Options | Recommendation | Reasoning |
|---|---|---|---|---|
| 1 | Database strategy | Shared DB → Per-service DB | Phased: shared DB view first, then split | Zero downtime + 5 engineers = can't split all DBs at once |
| 2 | API Gateway | Azure API Mgmt / YARP / Ocelot | YARP (.NET-based reverse proxy) | .NET team, lightweight, supports Strangler Fig routing |
| 3 | Service communication | Sync (REST/gRPC) / Async (Service Bus) | Both: REST for queries, Service Bus for events | Event-driven where possible, sync for real-time needs |
| 4 | Frontend strategy | Full React rewrite / Micro-frontends / Incremental | Incremental: React for new pages, legacy UI for unchanged pages | Can't rewrite all UI with 5 engineers |
| 5 | Data migration | ETL / CDC / Dual-write | CDC (Change Data Capture) | Real-time sync without modifying legacy code |
| 6 | Testing strategy | E2E first / Contract first | Contract testing (Pact) | Ensures backward compatibility between services |
| 7 | AI tooling | Copilot / Cursor / CodeRabbit / Custom | Copilot + CodeRabbit + custom prompts | Enterprise-ready, team-scalable, auditable |
| 8 | AI governance | Free-for-all / Strict gates / Balanced | Balanced: AI first-pass, human final gate | Productivity + quality, especially for payment code |
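The idea behind Decision 6 (contract testing) can be sketched without any framework: the consumer pins the exact response shape it relies on, and the migrated .NET 8 service must keep honoring it before the gateway routes traffic to it. A real project would use Pact; the endpoint and field names below are hypothetical.

```python
# Sketch of consumer-driven contract testing (Decision 6). A real project
# would use Pact; this only shows the concept. Endpoint/fields hypothetical.

# The contract that existing API consumers depend on.
BOOKING_CONTRACT = {
    "endpoint": "/api/bookings/{id}",
    "required_fields": {"id": str, "status": str, "totalAmount": float},
}

def verify_response(contract: dict, response: dict) -> list[str]:
    """Return a list of contract violations (empty list = compatible)."""
    violations = []
    for name, expected_type in contract["required_fields"].items():
        if name not in response:
            violations.append(f"missing field: {name}")
        elif not isinstance(response[name], expected_type):
            violations.append(f"wrong type for {name}")
    return violations

# The new service's response must produce zero violations before the
# Strangler Fig gateway routes any traffic to it.
new_service_response = {"id": "B-1001", "status": "CONFIRMED", "totalAmount": 450.0}
violations = verify_response(BOOKING_CONTRACT, new_service_response)
```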
7. What the Assessors Are Really Looking For
Reading the brief carefully, assessors evaluate:
| Criteria | What They Want to See | What They DON'T Want |
|---|---|---|
| Technical Proficiency | Deep understanding of patterns (Strangler Fig, CQRS, ACL), knowing when to use what | Buzzword dumping, listing tech without explaining why |
| Analytical Skills | Clear trade-off reasoning, logically justified priorities | "Extract all 6 services in 9 months" — unrealistic |
| Attention to Detail | Constraints addressed specifically (zero downtime HOW, payment freeze HOW) | Generic migration plan that doesn't mention constraints |
| AI-first Mindset | Smart AI usage + honest declaration | Copy-pasting AI output without validation |
| Leadership Judgment | Knowing when to say "NO" — what NOT to do in 9 months | Over-promising, lacking trade-offs |
Hidden Signal: The Brief Tests Judgment, Not Knowledge
Anyone can Google "microservices migration patterns". What assessors want to see is:
- With 5 people and 9 months, what do you sacrifice?
- With zero downtime, how do you handle data consistency?
- With payment frozen, how do you decouple other modules?
→ Answers must be specific to these constraints, not generic best practices.