Deliverable 4.4 — Trade-Off Log
Requirement: What are you intentionally not optimizing? What technical debt are you accepting? What should be revisited in 6 months?
Source: Submission.md, Constraints Analysis.md, Tech Stack Analysis.md, Strategy.md
1. Intentionally Not Optimizing
| # | What We Chose | Trade-Off Accepted | Over (Alternative) | Why This Trade-Off |
|---|---|---|---|---|
| T1 | Payment stays in monolith | Tech debt: ACL adapter, legacy maintenance | Migrate Payment early | Constraint (frozen Phase 1) + highest risk (PCI, financial). ACL provides clean bridge. Cost of delay: ACL maintenance ~0.5 MM/month |
| T2 | Azure Container Apps | Less control over networking, pod config | Kubernetes (AKS) | 5 engineers can't manage K8s cluster. Container Apps = managed, auto-scale, zero ops. Cost: less fine-grained control when debugging network issues |
| T3 | Azure SQL everywhere | Not polyglot-optimized per service | Cosmos DB for Comms, Redis for cache, etc. | One DB technology = one skill to maintain. Polyglot = multiple operational burdens for 5 eng. Cost: Comms messages could be faster with document store |
| T4 | Incremental React rewrite | Legacy Payment UI in iframe. UX inconsistency on 1-2 modules | Full React rewrite for all modules | 5 engineers can't rewrite all frontend + all backend simultaneously. Cost: Payment page looks "old" next to new modules |
| T5 | Single region (Active-Passive) | Higher latency for EU/US users (~120-200ms) | Multi-region active-active | 40K users served well with SEA primary + CDN. Active-active = double infra complexity. Cost: ~120ms extra for US users (acceptable for enterprise app) |
| T6 | Contract tests over heavy E2E | Fewer full-flow automated tests | Comprehensive E2E test suite (Playwright) | E2E tests = slow, flaky, high maintenance. Contract tests verify boundaries efficiently. Cost: some integration gaps only caught in staging/canary |
| T7 | Shared DB views during CDC transition | Temporary coupling via shared DB views | Full data decomposition from Day 1 | Per-service DBs come at each service's go-live, but during transition CDC bridges the gap. Cost: short coupling window per module |
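The ~0.5 MM/month cost of delay in T1 can be set against total team capacity with a little arithmetic. This is a minimal sketch, assuming the ACL cost accrues at a flat rate in every month of the 9-month plan (an assumption; this log does not say when the ACL goes live) and that nominal capacity is simply engineers × months. The function names are illustrative only.

```python
# Figures taken from this log: 5 engineers, 9 months, ACL maintenance ~0.5 MM/month.
ENGINEERS = 5
MONTHS = 9
ACL_COST_MM_PER_MONTH = 0.5


def cumulative_acl_cost(months_deferred: int,
                        cost_per_month: float = ACL_COST_MM_PER_MONTH) -> float:
    """Man-months spent maintaining the ACL while Payment stays in the monolith.

    Assumes a flat monthly cost for the whole deferral period (hypothetical).
    """
    return months_deferred * cost_per_month


def share_of_capacity(cost_mm: float,
                      engineers: int = ENGINEERS,
                      months: int = MONTHS) -> float:
    """Cost as a fraction of nominal team capacity (engineers x months)."""
    return cost_mm / (engineers * months)


if __name__ == "__main__":
    cost = cumulative_acl_cost(MONTHS)
    print(f"ACL maintenance over {MONTHS} months: {cost} MM")
    print(f"Share of {ENGINEERS * MONTHS} MM capacity: {share_of_capacity(cost):.0%}")
```

Under those assumptions, deferring Payment for the full 9 months costs 4.5 MM, which is 10% of the 45 MM nominal capacity.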
2. Technical Debt Accepted
| # | Debt | Why Accepted | Severity | Cost of Delay |
|---|---|---|---|---|
| D1 | Legacy Payment UI in iframe (no React rewrite) | Works, zero risk, user doesn't notice a difference for basic payment flows | Low | UX inconsistency. Fix when Payment is modernized |
| D2 | Comms templates hardcoded initially | Quick migration. Template engine arrives once Phase 2 is complete | Low | Duplicate templates, manual updates. ~1 week to fix later |
| D3 | Manual IaC for some staging resources | Automation ROI is not worth it for a one-time staging setup | Very Low | Manual drift possible, but staging only |
| D4 | Limited load testing before Phase 3 | Load testing is meaningful only when services are stable and connected | Medium | Performance issues discovered later. Mitigated by canary release (gradual traffic) |
| D5 | Reporting queries not fully optimized | Materialized views from events = good enough, not optimal | Medium | Slow queries for complex reports. Fix using production query patterns |
| D6 | No service mesh (Istio/Linkerd) | 5 services + YARP + Polly handle current needs | Low | If services grow > 15, inter-service communication gets complex. Add mesh then |
| D7 | Event schema not fully governed | Schema Registry set up in Phase 0, but enforcement only from Phase 2 | Medium | Schema drift between services. Mitigated by contract tests (Pact) |
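D7 relies on contract tests to catch schema drift before registry enforcement begins. The log names Pact for this; the sketch below does not use Pact's API and only illustrates the underlying idea with a hand-rolled check: a consumer declares the fields and types it depends on, and a sample event from the producer is verified against that declaration. The event name `BookingCreated`, its fields, and the function name are all hypothetical.

```python
from typing import Any

# Hypothetical consumer expectation: the fields a consumer reads from a
# hypothetical "BookingCreated" event, with the types it expects.
CONSUMER_CONTRACT: dict[str, type] = {
    "booking_id": str,
    "user_id": str,
    "amount_cents": int,
}


def contract_violations(event: dict[str, Any], contract: dict[str, type]) -> list[str]:
    """Return human-readable violations; an empty list means the event conforms.

    Extra fields on the event are allowed, so a producer can add fields
    without breaking existing consumers.
    """
    problems = []
    for field, expected_type in contract.items():
        if field not in event:
            problems.append(f"missing field: {field}")
        elif not isinstance(event[field], expected_type):
            actual = type(event[field]).__name__
            problems.append(f"{field}: expected {expected_type.__name__}, got {actual}")
    return problems


if __name__ == "__main__":
    ok_event = {"booking_id": "b-1", "user_id": "u-9", "amount_cents": 1250, "note": "extra"}
    # Simulated drift: user_id dropped, amount_cents retyped as a string.
    drifted_event = {"booking_id": "b-2", "amount_cents": "12.50"}
    print(contract_violations(ok_event, CONSUMER_CONTRACT))
    print(contract_violations(drifted_event, CONSUMER_CONTRACT))
```

A check like this runs in the producer's pipeline, so drift is reported at the service boundary and does not wait for staging.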
Debt Priority Matrix
| | High cost to fix later | Low cost to fix later |
|---|---|---|
| Causes issues soon | D5: Report queries; D7: Schema governance | D2: Templates; D3: Manual IaC |
| No immediate issues | D1: Payment iframe; D4: Load testing | D6: No service mesh |
Fix order: D7 → D5 → D4 → D2 → D1 → D3 → D6
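One way to sanity-check the fix order is to score each debt item from its severity and its matrix quadrant. The weights below are hypothetical and are not the method used to produce the order above; the sketch only checks that the listed order never places a lower-scoring item ahead of a higher-scoring one.

```python
from dataclasses import dataclass

# Hypothetical weights; the log does not define a scoring scheme.
SEVERITY_WEIGHT = {"Very Low": 0, "Low": 1, "Medium": 2}


@dataclass(frozen=True)
class Debt:
    id: str
    severity: str        # from the Severity column
    issues_soon: bool    # matrix row: "causes issues soon"
    costly_later: bool   # matrix column: "high cost to fix later"

    @property
    def score(self) -> int:
        return SEVERITY_WEIGHT[self.severity] + int(self.issues_soon) + int(self.costly_later)


DEBTS = [
    Debt("D1", "Low", False, True),
    Debt("D2", "Low", True, False),
    Debt("D3", "Very Low", True, False),
    Debt("D4", "Medium", False, True),
    Debt("D5", "Medium", True, True),
    Debt("D6", "Low", False, False),
    Debt("D7", "Medium", True, True),
]

LISTED_FIX_ORDER = ["D7", "D5", "D4", "D2", "D1", "D3", "D6"]


def scores_in_listed_order() -> list[int]:
    """Scores of the debt items, in the order the log says to fix them."""
    by_id = {d.id: d for d in DEBTS}
    return [by_id[i].score for i in LISTED_FIX_ORDER]


def is_consistent_with_listed_order() -> bool:
    """True if scores never increase along the listed fix order."""
    s = scores_in_listed_order()
    return all(a >= b for a, b in zip(s, s[1:]))


if __name__ == "__main__":
    print(scores_in_listed_order())
    print(is_consistent_with_listed_order())
```

With these weights the scores along the listed order are 4, 4, 3, 2, 2, 1, 1, so the order is consistent with the score. The ties (D7/D5, D2/D1, D3/D6) are not resolved by it and reflect judgment the score does not capture.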
3. Revisit in 6 Months
| # | Item | Current State | Revisit Question | Trigger |
|---|---|---|---|---|
| R1 | Payment modernization | Frozen. ACL bridge only | Start migration planning? Extract to .NET 8 service? | All other services stable + team comfortable + PCI review done |
| R2 | AI multiplier accuracy | 2x projected | Was 2x realistic? Measure actual vs projected velocity | Monthly AI metrics dashboard reports measured actual output |
| R3 | Database decomposition | Per-service DBs for all 5 services | All DBs truly independent? Any lingering shared views? | Check whether CDC is still running anywhere; it should be decommissioned |
| R4 | React coverage | 3-4 modules in React 18 | Which legacy pages still not migrated? Need more frontend? | User feedback on UX inconsistency between new/old modules |
| R5 | Multi-region active-active | Active-passive (SEA primary, AU failover) | User growth justifies US/EU region? SLA requires <100ms? | Monitor: latency by region, user distribution changes |
| R6 | Event Sourcing adoption | Events captured but not sourced | Event sourcing for high-value aggregates (Booking)? | Need audit trail? Temporal queries? Business requests analytics |
| R7 | Service Bus → Kafka | Azure Service Bus (managed) | Volume exceeds Service Bus limits? Need streaming? | > 10K events/second sustained (unlikely at 40K users) |
| R8 | Container Apps → AKS | Container Apps (serverless) | Need more control? Team grown? Service count > 15? | Platform limitations hit, team size > 10 engineers |
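The numeric triggers for R7 and R8 are concrete enough to check mechanically at review time. This is a minimal sketch, assuming a hypothetical `PlatformSnapshot` of metrics the team would collect. The thresholds come from the table; combining the R8 conditions with "or" is an assumption, and the qualitative "platform limitations hit" trigger is left out.

```python
from dataclasses import dataclass


@dataclass
class PlatformSnapshot:
    """Hypothetical metrics a team might gather for the 6-month review."""
    sustained_events_per_sec: float
    service_count: int
    team_size: int


def revisit_flags(s: PlatformSnapshot) -> dict[str, bool]:
    """Map revisit items to whether their numeric trigger has fired.

    R7: more than 10K events/second sustained.
    R8: service count above 15, or team size above 10 engineers
        (joining the two with "or" is an assumption of this sketch).
    """
    return {
        "R7 Service Bus -> Kafka": s.sustained_events_per_sec > 10_000,
        "R8 Container Apps -> AKS": s.service_count > 15 or s.team_size > 10,
    }


if __name__ == "__main__":
    # Illustrative values only: 5 services and 5 engineers as in this plan,
    # with a made-up event rate well under the R7 threshold.
    today = PlatformSnapshot(sustained_events_per_sec=200, service_count=5, team_size=5)
    for item, fired in revisit_flags(today).items():
        print(f"{item}: {'revisit' if fired else 'no action'}")
```

The other triggers (R1 to R6) depend on judgment or user feedback, so they stay as questions for the review meeting.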
6-Month Review Checklist
Month 7 Review Meeting (after Phase 2 complete):
□ All 5 services live and stable? (metrics: error rate, latency, SLA)
□ AI 2x multiplier achieved? (measure: actual MM delivered vs projected)
□ Payment ACL reliable? (measure: ACL error rate, circuit breaker opens)
□ Team health? (burnout indicators, retention risk)
□ Technical debt acceptable? (list above, reassess severity)
□ Scope for next 6 months? (Payment migration, ML features, multi-region)
Month 9 Handoff Document should include:
□ What was deferred and why
□ Ready-to-execute plan for Payment modernization
□ Capacity needed for Phase 4 (post-9-months)
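The "AI 2x multiplier achieved?" check in the Month 7 review compares delivered man-months against projection. This is a minimal sketch of that comparison, assuming "2x" means the team delivers twice the man-months its headcount alone would provide. That reading is an assumption, and the function name and sample figures are hypothetical.

```python
def realized_multiplier(delivered_mm: float, engineers: int, months: float) -> float:
    """Delivered output divided by nominal capacity (engineers x months).

    A result of 2.0 would match the projected 2x multiplier under the
    interpretation stated above (an assumption, not defined in this log).
    """
    nominal_mm = engineers * months
    if nominal_mm <= 0:
        raise ValueError("engineers and months must be positive")
    return delivered_mm / nominal_mm


if __name__ == "__main__":
    # Hypothetical figures: 5 engineers over 6 months give 30 MM nominal capacity.
    print(realized_multiplier(delivered_mm=45, engineers=5, months=6))
```

If the realized figure falls short of 2.0, the capacity estimate for the remaining scope should be recomputed with the measured value.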
4. Trade-Off Philosophy
Core principle: OPTIMIZE FOR DELIVERY UNDER CONSTRAINTS
With 5 engineers + 9 months:
✅ Choose "good enough" over "perfect"
✅ Choose managed services over self-hosted (less ops)
✅ Choose one technology over polyglot (less expertise spread)
✅ Choose incremental over complete (ship value early)
✅ Choose deferral over scope creep (say no explicitly)
The discipline is NOT doing things:
"What you don't do defines you as much as what you do."
Every deferred item = capacity freed for what matters now.