Project Management

Project governance, engineering process, stakeholder management, and quality assurance strategy for a team of 5 engineers over 9 months with AI-first approach.

1. Project Management Framework

Why Not Pure Scrum?

Framework | Fit? | Reasoning
Pure Scrum | ❌ | 5 people don't need heavy ceremonies. 4-hour sprint planning for 5 people = waste
Pure Kanban | ⚠️ | Good for flow, but lacks checkpoints for migration milestones
Shape Up (modified) | ✅ | 6-week cycles + cooldown. Fits migration phases. Appetite-based (fixed time, variable scope)
SAFe | ❌ | Overkill for 5 people. SAFe is for 50+ engineers

Decision: Shape Up modified + Kanban within cycles. 6-week cycles aligned with migration phases. Appetite-based estimation — fixed time, variable scope. No story points.

Project Rhythm — 6 Cycles × 6 Weeks

Each 6-week cycle: Week 1 → Shaping (define appetite, scope bets) | Week 2–5 → Building (Kanban flow, daily standups) | Week 6 → Cooldown (retro, tech debt, learning)

Estimation — Appetite-Based (No Story Points)

Appetite | Duration | Example
Small Batch | ≤ 1 week | Communications service extraction (AI handles most)
Big Batch | 2–4 weeks | Travel Booking full extraction + tests + React pages
Epic | 1 cycle (6 weeks) | Event Management + Reporting + related React modules

Rule: If a task exceeds 6 weeks → must be broken down further. No "ongoing" tasks.

Task Board

Column | Meaning
Backlog | Unshaped ideas
Shaped | Scoped, estimated, assigned
Building | In active dev (WIP ≤ 5)
Review | CodeRabbit + human review
Staging | QA on staging env
Done | In production

WIP Limit: 5 (1 per engineer max). No multitasking. If blocked → swarm (help each other unblock).
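The WIP rule is mechanical enough to sketch. A minimal illustration of the board's single constraint (the `Board` class and card names here are hypothetical, not a real tool):

```python
# Illustrative sketch of the board's WIP rule: a card may enter
# "Building" only while fewer than WIP_LIMIT cards are already there.
WIP_LIMIT = 5  # one card per engineer, no multitasking

class Board:
    def __init__(self):
        self.columns = {"Backlog": [], "Shaped": [], "Building": [],
                        "Review": [], "Staging": [], "Done": []}

    def move(self, card, src, dst):
        if dst == "Building" and len(self.columns["Building"]) >= WIP_LIMIT:
            # Blocked: swarm on an in-flight card instead of starting a new one.
            return False
        self.columns[src].remove(card)
        self.columns[dst].append(card)
        return True
```

A sixth pull into Building is rejected, which is exactly the "swarm instead of starting new work" rule.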

2. Ceremonies & Cadence

Ceremony | Frequency | Duration | Purpose | Who
Daily Standup | Daily | 10 min | Blockers only. No status reporting — use the board | All 5
Cycle Shaping | Every 6 weeks | 2 hours | Define appetite, scope bets, assign pitches | Tech Lead + team
Weekly Demo | Weekly | 30 min | Show working software to stakeholders | Rotating presenter
Cycle Retro | Every 6 weeks | 1 hour | What worked, what didn't, AI effectiveness review | All 5
Architecture Review | Bi-weekly | 1 hour | Review ADRs, service boundaries, tech decisions | Tech Lead + senior
AI Workflow Check | Weekly | 15 min | AI metrics review, prompt calibration, governance | Tech Lead

Total ceremony time: ~3.5 hours/week — under 10% of working time. The rest = build.

Weekly Cadence

Daily Schedule

09:00 Standup (10 min) — blockers only
09:10 Deep work — NO meetings until 12:00
14:00 Open for ad-hoc pairing, reviews
16:00 Async review — PRs, CodeRabbit comments
17:00 AI batch runs scheduled (overnight migration)

Weekly Schedule

Monday AM: Weekly planning (30 min)
Tue–Thu: Heads-down development (async standups)
Wednesday AM: Architecture review (bi-weekly, 1 hr)
Friday AM: Code review session (60 min cross-service)
Friday PM: Demo (30 min) + AI metrics check (15 min)

3. Team Structure & Roles

Bus Factor Mitigation

With 5 people, 1 person leaving = 20% capacity lost. Every mitigation strategy is critical:

Strategy | How | Outcome
Pair on critical modules | 2 people know each service. No single ownership. | Bus factor ≥ 2 for every module
Rotate reviewer | Code review rotates — everyone reviews everyone's code | Cross-knowledge across codebase
AI code walkthrough | Weekly rotation: Senior explains Travel, Backend explains Event... | Everyone understands every service
DevOps cross-train | Every backend eng knows how to deploy their own service | DevOps doesn't become a bottleneck

Service Ownership Map

Service | Primary Owner | Secondary | Frontend
Travel Booking | D2 (Sr Backend) | D1 (Tech Lead) | D4 + D5
Event Management | D3 (Backend) | D2 (Sr Backend) | D4
Workforce | D4 (Fullstack) | D1 (Tech Lead) | D4 + D5
Communications | D4 (Fullstack) | D3 (Backend) | D4
Reporting | D5 (FE/DevOps) | D3 (Backend) | D5 + D4
Payment ACL | D1 (Tech Lead) | D2 (Sr Backend) | —
API Gateway | D1 (Tech Lead) | D5 (FE/DevOps) | —
CI/CD + Infra | D5 (FE/DevOps) | D1 (Tech Lead) | —
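The "bus factor ≥ 2" rule from the ownership map is checkable at retro time. A sketch (the map is copied from the table above, primary and secondary owners only; the checking function itself is an assumption):

```python
# Ownership map mirroring the table above, plus a check that every
# service has at least two people who know it (bus factor >= 2).
OWNERS = {
    "Travel Booking":   {"D2", "D1"},
    "Event Management": {"D3", "D2"},
    "Workforce":        {"D4", "D1"},
    "Communications":   {"D4", "D3"},
    "Reporting":        {"D5", "D3"},
    "Payment ACL":      {"D1", "D2"},
    "API Gateway":      {"D1", "D5"},
    "CI/CD + Infra":    {"D5", "D1"},
}

def understaffed(owners, minimum=2):
    """Return the services whose owner count falls below the minimum."""
    return [svc for svc, people in owners.items() if len(people) < minimum]
```

An empty result means the "≥2 per service" team-health target in section 9 is met.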

4. Stakeholder Management

Communication Plan

Stakeholder | Channel | Frequency | Content
C-Level / Sponsor | Executive summary (1-page) | Bi-weekly | Risk status, milestone progress, AI ROI metrics, budget burn
Product Owner | Demo + written update | Weekly | Working features, migration progress, upcoming changes
Business Users | Change notification | Per migration phase | What's changing, what's not, who to contact
External API Consumers | API deprecation notice | 30 days ahead | Breaking changes, migration guides, new endpoints
Engineering Team | Standup + board | Daily | In-progress work, blockers, decisions needed
Security / Compliance | Audit report | Monthly | SAST results, AI governance, payment module status

The "No" Framework — Expectation Management

Ask | Response
"Can we add feature X?" | "Yes, if we defer [Y]. Here's the trade-off."
"Can we speed up?" | "We're at 2x AI capacity. Adding people adds coordination cost. We can re-scope instead."
"Why isn't Payment modernized?" | "By design. Constraint: Payment frozen Phase 1. Plan exists for Phase 2. Here's the ACL keeping it safe."
"Can we skip testing?" | "No. With 75% AI-generated code, testing IS the quality gate. Non-negotiable."
"Competitor launched feature Z" | "Noted. Added to backlog. Current priority: foundation first."

Escalation Path

P4 (Low): Engineer fixes → PR → merge. No escalation.

P3 (Medium): Engineer + Tech Lead discuss. Fix within cycle. Mention in weekly update.

P2 (High): Tech Lead decides → immediate fix. Notify PO same day. Include in exec summary.

P1 (Critical): War room (all hands). Tech Lead → Sponsor within 1 hour. Hourly updates. Post-mortem within 48h.

5. Development Lifecycle

Definition of Done

Code

Feature implemented and builds successfully
AI-generated code reviewed by human (mandatory)
Follows Clean Architecture structure
No TODO/HACK comments left untracked

Testing

Unit tests pass (≥80% coverage for new code)
Contract tests pass (Pact — for API changes)
Integration tests pass (DB, event bus)
No regression in existing tests

Security

SAST scan clean (CodeQL)
No secrets in code
Payment-related: 2 human reviewers approved

Observability

Structured logging for key operations
OpenTelemetry trace spans for cross-service calls
Health check endpoint working
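The "structured logging" item can be made concrete with nothing but the standard library. A minimal sketch (field names and the `service` attribute are illustrative assumptions, not the project's actual log schema):

```python
import json
import logging
import sys

class JsonFormatter(logging.Formatter):
    """Emit one JSON object per log line so logs are machine-queryable."""
    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "service": getattr(record, "service", "unknown"),
            "message": record.getMessage(),
        })

handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(JsonFormatter())
log = logging.getLogger("travel-booking")
log.addHandler(handler)
log.setLevel(logging.INFO)

# The `extra` dict attaches the service field to the record.
log.info("booking created", extra={"service": "travel-booking"})
```

In a real service the formatter would also carry the trace/span IDs that OpenTelemetry propagates, so log lines join up with the cross-service traces.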

Documentation

API changes reflected in OpenAPI spec
ADR created for architecture decisions
README updated if setup instructions changed

Deployment

Docker image builds successfully
Deployed to staging and tested
Monitoring/alerting configured for new endpoints

6. Code Review Process

Git Workflow

Branch Naming

feature/{module}-{description}
fix/{module}-{description}
infra/{description}

Rules

  • PR required for main (no direct push)
  • CodeRabbit auto-review on PR create
  • ≥1 human approval required
  • Payment-related: ≥2 human approvals
  • CI must pass (build + test + security)
  • Squash merge to main (clean history)
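A CI step could enforce the branch-naming patterns above. A sketch (the slug charset, lowercase dash-separated segments, is an assumption about the convention):

```python
import re

# feature/{module}-{description}, fix/{module}-{description}, infra/{description}.
# The first two require at least two dash-separated segments (module + description).
BRANCH_RE = re.compile(
    r"^(?:feature|fix)/[a-z0-9]+(?:-[a-z0-9]+)+$"
    r"|^infra/[a-z0-9]+(?:-[a-z0-9]+)*$"
)

def branch_name_ok(name: str) -> bool:
    """True if the branch name follows the team convention."""
    return BRANCH_RE.match(name) is not None
```

Wired into CI, a non-conforming branch name fails the pipeline before review begins.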

ADR Process

Triggers: Any decision affecting service boundaries, database choice, communication patterns, technology selection, security model, or AI governance rules.

Flow: Engineer drafts ADR (AI-assisted) → Tech Lead reviews (24h) → Team review in Architecture Review → Accepted/Rejected/Amended → Stored in /docs/adrs/ADR-NNN-title.md
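The ADR-NNN numbering is easy to automate so two engineers never claim the same number. An illustrative helper (the function and its filename-list interface are hypothetical):

```python
import re

def next_adr_filename(existing: list[str], title_slug: str) -> str:
    """Given filenames already in /docs/adrs, compute the next ADR-NNN name."""
    nums = [int(m.group(1)) for f in existing
            if (m := re.match(r"ADR-(\d{3})-", f))]
    return f"ADR-{max(nums, default=0) + 1:03d}-{title_slug}.md"
```

A pre-commit hook or small CLI could call this against a directory listing before the engineer drafts the ADR.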

7. Release Management

Service Deployment (2–3× per week)

Every merged PR → auto-deploy to staging. Manual approval for production. Rolling update (zero downtime). Feature flags for incomplete features.

Module Go-Live (once per phase)

Full module cutover: traffic routes from legacy → new. Canary: 5% → 25% → 50% → 100%. Rollback via YARP < 5 min. Stakeholders notified 1 week before.
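The 5% → 25% → 50% → 100% canary steps rely on deterministic bucketing: a user must land in the same bucket every time, so raising the percentage only adds users to the new service and never flips anyone back to legacy mid-session. A sketch (the hashing scheme is an illustrative assumption, not the YARP routing config itself):

```python
import hashlib

CANARY_STEPS = [5, 25, 50, 100]  # rollout percentages from the plan above

def routes_to_new_service(user_id: str, rollout_pct: int) -> bool:
    """Deterministically place user_id in a 0-99 bucket; route to the
    new service if the bucket falls under the current rollout percentage."""
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return bucket < rollout_pct
```

Because the bucket is fixed per user, each canary step is a strict superset of the previous one, and dropping the percentage to 0 is the < 5 min rollback path.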

Database Migration (1–2 total)

Per-service DB cutover. CDC running weeks before. Blue-green: new DB + old DB fallback. Data verification scripts mandatory.
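A data verification script can be as simple as comparing a cheap per-table signature on the old and new databases. A sketch using SQLite for illustration (the production script would target the real engines; matching signatures is a necessary, not sufficient, cutover check):

```python
import sqlite3

def table_checksum(conn, table):
    """Row count plus an order-insensitive hash of every row. Compare the
    result from the old and new databases within one verification run."""
    rows = conn.execute(f"SELECT * FROM {table}").fetchall()
    return len(rows), sum(hash(r) for r in rows) & 0xFFFFFFFF
```

Run against both sides after CDC has caught up; any mismatch blocks the blue-green cutover.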

Canary Release Process

Feature Flags

Flag Name | Purpose
travel.new-service | Route traffic to new Travel service
event.new-service | Route traffic to new Event service
react.travel-ui | Show new React UI for Travel
react.event-ui | Show new React UI for Events
ai.smart-routing | Enable AI-based API routing
ai.anomaly-detection | Enable AI monitoring alerts
reporting.cqrs-mode | Use CQRS read models vs legacy

Tool: Azure App Configuration. Rules: All new services behind flags. Per-tenant and per-region capable. Kill switch → fallback to legacy instantly. Flags cleaned up every cycle.
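The rules above imply a resolution order: kill switch first, then any per-tenant override, then the global default. A sketch of that precedence (flag names mirror the table; the data layout and precedence modelling are assumptions, not the Azure App Configuration API):

```python
# Illustrative flag store: kill switch beats tenant override beats default.
FLAGS = {
    "travel.new-service": {
        "default": True,
        "tenants": {"acme": False},  # this tenant pinned to legacy
        "kill_switch": False,
    },
}

def is_enabled(name: str, tenant: str) -> bool:
    flag = FLAGS[name]
    if flag["kill_switch"]:
        return False  # instant fallback to legacy for everyone
    return flag["tenants"].get(tenant, flag["default"])
```

Flipping `kill_switch` is the "fallback to legacy instantly" path: it overrides every tenant and default in one write.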

8. Quality Assurance

Testing Pyramid

Level | Tool | AI Contribution | Coverage Target
Unit Tests | xUnit / NUnit | 80% AI-generated | ≥80% on new code
Contract Tests | Pact | 70% AI-generated | Every service boundary
Integration Tests | Docker Compose in CI | 50% AI-generated | DB, event bus, APIs
E2E Tests | Cypress / Playwright | 30% AI-assisted | Critical paths only
Manual Testing | Human exploratory | 0% AI | Friday demo sessions

Quality Gates

PR Level

Build passes
Unit tests pass (≥80% coverage on changed files)
Contract tests pass
SAST clean
CodeRabbit: no critical findings
Human review: approved
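The PR-level gates above roll up into a single merge decision, including the stricter two-approval rule for payment-related changes. An illustrative roll-up (function name and inputs are hypothetical, standing in for what the CI pipeline would compute):

```python
def pr_gate(build_ok, coverage_pct, contract_ok, sast_findings,
            critical_ai_findings, human_approvals, payment_related=False):
    """Return the list of failed PR-level checks; an empty list allows merge."""
    failures = []
    if not build_ok:
        failures.append("build")
    if coverage_pct < 80:            # >=80% coverage on changed files
        failures.append("coverage")
    if not contract_ok:
        failures.append("contract-tests")
    if sast_findings:                # SAST (CodeQL) must be clean
        failures.append("sast")
    if critical_ai_findings:         # CodeRabbit: no critical findings
        failures.append("coderabbit")
    required = 2 if payment_related else 1
    if human_approvals < required:
        failures.append("human-review")
    return failures
```

Returning the failed checks, rather than a bare boolean, lets the bot comment on the PR with exactly what still blocks it.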

Staging Level

Integration tests pass
E2E critical paths pass
Performance baseline not degraded
No new security vulnerabilities

Production Level

All staging gates pass
Feature flag ready (kill switch)
Monitoring/alerting configured
Rollback plan documented
Tech Lead approval

9. Metrics & Reporting

Delivery Metrics

Metric | Target
Services migrated | 5/5 (60% by M6)
API endpoints migrated | 120 total
React modules live | 3–4/5
Cycle progress | 6 cycles on time

Quality Metrics

Metric | Target
Test coverage (new code) | ≥85%
Contract test pass rate | 100%
Production incidents (P1/P2) | 0
Zero downtime maintained | ✅ Yes

AI Metrics

Metric | Value
AI-generated code ratio | ~68%
AI code bug rate vs human | 0.8x (lower)
AI PR rejection rate | ~12%
Effective multiplier (measured) | 1.9x
AI tool cost (monthly) | ~$1,050

Team Health

Metric | Value
Velocity trend | ↗ ↗ → → (stabilizing)
Team satisfaction (retro) | 4.2/5
Overtime hours this cycle | ≤2 (acceptable)
Bus factor per service | ≥2 (met)

Reporting Cadence

Report | Audience | Frequency | Content
Health Dashboard | Team | Real-time | All metrics above (live board)
Weekly Demo | Product Owner + Business | Weekly | Working features + metrics
Exec Summary | C-Level / Sponsor | Bi-weekly | 1-page: milestones, risks, decisions
AI ROI Report | Sponsor | Monthly | AI cost vs productivity, quality comparison
Cycle Report | All stakeholders | Every 6 weeks | Full review: delivered, deferred, learnings
Post-Mortem | Team + stakeholders | Per P1/P2 incident | RCA, prevention, action items

10. Knowledge Management

Documentation Hierarchy

/docs/
├── adrs/ ← Architecture Decision Records
├── runbooks/ ← Operational runbooks
├── api/ ← OpenAPI specs (auto-generated)
├── onboarding/ ← New member guides
└── migration/ ← Migration-specific docs

Rule: Docs live with code (in repo). No separate wiki — prevents doc drift. AI generates first draft, human reviews.

New Engineer Onboarding

Day 1: Dev env setup (AI-assisted), read architecture + AI workflow guides, access repos/CI/Azure

Day 2–3: Pair with Senior, run test suite locally, deploy to staging, review 3 recent PRs

Day 4–5: First small task (bug fix), full PR flow, first AI-assisted dev task

Week 2: Own small feature end-to-end, attend architecture review, AI code walkthrough

Target: Productive contributor by Day 10. AI tools reduce onboarding time by ~40%.

11. Continuous Improvement

Retrospective Framework (Cycle-end, 1 hour)

START

  • What should we start doing?
  • New AI tools/prompts to try?
  • What process is missing?

STOP

  • What's wasting our time?
  • Which AI patterns aren't working?
  • What ceremonies are useless?

CONTINUE

  • What's working well?
  • Highest ROI AI workflows?
  • What should we NOT change?

AI-Specific Questions (additional 15 min)

  • Where did AI help most this cycle?
  • Where did AI cause rework? (hallucination tracking)
  • Is our 2x multiplier holding? Actual measurement?
  • Any prompt library updates needed?
  • AI governance: any near-misses?

Learning Budget

Per engineer, per cycle (6 weeks): 4 hours intentional learning + 2 hours AI experimentation. Cooldown week (Week 6): focus on tech debt, learning, experimentation. Investment: ~6 hours per cycle ≈ 2.5% of working time.