Testing Strategy
Principle: Test pyramid — many unit tests, few E2E. Contract tests at service boundaries.
Constraint: 5 engineers → testing must be automated, no dedicated QA team
AI Role: AI generates 60–70% of test code; humans validate edge cases and business rules
Source: Architect.md, Development Guide.md, Deployment.md, Deliverable 4.3 - Failure Modeling.md
1. Test Pyramid
            ╱╲
           ╱  ╲
          ╱ E2E╲            5%: Critical user journeys only
         ╱______╲           Playwright, run nightly
        ╱        ╲
       ╱Integration╲        15%: Service + DB + Service Bus
      ╱____________╲        Testcontainers, run on PR
     ╱              ╲
    ╱ Contract Tests ╲      20%: Service boundary validation
   ╱__________________╲     Pact, run on PR (blocking)
  ╱                    ╲
 ╱      Unit Tests      ╲   60%: Business logic, domain model
╱________________________╲  xUnit, run on every commit
Target Coverage: ≥80% line coverage (unit + integration combined)
Business logic: ≥90% branch coverage
Infrastructure: ≥60% (Bicep what-if + deployment tests)
2. Test Types & Responsibilities
2.1 Unit Tests
| Aspect | Detail |
|---|---|
| Framework | xUnit + FluentAssertions + Moq/NSubstitute |
| Scope | Domain entities, value objects, application services, validators, mappers |
| What to test | Business rules, edge cases, validation logic, state transitions |
| What NOT to test | EF Core queries (→ integration), HTTP endpoints (→ integration), 3rd party libs |
| Naming | MethodName_Scenario_ExpectedResult |
| AI role | AI generates happy path + 2–3 edge cases. Human adds business-specific cases |
| Run | Every commit (< 30 seconds per service) |
// Example: Travel Service, Booking domain logic
using System;
using FluentAssertions;
using Xunit;

public class BookingTests
{
    // Fixed timestamp keeps the tests deterministic (no dependence on the wall clock).
    private static readonly DateTime BookedAt = new(2025, 1, 10, 9, 0, 0, DateTimeKind.Utc);

    // Fixtures: TestData, Traveler and Flight are hypothetical names for existing test helpers/types.
    private readonly Traveler traveler = TestData.CreateTraveler();
    private readonly Flight flight = TestData.CreateFlight();

    [Fact]
    public void Cancel_WhenWithin24Hours_ShouldRefundFull()
    {
        var booking = Booking.Create(traveler, flight, BookedAt);

        booking.Cancel(BookedAt.AddHours(12));

        booking.Status.Should().Be(BookingStatus.Cancelled);
        booking.RefundAmount.Should().Be(booking.TotalAmount);
    }

    [Fact]
    public void Cancel_WhenAfter24Hours_ShouldRefund50Percent()
    {
        var booking = Booking.Create(traveler, flight, BookedAt);

        booking.Cancel(BookedAt.AddDays(2));

        booking.RefundAmount.Should().Be(booking.TotalAmount * 0.5m);
    }

    [Fact]
    public void Cancel_WhenAlreadyCancelled_ShouldThrowDomainException()
    {
        var booking = Booking.Create(traveler, flight, BookedAt);
        booking.Cancel(BookedAt.AddHours(1));

        var act = () => booking.Cancel(BookedAt.AddHours(2));

        act.Should().Throw<DomainException>()
            .WithMessage("Booking already cancelled");
    }
}
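The tests above imply a cancellation rule on the Booking aggregate. A minimal sketch of that rule follows, under stated assumptions: `Create` is simplified to take an amount and a timestamp (the tests pass `traveler` and `flight`, omitted here), and a cancellation at exactly 24 hours is treated as inside the full-refund window, which the source does not specify.

```csharp
using System;

public enum BookingStatus { Confirmed, Cancelled }

public sealed class DomainException : Exception
{
    public DomainException(string message) : base(message) { }
}

public sealed class Booking
{
    // Assumption: the window is measured from booking creation and is inclusive.
    private static readonly TimeSpan FullRefundWindow = TimeSpan.FromHours(24);

    public DateTime CreatedAtUtc { get; }
    public decimal TotalAmount { get; }
    public BookingStatus Status { get; private set; } = BookingStatus.Confirmed;
    public decimal RefundAmount { get; private set; }

    private Booking(decimal totalAmount, DateTime createdAtUtc)
    {
        TotalAmount = totalAmount;
        CreatedAtUtc = createdAtUtc;
    }

    // Simplified factory (hypothetical signature, see lead-in).
    public static Booking Create(decimal totalAmount, DateTime createdAtUtc) =>
        new Booking(totalAmount, createdAtUtc);

    public void Cancel(DateTime cancelledAtUtc)
    {
        if (Status == BookingStatus.Cancelled)
            throw new DomainException("Booking already cancelled");

        var elapsed = cancelledAtUtc - CreatedAtUtc;
        RefundAmount = elapsed <= FullRefundWindow ? TotalAmount : TotalAmount * 0.5m;
        Status = BookingStatus.Cancelled;
    }
}
```

Passing the cancellation time in as a parameter, rather than reading the clock inside `Cancel`, is what lets the unit tests run without mocks.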
2.2 Integration Tests
| Aspect | Detail |
|---|---|
| Framework | xUnit + Testcontainers (SQL Server, Service Bus emulator) |
| Scope | API endpoints, EF Core queries, message handlers, ACL calls |
| Pattern | WebApplicationFactory<Program> with real DB (containerized) |
| What to test | Full request pipeline, DB persistence, message publish/consume |
| What NOT to test | UI rendering (→ E2E), business logic already covered by unit tests |
| Run | Every PR (2–5 minutes per service) |
// Example: Travel API integration test
using System.Net;
using System.Net.Http.Json;
using System.Threading.Tasks;
using FluentAssertions;
using Xunit;

public class BookingApiTests : IClassFixture<TravelApiFactory>
{
    private readonly TravelApiFactory _factory;

    // xUnit injects the shared fixture (one factory instance per test class).
    public BookingApiTests(TravelApiFactory factory) => _factory = factory;

    [Fact]
    public async Task CreateBooking_WithValidData_Returns201AndPublishesEvent()
    {
        // Arrange
        var client = _factory.CreateClient();
        var request = new CreateBookingRequest { /* ... */ };

        // Act
        var response = await client.PostAsJsonAsync("/api/bookings", request);

        // Assert
        response.StatusCode.Should().Be(HttpStatusCode.Created);

        // Verify event published to Service Bus
        var publishedEvent = await _factory.GetPublishedEvent<BookingCreatedEvent>();
        publishedEvent.Should().NotBeNull();
        publishedEvent.BookingId.Should().NotBeEmpty();
    }
}
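The test above relies on a `TravelApiFactory` fixture that the source does not show. One possible shape is sketched below. It assumes xUnit v2 (Task-based `IAsyncLifetime`), the Testcontainers.MsSql package, a `Program` class visible to the test project, and a hypothetical configuration key `ConnectionStrings:TravelDb`. The `GetPublishedEvent<T>` helper is left out because its implementation depends on how the Service Bus emulator is wired up.

```csharp
using System.Threading.Tasks;
using Microsoft.AspNetCore.Hosting;
using Microsoft.AspNetCore.Mvc.Testing;
using Testcontainers.MsSql;
using Xunit;

// Sketch only: not the project's actual fixture.
public sealed class TravelApiFactory : WebApplicationFactory<Program>, IAsyncLifetime
{
    // One throwaway SQL Server container per test class (the fixture's lifetime).
    private readonly MsSqlContainer _sql = new MsSqlBuilder().Build();

    // xUnit calls this before the first test, so the container is up
    // by the time CreateClient() builds the host.
    public Task InitializeAsync() => _sql.StartAsync();

    protected override void ConfigureWebHost(IWebHostBuilder builder)
    {
        // Assumption: the API resolves its database from this configuration key.
        builder.UseSetting("ConnectionStrings:TravelDb", _sql.GetConnectionString());
    }

    // Explicit implementation because WebApplicationFactory already exposes
    // a ValueTask-returning DisposeAsync.
    async Task IAsyncLifetime.DisposeAsync()
    {
        await _sql.DisposeAsync();
        await base.DisposeAsync();
    }
}
```

Applying EF Core migrations on startup, as the test data strategy describes, would happen in the API itself or in `InitializeAsync` after the container starts.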
2.3 Contract Tests (Pact)
| Aspect | Detail |
|---|---|
| Framework | PactNet (consumer-driven) |
| Scope | Service-to-service API contracts, ACL contracts |
| Pattern | Consumer writes expectations → Provider verifies |
| Critical contracts | Travel → Payment ACL, Event → Comms, Workforce → Reporting |
| Broker | Pactflow (SaaS) for contract storage + verification |
| Run | Every PR (blocking: a broken contract blocks the merge) |
CONTRACT MAP:
  Travel Service (Consumer) ←→ Payment ACL (Provider)
    Contract: POST /api/payments/authorize
    Request:  { bookingId, amount, currency }
    Response: { transactionId, status: "authorized"|"declined" }
  Comms Service (Consumer) ←→ Event Service (Provider)
    Contract: Event "EventCreated" on Service Bus
              (message contract: the subscriber is the consumer, the publisher is the provider)
    Schema:   { eventId, title, attendees[], notificationType }
  Workforce (Consumer) ←→ Reporting (Provider)
    Contract: GET /api/reports/team/{teamId}/hours
    Response: { teamId, totalHours, members[] }
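As an illustration of the first contract in the map, a consumer-side test on the Travel Service could look roughly like the sketch below. It assumes PactNet 5.x (`Pact.V4`; the 4.x line uses `Pact.V3`), a 200 OK response code, and made-up payload values. Only the path and field names come from the contract map.

```csharp
using System.Net;
using System.Net.Http;
using System.Net.Http.Json;
using System.Threading.Tasks;
using PactNet;
using PactNet.Matchers;
using Xunit;

public class PaymentAclContractTests
{
    // Consumer and provider names must match what the provider verifies against.
    private readonly IPactBuilderV4 _pact =
        Pact.V4("Travel Service", "Payment ACL", new PactConfig()).WithHttpInteractions();

    [Fact]
    public async Task AuthorizePayment_WithValidBooking_ReturnsTransaction()
    {
        // Illustrative values (assumptions); field names follow the contract map.
        var request = new { bookingId = "b-1001", amount = 120, currency = "EUR" };

        _pact.UponReceiving("a request to authorize a payment")
                .WithRequest(HttpMethod.Post, "/api/payments/authorize")
                .WithJsonBody(request)
            .WillRespond()
                .WithStatus(HttpStatusCode.OK) // assumption: the source does not state the status code
                .WithJsonBody(new
                {
                    // Matchers pin the shape, not the exact values.
                    transactionId = Match.Type("tx-42"),
                    status = Match.Regex("authorized", "^(authorized|declined)$")
                });

        // PactNet starts a mock provider; the consumer's HTTP code runs against it.
        await _pact.VerifyAsync(async ctx =>
        {
            using var client = new HttpClient { BaseAddress = ctx.MockServerUri };
            var response = await client.PostAsJsonAsync("/api/payments/authorize", request);
            Assert.Equal(HttpStatusCode.OK, response.StatusCode);
        });
    }
}
```

In a real test the call inside `VerifyAsync` would go through the Travel Service's own payment client rather than a raw `HttpClient`, so the contract reflects what the service actually sends.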
Why Contract Tests > E2E for microservices?
E2E test, 1 user journey:
  Click "Book" → Travel → Payment ACL → Legacy → Comms → Email
  = 5 services must be running + healthy + correct data
  = Flaky, slow (30s+), hard to debug
  = One failure anywhere → entire test fails
Contract test, same journey:
  Travel → Payment ACL: 2 tests (isolated, fast, deterministic)
  Travel → Comms: 2 tests (isolated, fast, deterministic)
  = Each test runs independently (< 1 second)
  = Failure pinpoints EXACTLY which contract broke
  = No shared state between services, which removes the main source of flakiness
2.4 E2E Tests
| Aspect | Detail |
|---|---|
| Framework | Playwright (.NET) |
| Scope | Critical user journeys ONLY (5–10 scenarios max) |
| Environment | Staging (prod-like, real services) |
| Run | Nightly (not on PR: too slow, too flaky) |
| Owned by | D5 (Frontend + DevOps) |
Critical E2E Scenarios:
| # | Journey | Services Involved | Priority |
|---|---|---|---|
| 1 | User logs in → creates booking → sees confirmation | Auth, Travel, Comms | P1 |
| 2 | Manager approves booking → payment authorized | Travel, Payment ACL | P1 |
| 3 | Create event → attendees notified | Event, Comms | P2 |
| 4 | View workforce report → export PDF | Workforce, Reporting | P2 |
| 5 | Canary: new service version serves traffic correctly | Any (smoke test) | P1 |
2.5 Performance Tests
| Aspect | Detail |
|---|---|
| Tool | k6 (open-source load testing) |
| Scope | API load test per service + system-wide load |
| Targets | P95 latency < 200ms, throughput ≥ 100 req/s sustained |
| Scenarios | Normal load (22 req/s), Peak (100 req/s), Burst (220 req/s, 5 min) |
| Run | Before each service go-live + monthly thereafter |
| Environment | Staging with prod-like data volume |
// k6 script example: Travel Service load test
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  // Each stage ramps linearly from the previous VU count to its target.
  // With sleep(1), one VU sends fewer than 1 req/s, so VU counts only approximate req/s.
  stages: [
    { duration: '2m', target: 50 },   // Ramp up to 50 VUs
    { duration: '5m', target: 100 },  // Ramp from 50 to 100 VUs (peak)
    { duration: '2m', target: 200 },  // Ramp to 200 VUs (burst)
    { duration: '1m', target: 0 },    // Ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<200'], // P95 < 200ms
    http_req_failed: ['rate<0.01'],   // Error rate < 1%
  },
};

export default function () {
  const res = http.get('https://staging.product.com/api/bookings');
  check(res, { 'status is 200': (r) => r.status === 200 });
  sleep(1);
}
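The scenarios are stated in req/s, while the script ramps virtual users (VUs). In a closed-loop script each VU spends response time plus think time per iteration, so the VU count needed for a target rate can be estimated with Little's law. The helper below is a rough sizing sketch: `LoadMath` is a hypothetical name, and the 0.2 s response time is an assumption taken from the P95 target.

```csharp
using System;

public static class LoadMath
{
    // Little's law for a closed-loop load test:
    // concurrent VUs = request rate x seconds per iteration (response + think time).
    public static int VirtualUsersFor(decimal requestsPerSecond, decimal responseSeconds, decimal thinkSeconds)
    {
        if (requestsPerSecond < 0 || responseSeconds < 0 || thinkSeconds < 0)
            throw new ArgumentException("Inputs must be non-negative.");

        return (int)Math.Ceiling(requestsPerSecond * (responseSeconds + thinkSeconds));
    }
}
```

With 0.2 s responses and the script's 1 s sleep, this gives 27 VUs for normal load (22 req/s), 120 for peak (100 req/s), and 264 for burst (220 req/s). If the rate itself must be held, k6's arrival-rate executors are the usual alternative to ramping VUs.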
3. Test Automation in CI/CD
┌────────────────────────────────────────────────────────────────────┐
│  CI PIPELINE (GitHub Actions), Per Service                         │
│                                                                    │
│  ┌──────────┐   ┌──────────────┐   ┌────────────┐   ┌───────────┐  │
│  │  Build   │──▶│  Unit Tests  │──▶│  Contract  │──▶│Integration│  │
│  │          │   │   (< 30s)    │   │   Tests    │   │   Tests   │  │
│  │          │   │ Coverage ≥80%│   │   (Pact)   │   │ (2–5 min) │  │
│  └──────────┘   └──────────────┘   └────────────┘   └───────────┘  │
│       │                                                   │        │
│       ▼                   ALL MUST PASS                   ▼        │
│  ┌──────────┐                                       ┌───────────┐  │
│  │ Security │  gitleaks + CodeQL + Dependabot       │ Merge OK  │  │
│  │   Scan   │──────────────────────────────────────▶│           │  │
│  └──────────┘                                       └───────────┘  │
│                                                                    │
│  ┌── NIGHTLY ───────────────────────────────────────────────────┐  │
│  │  E2E Tests (Playwright) on Staging                           │  │
│  │  Report → Slack channel #quality                             │  │
│  └──────────────────────────────────────────────────────────────┘  │
│                                                                    │
│  ┌── PRE-RELEASE ───────────────────────────────────────────────┐  │
│  │  Performance Tests (k6) on Staging                           │  │
│  │  Must pass thresholds before go-live approval                │  │
│  └──────────────────────────────────────────────────────────────┘  │
└────────────────────────────────────────────────────────────────────┘
4. Test Data Strategy
| Concern | Approach |
|---|---|
| Unit tests | In-memory objects, builder pattern (e.g., BookingBuilder.WithStatus(Cancelled).Build()) |
| Integration tests | Testcontainers (fresh DB per test class). EF Core migrations applied on startup |
| Contract tests | Pact fixtures (JSON). Versioned with contract |
| E2E tests | Seeded test data in staging environment. Dedicated test tenant (isolated from real data) |
| Performance tests | Synthetic data generator (k6 + Faker). Scale: 10K bookings, 500 events, 5K users |
| PII in tests | NEVER use production data. Synthetic only. Faker library for realistic fake data |
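The builder pattern named for unit tests can be sketched as follows. The `Booking` record here is a simplified, hypothetical stand-in for the real entity, and the default values are arbitrary. The point is that each test overrides only the fields it cares about.

```csharp
using System;

public enum BookingStatus { Pending, Confirmed, Cancelled }

// Simplified stand-in for the real Booking entity (hypothetical shape).
public sealed record Booking(Guid Id, BookingStatus Status, decimal TotalAmount, DateTime CreatedAtUtc);

public sealed class BookingBuilder
{
    // Defaults describe a valid, unremarkable booking.
    private BookingStatus _status = BookingStatus.Confirmed;
    private decimal _totalAmount = 100m;
    private DateTime _createdAtUtc = new(2025, 1, 10, 9, 0, 0, DateTimeKind.Utc);

    public BookingBuilder WithStatus(BookingStatus status) { _status = status; return this; }
    public BookingBuilder WithTotalAmount(decimal amount) { _totalAmount = amount; return this; }
    public BookingBuilder CreatedAt(DateTime createdAtUtc) { _createdAtUtc = createdAtUtc; return this; }

    // Each call yields a fresh in-memory object, so tests share no state.
    public Booking Build() => new(Guid.NewGuid(), _status, _totalAmount, _createdAtUtc);
}
```

A test that only cares about status would write `new BookingBuilder().WithStatus(BookingStatus.Cancelled).Build()` and leave everything else at the defaults.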
5. Quality Gates — Per Phase
Phase 0 (Month 1): Foundation
Phase 1 (Month 2–4): First Services
Phase 2 (Month 5–7): Full Suite
Phase 3 (Month 8–9): Hardening
6. AI-Assisted Testing
| Activity | AI Tool | Human Role |
|---|---|---|
| Generate unit test boilerplate | Cursor Pro (inline) | Review edge cases, add business-specific scenarios |
| Generate integration test setup | Claude API (agentic) | Verify DB assertions, check race conditions |
| Suggest missing test cases | CodeRabbit (PR review) | Accept/reject suggestions |
| Generate k6 scripts | Claude.ai | Validate thresholds, adjust load profile |
| Generate Pact contracts from OpenAPI | Claude API | Verify contract matches actual business rules |
Anti-pattern to prevent:
  ❌ AI generates 50 unit tests, all pass, developer moves on
     → Problem: tests may only cover happy path
     → Fix: Enforce "mutation testing" spot-checks monthly
       (change business logic → test should fail. If not → gap)
  ❌ AI generates mock that always returns success
     → Problem: failure paths untested
     → Fix: PR review checklist: "Are error scenarios tested?"
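The mutation spot-check can be illustrated by hand. The sketch below uses a hypothetical `RefundPolicy` modeled on the 24-hour cancellation rule from the unit test example, with a deliberately broken copy (the "mutant"). A test is only useful against this mutant if it passes on the original and fails on the mutant.

```csharp
using System;

public static class RefundPolicy
{
    // Original rule: full refund within 24 hours, otherwise 50%.
    public static decimal Original(decimal total, TimeSpan elapsed) =>
        elapsed <= TimeSpan.FromHours(24) ? total : total * 0.5m;

    // Hand-made mutant: the boundary is changed from 24 hours to 48 hours.
    public static decimal Mutant(decimal total, TimeSpan elapsed) =>
        elapsed <= TimeSpan.FromHours(48) ? total : total * 0.5m;
}

public static class MutationSpotCheck
{
    // A test "kills" the mutant when it passes on the original and fails on the mutant.
    public static bool Kills(Func<Func<decimal, TimeSpan, decimal>, bool> test) =>
        test(RefundPolicy.Original) && !test(RefundPolicy.Mutant);

    // Happy path only: cancel after 12 hours. Both versions return the full amount.
    public static bool HappyPathTest(Func<decimal, TimeSpan, decimal> refund) =>
        refund(100m, TimeSpan.FromHours(12)) == 100m;

    // Boundary-aware: cancelling after 36 hours must give 50%.
    public static bool BoundaryTest(Func<decimal, TimeSpan, decimal> refund) =>
        refund(100m, TimeSpan.FromHours(36)) == 50m;
}
```

Here `Kills(HappyPathTest)` is false, because the mutant survives, and `Kills(BoundaryTest)` is true. A surviving mutant is the "gap" the spot-check is meant to surface.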
7. Test Metrics Dashboard
| Metric | Target | Alert If |
|---|---|---|
| Unit test coverage | ≥80% | < 75% on any service |
| Contract test count | ≥5 per service boundary | New endpoint without contract |
| E2E pass rate (nightly) | ≥95% | < 90% for 3 consecutive nights |
| CI pipeline time | < 10 min | > 15 min (optimize or parallelize) |
| Performance P95 latency | < 200ms | > 300ms on staging |
| Flaky test rate | < 2% | > 5% (quarantine and fix) |
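Two of the "Alert If" rules are sketched below as plain predicates, to show how the thresholds could be evaluated from collected metrics. `QualityAlerts` and its inputs are hypothetical. In particular, "3 consecutive nights" is read here as the three most recent nightly runs, which the source does not spell out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QualityAlerts
{
    // "< 75% on any service": returns the names of services under the alert line.
    public static IEnumerable<string> LowCoverage(IReadOnlyDictionary<string, double> coverageByService) =>
        coverageByService.Where(kv => kv.Value < 75.0).Select(kv => kv.Key);

    // "< 90% for 3 consecutive nights": checks only the three most recent runs
    // (oldest first in the list).
    public static bool E2EDegraded(IReadOnlyList<double> nightlyPassRates) =>
        nightlyPassRates.Count >= 3 && nightlyPassRates.TakeLast(3).All(rate => rate < 90.0);
}
```

A single bad night does not alert under this reading: pass rates of 96, 89, 88, 85 trigger the alert, while 89, 88, 95 do not.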
Weekly Test Health Report (Slack #quality):
Service      | Coverage | Contracts | E2E  | Perf P95 | Status
─────────────┼──────────┼───────────┼──────┼──────────┼───────
Travel       | 85%      | 12        | 3/3  | 145ms    | ✅
Event        | 82%      | 8         | 2/2  | 120ms    | ✅
Workforce    | 79%      | 6         | 1/1  | 160ms    | ⚠️ (coverage)
Comms        | 88%      | 5         | 2/2  | 95ms     | ✅
Reporting    | 81%      | 4         | 1/1  | 180ms    | ✅
ACL          | 90%      | 10        | 1/1  | 110ms    | ✅