Security Strategy

Scope: End-to-end security for the modernization project — from code to infrastructure to process
Constraint: Payment is frozen in Phase 1, so an ACL (anti-corruption layer) bridges to the legacy monolith and creates a complex security boundary
Standards: OWASP Top 10 (2021), Azure Security Benchmark v3, Zero Trust principles
Source: Architect.md, Deployment.md, HA.md, Business Domain.md


1. Security Architecture Overview

┌──────────────────────────────────────────────────────────────────────────┐
│                        SECURITY LAYERS                                    │
│                                                                          │
│  ┌─ LAYER 1: EDGE ─────────────────────────────────────────────────┐    │
│  │  Azure Front Door + WAF (OWASP ruleset)                        │    │
│  │  DDoS Protection (Standard)                                     │    │
│  │  Rate Limiting: 1000 req/min per IP                            │    │
│  │  Geo-filtering: block known malicious regions                  │    │
│  └────────────────────────────────────────────────────────────────┘    │
│                              │                                          │
│  ┌─ LAYER 2: GATEWAY ──────────────────────────────────────────────┐   │
│  │  YARP API Gateway                                               │   │
│  │  JWT validation (all requests)                                  │   │
│  │  Request/response logging (no PII in logs)                     │   │
│  │  CORS policy enforcement                                       │   │
│  │  Request size limit: 10MB                                      │   │
│  └────────────────────────────────────────────────────────────────┘   │
│                              │                                          │
│  ┌─ LAYER 3: SERVICE ──────────────────────────────────────────────┐   │
│  │  Per-service authentication (internal mTLS via Container Apps) │   │
│  │  Authorization: RBAC per endpoint                              │   │
│  │  Input validation (FluentValidation)                           │   │
│  │  Output encoding (XSS prevention)                              │   │
│  └────────────────────────────────────────────────────────────────┘   │
│                              │                                          │
│  ┌─ LAYER 4: DATA ─────────────────────────────────────────────────┐   │
│  │  Azure SQL: TDE (Transparent Data Encryption) — always on     │   │
│  │  Encryption at rest: AES-256                                   │   │
│  │  Encryption in transit: TLS 1.3 (minimum TLS 1.2)             │   │
│  │  Azure Key Vault: all secrets, connection strings, certs      │   │
│  │  No secrets in code, config, or environment variables          │   │
│  └────────────────────────────────────────────────────────────────┘   │
│                              │                                          │
│  ┌─ LAYER 5: ACL (LEGACY BRIDGE) ──────────────────────────────────┐  │
│  │  Payment stays in monolith → ACL is security boundary          │  │
│  │  ACL validates ALL data crossing boundary                      │  │
│  │  No direct DB access to legacy — API calls only               │  │
│  │  Rate limiting on ACL: prevent cascading abuse                 │  │
│  │  Separate credentials for ACL (not shared with services)      │  │
│  └────────────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────────────┘
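
The Layer 1 limit of 1000 req/min per IP is configured in Front Door/WAF rather than written by hand. The sketch below only illustrates the behaviour such a limit is meant to have, using a sliding window in Python. The class name, the in-memory store, and the injectable clock are illustrative assumptions, not part of the platform.

```python
import time
from collections import defaultdict, deque


class SlidingWindowRateLimiter:
    """Allow at most `limit` requests per `window_seconds` for each client key."""

    def __init__(self, limit=1000, window_seconds=60.0, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so behaviour can be checked without sleeping
        self._hits = defaultdict(deque)  # client key -> timestamps of accepted requests

    def allow(self, client_ip):
        now = self.clock()
        hits = self._hits[client_ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # the caller would answer with HTTP 429
        hits.append(now)
        return True
```

Each IP gets its own window, so one noisy client does not consume the budget of another. The same idea applies to the rate limit on the ACL in Layer 5, keyed by calling service instead of IP.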

2. OWASP Top 10 — Mitigation Matrix

| # | Threat | Risk Level | Mitigation | Implementation |
|---|---|---|---|---|
| A01 | Broken Access Control | Critical | RBAC per endpoint + JWT claims validation + resource-level authorization | Middleware in each service. Policy: deny-by-default |
| A02 | Cryptographic Failures | High | TLS 1.3 transit, AES-256 rest, Key Vault for secrets | Azure-managed. No custom crypto |
| A03 | Injection (SQL, XSS, Command) | Critical | Parameterized queries (EF Core), FluentValidation, output encoding | EF Core parameterizes LINQ queries by default; any raw SQL must also be parameterized. CSP headers for XSS |
| A04 | Insecure Design | High | Threat modeling per service, abuse case in requirements | Security review in PR template. D1 signs off on new endpoints |
| A05 | Security Misconfiguration | High | Bicep IaC (reproducible config), no default credentials | Automated security scan in CI. Azure Policy enforcement |
| A06 | Vulnerable Components | Medium | Dependabot + CodeQL in CI, NuGet audit on build | Weekly automated dependency scan. Block merge if Critical CVE |
| A07 | Auth Failures | Critical | Azure AD B2C / Identity Server, MFA, account lockout | OAuth 2.0 + OIDC. Lockout after 5 failed attempts |
| A08 | Software Integrity Failures | Medium | Signed container images, SBOM generation, supply chain security | Container image signing via Notation. SBOM in CI artifacts |
| A09 | Logging & Monitoring Failures | High | Structured logging (Serilog), Azure Monitor alerts, audit trail | All auth events logged. Alert on anomalies. 90-day retention |
| A10 | SSRF | Medium | Outbound URL allowlisting, no user-controlled URLs in backend | ACL only calls known legacy endpoints. URL validation |
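
The A10 mitigation (outbound URL allowlisting) can be sketched as below. This is a minimal illustration in Python, not the ACL's actual code. The hostname in `ALLOWED_HOSTS` is hypothetical, since the real legacy endpoints are not named in this document.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: the real legacy hostnames are not given in this document.
ALLOWED_HOSTS = {"legacy-payments.internal.example"}
ALLOWED_SCHEMES = {"https"}


def is_allowed_outbound_url(url):
    """Return True only for https URLs whose exact host is on the allowlist."""
    try:
        parts = urlsplit(url)
        host = parts.hostname  # lower-cased, without port or userinfo
        _ = parts.port  # reading .port raises ValueError for malformed ports
    except ValueError:
        return False
    if parts.scheme not in ALLOWED_SCHEMES or host is None:
        return False
    # Exact match only: no suffix or substring matching, which attackers can abuse.
    return host in ALLOWED_HOSTS
```

Matching on the parsed hostname rather than on the raw string is the point of the sketch: it rejects tricks such as putting the trusted name in the userinfo part or as a prefix of an attacker-controlled domain.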

3. Authentication & Authorization

3.1 Authentication Flow

                    ┌──────────┐
                    │  User    │
                    │ (Browser)│
                    └────┬─────┘
                         │ 1. Login request
                         ▼
                    ┌──────────┐
                    │ Identity │  Azure AD B2C / Identity Server
                    │ Provider │  (migrated from legacy auth)
                    └────┬─────┘
                         │ 2. JWT token (access + refresh)
                         ▼
                    ┌──────────┐
                    │  React   │  Token stored in httpOnly cookie
                    │ Frontend │  (NOT localStorage — XSS risk)
                    └────┬─────┘
                         │ 3. API call + JWT in Authorization header
                         ▼
                    ┌──────────┐
                    │   YARP   │  4. Validate JWT signature + expiry
                    │ Gateway  │     Extract claims (userId, roles, tenantId)
                    └────┬─────┘
                         │ 5. Forward to service with validated claims
                         ▼
                    ┌──────────┐
                    │ Service  │  6. Check RBAC policy for endpoint
                    │          │     Resource-level auth (can user X access booking Y?)
                    └──────────┘
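
Step 4 (validate signature and expiry, then extract claims) can be sketched as follows. This is a simplified illustration, not the gateway's actual code. It uses HS256 with a shared secret so that it runs on the Python standard library alone, whereas tokens issued by Azure AD B2C or Identity Server are normally RS256-signed and should be verified against the provider's published keys with a maintained JWT library. `sign_hs256` exists only to produce a token for the example.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_decode(segment):
    padding = "=" * (-len(segment) % 4)  # JWT segments omit base64 padding
    return base64.urlsafe_b64decode(segment + padding)


def _b64url_encode(raw):
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(claims, secret):
    """Example-only helper that stands in for the identity provider."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode("ascii")
    signature = hmac.new(secret, signing_input, hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(signature)}"


def validate_token(token, secret, now=None):
    """Return the claims if signature and expiry check out, otherwise None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        if header.get("alg") != "HS256":  # pin the algorithm; never accept "none"
            return None
        signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
        expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
        # Constant-time comparison, done before the payload is trusted.
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return None
        claims = json.loads(_b64url_decode(payload_b64))
    except (ValueError, TypeError, AttributeError):
        return None  # malformed token
    if not isinstance(claims, dict):
        return None
    current = time.time() if now is None else now
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or current >= exp:
        return None  # missing or expired "exp"
    return claims
```

The returned claims (userId, roles, tenantId) are what the gateway would forward in step 5. A token without `exp` is rejected rather than treated as non-expiring.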

3.2 Authorization Model

RBAC Roles:
  ├── Admin         → Full access (user management, reports, settings)
  ├── Manager       → Department-level (approve bookings, view team reports)
  ├── User          → Self-service (create own bookings, view own data)
  └── System        → Service-to-service (internal only, no UI)

Resource-Level Authorization:
  Travel:    User can only see/edit OWN bookings (unless Manager of same dept)
  Event:     Event creator + assigned managers can edit
  Workforce: Manager sees team. User sees self only
  Comms:     Recipient-based (user sees messages addressed to them)
  Reporting: Role-based (Manager sees dept, Admin sees all)
  Payment:   LEGACY — existing auth rules preserved via ACL
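
The Travel rule above, combined with the deny-by-default policy from the mitigation matrix, could look like the sketch below. The `Principal` and `Booking` types and their field names are assumptions made for illustration. Treating Admin as able to read every booking is also an assumption drawn from "Full access".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Principal:
    user_id: str
    role: str  # "Admin", "Manager", "User" or "System"
    department: str


@dataclass(frozen=True)
class Booking:
    booking_id: str
    owner_id: str
    owner_department: str


def can_access_booking(principal, booking):
    """Owners see their own bookings; Managers also see their department's."""
    if principal.role == "Admin":
        return True
    if principal.role in {"User", "Manager"} and principal.user_id == booking.owner_id:
        return True
    if principal.role == "Manager" and principal.department == booking.owner_department:
        return True
    return False  # deny by default, including unknown roles
```

The check runs in the service after the endpoint-level RBAC check, so a valid role alone is never enough to read another user's booking.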

3.3 Service-to-Service Authentication

| Communication | Auth Method | Details |
|---|---|---|
| Service → Service (sync) | mTLS (managed by Container Apps) | Azure Container Apps internal ingress = auto mTLS |
| Service → Service Bus | Managed Identity | No connection strings. Azure RBAC on Service Bus |
| Service → Azure SQL | Managed Identity | No SQL passwords. AAD auth for DB |
| Service → Key Vault | Managed Identity | RBAC: each service → only its own secrets |
| ACL → Legacy Monolith | API Key + IP whitelist | Legacy may not support modern auth. Locked to ACL IP |
| CI/CD → Azure | Service Principal (OIDC) | Federated credentials. No stored secrets in GitHub |

4. Data Security

4.1 Data Classification

| Category | Examples | Encryption | Access Control | Retention |
|---|---|---|---|---|
| PII (Personal) | Name, email, phone, passport | Encrypted at rest + transit. Column-level encryption for passport | Need-to-know basis. Audit log on access | Per GDPR/local law |
| Financial | Payment details, invoices | Payment stays in legacy (Phase 1). ACL never stores payment data | Payment = legacy auth only | PCI-DSS (legacy) |
| Business | Bookings, events, schedules | Standard encryption (TDE) | Service-level access. No cross-service DB reads | 7 years |
| System | Logs, metrics, traces | Standard encryption | DevOps team + on-call | 90 days (logs), 30 days (traces) |

4.2 Data Protection Rules

RULE 1: No PII in logs. EVER.
        Serilog destructuring policy: mask email, phone, passport.
        Violation = blocked deploy (custom Roslyn analyzer).

RULE 2: No secrets in source code.
        Pre-commit hook: detect patterns (connection strings, API keys).
        CI check: gitleaks scan on every PR.

RULE 3: Payment data NEVER crosses ACL boundary.
        ACL returns: payment_status, transaction_id only.
        No card numbers, no CVV, no bank details in new services.

RULE 4: Database access = Managed Identity only.
        No SQL username/password anywhere.
        Azure AD authentication for all DB connections.

RULE 5: All inter-service communication = encrypted (TLS 1.2+).
        Container Apps internal ingress = auto mTLS.
        External = TLS 1.3 preferred.
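
Rule 1 names a Serilog destructuring policy. The same idea, shown here as a Python sketch with the standard `logging` module, is to mask values before any sink sees them. The regular expressions are illustrative assumptions and deliberately over-mask (long digit runs are treated as phone numbers). Passport masking is omitted because formats vary by country.

```python
import logging
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def mask_pii(text):
    """Replace anything that looks like an email address or phone number."""
    text = EMAIL_RE.sub("[email]", text)
    return PHONE_RE.sub("[phone]", text)


class PiiMaskingFilter(logging.Filter):
    """Mask the fully formatted message so PII in arguments is covered too."""

    def filter(self, record):
        record.msg = mask_pii(record.getMessage())
        record.args = ()  # the message is already formatted
        return True
```

Attaching the filter to every handler (`handler.addFilter(PiiMaskingFilter())`) masks at the last step before output. Pattern-based masking is a safety net only; the primary control remains not passing PII to the logger in the first place.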

5. Infrastructure Security

5.1 Network Security

┌────────────────────────────────────────────────────────────────┐
│  NETWORK TOPOLOGY                                               │
│                                                                  │
│  Internet → Front Door (WAF) → VNET Integration                │
│                                    │                            │
│              ┌─────────────────────┼──────────────────────┐     │
│              │     Container Apps Environment             │     │
│              │     (VNET-injected, private ingress)       │     │
│              │                                             │     │
│              │  ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐     │     │
│              │  │Travel│ │Event │ │WF    │ │Comms │     │     │
│              │  └──┬───┘ └──┬───┘ └──┬───┘ └──┬───┘     │     │
│              │     └────────┴────────┴────────┘          │     │
│              │              │                             │     │
│              └──────────────┼─────────────────────────────┘     │
│                             │ Private Endpoint                  │
│              ┌──────────────▼─────────────────────────────┐     │
│              │  Azure SQL (Private Link)                  │     │
│              │  Service Bus (Private Endpoint)            │     │
│              │  Key Vault (Private Endpoint)              │     │
│              └────────────────────────────────────────────┘     │
│                                                                  │
│  ACL → Legacy: VPN / ExpressRoute (NOT over public internet)   │
└────────────────────────────────────────────────────────────────┘

Key points:
  ✓ No public IP on databases
  ✓ No public IP on Service Bus
  ✓ Container Apps: internal ingress (only Front Door can reach)
  ✓ Legacy connection: VPN/ExpressRoute (encrypted tunnel)

5.2 Azure Policy Enforcement (Bicep)

| Policy | Effect | Scope |
|---|---|---|
| Require HTTPS on Container Apps | Deny | All resource groups |
| Require TLS 1.2+ on Azure SQL | Deny | All resource groups |
| Require Private Endpoints for PaaS | Audit → Deny | Production |
| Block public IP creation | Deny | Production |
| Require encryption on storage | Deny | All resource groups |
| Require diagnostic settings | DeployIfNotExists | All resource groups |
| Restrict allowed regions (SEA, AU) | Deny | Subscription |
| Require tags (env, service, owner) | Deny | All resource groups |
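
Azure Policy evaluates these rules in the platform. A local pre-deployment check can surface the same violations earlier, for example in CI. The sketch below mirrors only the tag and region rules. The resource dictionary shape is an assumption, and so are the region names `southeastasia` and `australiaeast` chosen for "SEA" and "AU".

```python
REQUIRED_TAGS = {"env", "service", "owner"}
# Assumed Azure region names for "SEA" and "AU"; the table does not spell them out.
ALLOWED_REGIONS = {"southeastasia", "australiaeast"}


def policy_violations(resource):
    """Return human-readable violations for one resource description (a dict)."""
    violations = []
    missing = REQUIRED_TAGS - set(resource.get("tags", {}))
    if missing:
        violations.append("missing tags: " + ", ".join(sorted(missing)))
    location = resource.get("location")
    if location not in ALLOWED_REGIONS:
        violations.append(f"region not allowed: {location}")
    return violations
```

An empty list means the resource passes both rules. This does not replace the Deny policies; it only shortens the feedback loop.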

6. CI/CD Security

6.1 Pipeline Security Gates

PR Created
    │
    ▼
┌───────────────────┐
│ 1. SECRET SCAN    │  gitleaks: detect leaked secrets
│    (pre-merge)    │  Block if found. No exceptions.
└───────┬───────────┘
        │
┌───────▼───────────┐
│ 2. SAST SCAN      │  CodeQL + SonarCloud
│    (static)       │  Block on Critical/High findings
└───────┬───────────┘
        │
┌───────▼───────────┐
│ 3. DEPENDENCY     │  Dependabot + NuGet audit
│    AUDIT          │  Block on Critical CVE (CVSS ≥ 9.0)
└───────┬───────────┘
        │
┌───────▼───────────┐
│ 4. BUILD + TEST   │  Unit + Contract tests
│                   │  Code coverage gate (≥80%)
└───────┬───────────┘
        │
┌───────▼───────────┐
│ 5. CONTAINER      │  Trivy scan on Docker image
│    SCAN           │  Block on Critical/High OS vulnerabilities
└───────┬───────────┘
        │
┌───────▼───────────┐
│ 6. CODE REVIEW    │  CodeRabbit (AI) + Human reviewer
│                   │  Security-sensitive: 2 human reviewers
└───────┬───────────┘
        │
┌───────▼───────────┐
│ 7. MERGE          │  All gates green = auto-merge eligible
│                   │  CODEOWNERS enforced
└───────────────────┘
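
The gate thresholds above can be summarised as one decision function. This is a sketch of the logic only. In practice each gate is a separate CI job, and the `PipelineResult` fields are hypothetical names for the data those jobs would report.

```python
from dataclasses import dataclass, field


@dataclass
class PipelineResult:
    leaked_secrets: int = 0
    sast_severities: list = field(default_factory=list)   # e.g. ["High", "Low"]
    dependency_cvss: list = field(default_factory=list)   # CVSS base scores
    tests_passed: bool = True
    coverage_percent: float = 100.0
    image_severities: list = field(default_factory=list)
    human_approvals: int = 0
    security_sensitive: bool = False


def failed_gates(result):
    """Return the names of the gates that block the merge, in pipeline order."""
    blocking = {"Critical", "High"}
    failures = []
    if result.leaked_secrets > 0:
        failures.append("secret scan")
    if blocking & set(result.sast_severities):
        failures.append("SAST")
    if any(score >= 9.0 for score in result.dependency_cvss):
        failures.append("dependency audit")
    if not result.tests_passed or result.coverage_percent < 80.0:
        failures.append("build + test")
    if blocking & set(result.image_severities):
        failures.append("container scan")
    required_approvals = 2 if result.security_sensitive else 1
    if result.human_approvals < required_approvals:
        failures.append("code review")
    return failures
```

A pull request is auto-merge eligible only when the list is empty. Note the asymmetry in the thresholds: SAST and container scans block on High, while the dependency audit blocks only at CVSS 9.0 or above.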

6.2 CODEOWNERS for Security-Sensitive Code

# .github/CODEOWNERS
# The last matching pattern wins, so the catch-all rule must come first.

# All other code: 1 reviewer
*                           @team-backend

# Payment ACL: 2 reviewers required
# (the reviewer count is enforced by branch protection, not by this file)
src/services/acl/           @d1-techlead @d2-senior

# Authentication/Authorization
src/shared/auth/            @d1-techlead
src/shared/middleware/      @d1-techlead

# Infrastructure as Code
infra/bicep/                @d1-techlead @d5-devops

# CI/CD pipelines
.github/workflows/          @d1-techlead @d5-devops

7. Security per Bounded Context

| Service | Specific Threats | Mitigations |
|---|---|---|
| Travel | Booking manipulation, price tampering | Server-side price calculation. Booking ownership validation. Audit trail on all changes |
| Event | Unauthorized event access, capacity overflow | Event-level RBAC. Capacity check = server-side atomic operation (not client) |
| Workforce | PII exposure (employee data), privilege escalation | Column-level encryption for sensitive fields. Manager can only see own department |
| Comms | Message interception, spam injection | End-to-end TLS. Rate limit on message creation. Content sanitization (HTML strip) |
| Reporting | Data aggregation exposure, export abuse | Role-based data filtering at query level. Export rate limiting. Watermark on PDF exports |
| Payment (ACL) | Man-in-the-middle, replay attacks | ACL validates request signatures. Idempotency keys. VPN to legacy. No payment data stored in new services |
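
The Payment (ACL) mitigations, request signatures plus idempotency keys, can be sketched as below. The signing scheme (HMAC-SHA256 over method, path, timestamp, key, and body), the five-minute clock-skew limit, and the in-memory key store are all assumptions for illustration. The document does not specify the actual scheme.

```python
import hashlib
import hmac


def sign_request(secret, method, path, body, timestamp, idempotency_key):
    """HMAC-SHA256 over every field an attacker could otherwise alter or replay."""
    prefix = "\n".join([method, path, str(timestamp), idempotency_key])
    message = prefix.encode("utf-8") + b"\n" + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


class AclRequestVerifier:
    def __init__(self, secret, max_skew_seconds=300):
        self.secret = secret
        self.max_skew_seconds = max_skew_seconds
        self._seen_keys = set()  # a real deployment would need a shared store with expiry

    def verify(self, method, path, body, timestamp, idempotency_key, signature, now):
        expected = sign_request(self.secret, method, path, body, timestamp, idempotency_key)
        # Constant-time comparison avoids leaking how many characters matched.
        if not hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8")):
            return "rejected: bad signature"
        if abs(now - timestamp) > self.max_skew_seconds:
            return "rejected: stale timestamp"
        if idempotency_key in self._seen_keys:
            return "duplicate"  # caller returns the stored result, does not re-execute
        self._seen_keys.add(idempotency_key)
        return "accepted"
```

The signature covers tampering in transit, the timestamp bounds how long a captured request stays usable, and the idempotency key stops a replay inside that window from being executed twice.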

8. Incident Response

8.1 Security Incident Classification

| Severity | Examples | Response Time | Escalation |
|---|---|---|---|
| P1 Critical | Data breach, payment compromise, auth bypass | < 1 hour | D1 + management + legal |
| P2 High | Successful SQL injection, privilege escalation | < 4 hours | D1 + D2 |
| P3 Medium | Failed brute force (blocked), minor XSS found | < 24 hours | On-call engineer |
| P4 Low | Dependency CVE (no exploit available) | < 1 week | Next sprint |
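
If alert routing is automated, the response-time targets can be encoded directly so that each incident carries its deadline. A minimal sketch follows; the function and mapping names are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Response-time targets from the classification table.
RESPONSE_TARGETS = {
    "P1": timedelta(hours=1),
    "P2": timedelta(hours=4),
    "P3": timedelta(hours=24),
    "P4": timedelta(weeks=1),
}


def response_deadline(severity, detected_at):
    """Return the latest time by which a response is due for this severity."""
    return detected_at + RESPONSE_TARGETS[severity]
```

For example, a P1 detected at 09:00 UTC is due a response by 10:00 UTC the same day. An unknown severity raises `KeyError` instead of silently receiving a default deadline.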

8.2 Incident Playbook

1. DETECT     → Azure Monitor alert / CodeQL finding / user report
2. CONTAIN    → Isolate affected service (scale to 0 or block route in YARP)
3. ASSESS     → Determine scope: which data, which users, which services
4. REMEDIATE  → Fix vulnerability, rotate credentials, patch dependency
5. RECOVER    → Restore service, verify fix, monitor for recurrence
6. REVIEW     → Post-incident: root cause, timeline, lessons learned
7. IMPROVE    → Update policies, add detection rules, close gaps

9. Compliance Considerations

| Area | Requirement | Our Approach |
|---|---|---|
| GDPR / Data Privacy | Right to erasure, data portability, consent | Per-service data ownership → clear deletion paths. Export API per service |
| PCI-DSS | Payment card data protection | Payment stays in legacy monolith (PCI scope stays there). New services NEVER touch card data |
| SOC 2 | Security controls, availability, confidentiality | Azure compliance covers infrastructure. Our responsibility: access control, logging, encryption |
| Vietnam Cybersecurity Law | Data localization for certain categories | Primary region = SEA (Singapore). If VN data localization required → Azure Vietnam region (future) |

10. Security Checklist — Phase Gates

Phase 0 (Month 1)

  • Key Vault provisioned, Managed Identities configured
  • CODEOWNERS file in place
  • gitleaks + CodeQL in CI pipeline
  • Security training session for team (OWASP Top 10 refresher)

Phase 1 (Month 2–4)

  • VNET integration for Container Apps
  • Private Endpoints for SQL + Service Bus
  • JWT validation in YARP Gateway
  • ACL security: VPN tunnel, request signing, rate limiting
  • First penetration test (Travel service)

Phase 2 (Month 5–7)

  • All services: RBAC + resource-level authorization verified
  • PII masking in logs confirmed (audit)
  • Container image scanning (Trivy) in CI
  • DAST scan on staging environment

Phase 3 (Month 8–9)

  • Full security audit (external or internal)
  • Incident response drill
  • Compliance documentation finalized
  • Security runbook for on-call team