AI Strategy — Team Engineering Playbook
AI strategy for a team of 5 engineers modernizing a legacy .NET application into .NET 8 microservices.
Updated: March 2026
1. AI Model Landscape — Strengths & Weaknesses
1.1 LLM Models (Core Reasoning)
| Model | Provider | Strengths | Weaknesses | Best For | Cost |
| --- | --- | --- | --- | --- | --- |
| Claude Opus 4 | Anthropic | Deepest reasoning. Excellent at architecture, complex refactoring, long context (200K). Very low hallucination on code. Strong instruction following | Slowest. Most expensive. Sometimes too verbose | Architecture decisions, complex migration logic, code review, long codebase analysis | $$$$ |
| Claude Sonnet 4 | Anthropic | Best speed/quality balance. Very strong code generation. Good agentic mode. 200K context | Not as deep as Opus for architecture reasoning. Sometimes misses edge cases vs Opus | Daily coding, agentic workflows, PR reviews, test generation | $$ |
| Claude Haiku 3.5 | Anthropic | Extremely fast. Cheapest in the Claude family. Good for simple tasks | Significantly lower quality for complex logic. Higher hallucination rate | Quick autocomplete, simple refactors, formatting, boilerplate | $ |
| GPT-4o | OpenAI | Multimodal (reads UI screenshots). Good for frontend. Strong function calling. Good tool use | Code quality not on par with Claude for backend .NET. Smaller context window (128K). Sometimes "lazy" on long tasks | Frontend React generation from mockups, UI analysis, multimodal tasks | $$$ |
| GPT-4o mini | OpenAI | Cheap, fast. Good for simple tasks | Low quality for complex code. Not suitable for migration logic | Simple queries, formatting, quick lookups | $ |
| o3 | OpenAI | Extremely strong reasoning (chain-of-thought). Good for algorithmic problems, math-heavy logic | Very slow. Very expensive. Overkill for typical code generation. Not agentic-ready | Allocation algorithms, complex business-rule validation, performance optimization analysis | $$$$ |
| Gemini 2.5 Pro | Google | Largest context window (1M tokens). Good for "dump the entire codebase". Decent code generation | Less stable than Claude/GPT for .NET. Output formatting sometimes inconsistent. Weaker instruction following | Legacy codebase full scans, massive-context analysis, cross-module dependency mapping | $$$ |
| Gemini 2.5 Flash | Google | Fast, large context (1M), cheap | Significantly lower quality than Pro. Hallucinates on complex tasks | Quick large-codebase queries, simple analysis | $ |
| DeepSeek V3 | DeepSeek | Near GPT-4o quality, 10-20x cheaper. Very good at coding (especially Python, JS) | .NET/C# not a strong suit. Censorship issues (China-based). High latency at peak. Privacy concerns for enterprise code | Cost-effective coding for non-sensitive code, experimentation | $ |
| Llama 3.3 70B | Meta (open) | Runs locally → zero data leak. Free. Good for sensitive code (payment) | Lower quality than commercial models. Requires a powerful GPU. No built-in agentic tools | Payment-related code review (local, no data sent out), offline backup | Free |
| Qwen 2.5 Coder 32B | Alibaba (open) | Best open-source model for coding. Runs locally. Very good code completion | Smaller than commercial models. Small context window. Not good for architecture reasoning | Local code-completion fallback, sensitive-code assistance | Free |
1.2 AI-Powered Tools (not just models)
| Tool | Type | Model Behind | Strengths | Weaknesses | Best For |
| --- | --- | --- | --- | --- | --- |
| Cursor Pro | IDE | Claude Sonnet 4 + GPT-4o (switchable) | Agentic mode (multi-file edit). @codebase for full context. Excellent tab completion. Composer mode = AI-driven development | Cost ($20/person/month). Lock-in to one IDE. Agent mode sometimes too aggressive | Primary IDE for the team — daily coding, migration, refactoring |
| GitHub Copilot Enterprise | IDE Extension | GPT-4o + Claude (preview) | Deep GitHub integration. Copilot Chat. PR summaries. Knowledge bases | Weaker than Cursor in agentic mode. Autocomplete sometimes generic. Less model-selection control | Alternative if the team uses VS/Rider instead of Cursor |
| Claude Code | CLI Agent | Claude Sonnet 4 / Opus 4 | Terminal-based agentic AI. Self-plans → codes → tests → commits. Batch operations. Script automation | CLI only (no UI). Learning curve. Token cost if used heavily | Batch migration tasks, CI/CD scripting, large-scale refactoring |
| Aider | CLI Agent | Any model (configurable) | Open source. Model-agnostic. Git-aware (auto-commit). Maps the entire repo | Less polished than Claude Code. More complex setup. Community support only | Budget alternative for a CLI agent. Good for git-heavy workflows |
| CodeRabbit | PR Review | Multiple models | Auto-reviews every PR. Contextual suggestions. Tracks patterns across PRs | Sometimes noisy (too many minor comments). Needs rule tuning. Cost per repo | Automated first-pass PR review — reduces human review load |
| Copilot Workspace | Planning | GPT-4o | Plan → implement end-to-end from an issue description. Multi-file. Good UX | New, still rough edges. Sometimes plans the wrong architecture. Tightly coupled to GitHub | Issue → implementation for well-defined tasks |
| Continue.dev | IDE Extension | Any model (configurable) | Open source. Model-agnostic. Custom commands. Tab + Chat + Agent | Less polished than Cursor/Copilot. Community-driven. More bugs | Budget option if IDE AI is needed. Good customizability |
1.3 Specialized AI Tools
| Tool | Purpose | Strengths | Weaknesses |
| --- | --- | --- | --- |
| GitHub Copilot Autofix | Security vulnerability fixes | Auto-fixes CodeQL findings. Integrated into CI | Only works with CodeQL. Does not catch every vulnerability |
| Snyk AI | Security scanning + fixes | Broad vulnerability DB. AI fix suggestions | Cost. Occasional false positives |
| Mintlify / Readme AI | Documentation generation | Auto-generates API docs from code | Needs manual review. Sometimes misses context |
| Figma → React (AI) | UI code generation | Converts designs → React components | Output needs cleanup. Does not understand business logic |
| Ollama | Local model hosting | Run any open-source model locally. Privacy | Needs beefy hardware. Quality below cloud models |
| LM Studio | Local model hosting (GUI) | User-friendly. Easy model switching | Same hardware limitations as Ollama |
2. Model Selection Strategy — Which Model for What
2.1 Decision Matrix
Task Complexity vs Sensitivity Matrix:
```
                       LOW SENSITIVITY               HIGH SENSITIVITY
                       (general code)                (payment, PII, auth)
                      ───────────────────────────── ──────────────────────────
HIGH COMPLEXITY      │ Claude Opus 4               │ Claude Opus 4 (with
(architecture,       │ o3 (for algorithms)         │   strict review gate)
complex logic)       │ Gemini 2.5 Pro (analysis)   │ + Llama 3.3 local
                     │                             │   (double-check)
                      ───────────────────────────── ──────────────────────────
MEDIUM COMPLEXITY    │ Claude Sonnet 4             │ Claude Sonnet 4
(service migration,  │ Cursor Agent mode           │ + mandatory human
business logic)      │ Claude Code (batch)         │   review
                      ───────────────────────────── ──────────────────────────
LOW COMPLEXITY       │ Claude Haiku 3.5            │ Qwen 2.5 Coder
(CRUD, boilerplate,  │ GPT-4o mini                 │   (local, no data leak)
formatting)          │ Cursor Tab completion       │ Llama 3.3 (local)
                      ───────────────────────────── ──────────────────────────
```
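The matrix above can be sketched as a small routing helper. This is an illustrative sketch: the function name and the tier keys are assumptions, not part of any vendor SDK; the model lists mirror the matrix cells.

```python
# Sketch of the complexity × sensitivity routing matrix above.
# Keys are (complexity, sensitivity); values are candidate models,
# preferred first. Names mirror the matrix; the helper is illustrative.

MATRIX = {
    ("high", "low"):    ["Claude Opus 4", "o3", "Gemini 2.5 Pro"],
    ("high", "high"):   ["Claude Opus 4 (strict review gate)",
                         "Llama 3.3 local (double-check)"],
    ("medium", "low"):  ["Claude Sonnet 4", "Cursor Agent mode",
                         "Claude Code (batch)"],
    ("medium", "high"): ["Claude Sonnet 4 + mandatory human review"],
    ("low", "low"):     ["Claude Haiku 3.5", "GPT-4o mini",
                         "Cursor Tab completion"],
    ("low", "high"):    ["Qwen 2.5 Coder (local)", "Llama 3.3 (local)"],
}

def pick_models(complexity: str, sensitivity: str) -> list[str]:
    """Return candidate models for a task, per the decision matrix."""
    return MATRIX[(complexity.lower(), sensitivity.lower())]

print(pick_models("HIGH", "low")[0])  # preferred model for complex general code
```

A helper like this keeps model selection consistent across scripts (e.g. a Claude Code wrapper could call it before dispatching a batch task).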
2.2 Per-Task Model Assignment
| Task | Primary Model | Backup | Reasoning |
| --- | --- | --- | --- |
| Architecture design | Claude Opus 4 | o3 (second opinion) | Deepest reasoning, least hallucination |
| Legacy code analysis | Gemini 2.5 Pro | Claude Opus 4 | 1M context = dump the entire monolith |
| Service scaffolding | Claude Sonnet 4 (Cursor) | Claude Code (CLI) | Best speed/quality for code gen |
| Business logic migration | Claude Sonnet 4 | Claude Opus 4 (review) | Sonnet writes, Opus reviews critical logic |
| React component generation | GPT-4o (multimodal) | Claude Sonnet 4 | GPT-4o reads UI mockups → React |
| Test generation | Claude Sonnet 4 | Claude Haiku 3.5 (unit tests) | Sonnet for contract tests, Haiku for simple units |
| Data migration scripts | Claude Sonnet 4 + Claude Code | — | AI reads schema → generates CDC scripts |
| PR review (auto) | CodeRabbit | GitHub Copilot Review | First pass, flags issues for a human |
| CI/CD pipeline | Claude Code (CLI) | Cursor Agent | Agent generates + tests pipeline code |
| IaC (Bicep) | Claude Sonnet 4 | Claude Code | Template generation, AI validates configs |
| API documentation | Claude Haiku 3.5 | GPT-4o mini | Simple task, a fast model is enough |
| Payment-related code | Llama 3.3 (local) + human | Claude Opus 4 (review only) | Privacy: payment code stays local. Cloud AI only reviews, doesn't see raw PII |
| Debugging | Claude Sonnet 4 (Cursor inline) | Claude Opus 4 (complex bugs) | Cursor inline debug → fast. Opus for hard ones |
| Performance analysis | o3 | Claude Opus 4 | o3 reasoning for algorithmic optimization |
3. Team AI Stack — Recommended Setup
3.1 Per-Engineer Setup
Engineer Workstation:

```
Primary IDE: Cursor Pro ($20/month)
├── Model: Claude Sonnet 4 (default)
├── Switchable: Claude Opus 4 (complex tasks)
├── Switchable: GPT-4o (frontend/multimodal)
├── Agent Mode: ON (multi-file operations)
├── @codebase: indexed (legacy + new code)
└── Custom rules: .cursorrules file per project

CLI Agent: Claude Code ($?? API cost)
├── For: batch migrations, CI/CD scripting
├── Model: Claude Sonnet 4 (default), Opus (complex)
└── Git-aware: auto-branch, auto-commit, auto-PR

Local Models: Ollama
├── Qwen 2.5 Coder 32B (code completion, offline)
├── Llama 3.3 70B (sensitive code review)
└── For: payment code, PII handling, offline fallback

Browser: Claude.ai Pro / ChatGPT Plus
├── For: architecture brainstorming, documentation
├── Claude: deep reasoning, long documents
└── ChatGPT: multimodal (screenshot → code)
```
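The `.cursorrules` file mentioned above holds plain-text project conventions that Cursor injects into every request. The contents below are an assumed example distilled from this playbook's own rules, not a prescribed template:

```text
# .cursorrules — example contents (illustrative, derived from this playbook)
You are working on a .NET Framework → .NET 8 microservices migration.
- Target: .NET 8; enable nullable reference types; prefer async/await.
- Preserve legacy behavior exactly during migration; this is migration,
  not refactoring. Flag ambiguous logic with // TODO: REVIEW.
- Layering: API → Application → Domain → Infrastructure (Clean Architecture).
- Logging via Serilog; tracing via OpenTelemetry; health checks on every service.
- Never generate payment or PII-handling code; those tasks go to local models.
```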
3.2 Team-Level Infrastructure
Team AI Infrastructure:

```
Code Review Pipeline:
  PR Created → CodeRabbit (auto-review) → Human Review

Shared Knowledge:
├── Prompt Library (git repo)
│   ├── migration/analyze-module.md
│   ├── migration/scaffold-service.md
│   ├── migration/generate-tests.md
│   ├── migration/generate-cdc.md
│   ├── review/security-checklist.md
│   └── review/business-logic-validation.md
├── .cursorrules (shared across team)
├── Claude Code CLAUDE.md (project conventions)
└── ADR templates (AI-generated, human-reviewed)

MCP Servers (Model Context Protocol):
├── Database MCP   → AI queries live schema
├── CI/CD MCP      → AI triggers builds, reads results
├── Docs MCP       → AI reads project wiki/Confluence
└── Legacy API MCP → AI calls legacy endpoints in dev

Metrics Dashboard:
├── AI-generated code % per service
├── AI PR review acceptance rate
├── Time per module migration (trending)
├── Bug rate: AI-generated vs human-written code
└── Cost: AI API spend per month
```
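The first two dashboard metrics are simple ratios once PRs are labeled by origin. A minimal sketch, assuming a hypothetical per-PR record shape (`ai_lines`, `human_lines`); how lines get attributed to AI vs human is left to the tooling:

```python
# Sketch of two dashboard metrics: AI-generated code share and
# bug rate by code origin. The PR record shape is hypothetical.

def code_share(prs: list[dict]) -> float:
    """Fraction of merged lines that were AI-generated."""
    ai = sum(p["ai_lines"] for p in prs)
    total = sum(p["ai_lines"] + p["human_lines"] for p in prs)
    return ai / total if total else 0.0

def bug_rate(bugs: int, lines: int) -> float:
    """Bugs per 1,000 lines of code."""
    return 1000 * bugs / lines if lines else 0.0

prs = [
    {"ai_lines": 800, "human_lines": 200},
    {"ai_lines": 500, "human_lines": 500},
]
print(round(code_share(prs), 2))     # share of AI-generated lines
print(round(bug_rate(4, 1300), 1))   # AI-written code, bugs/KLOC
print(round(bug_rate(3, 700), 1))    # human-written code, bugs/KLOC
```

Tracking the two `bug_rate` series side by side is what makes the weekly "trending up or down?" question in §6.3 answerable.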
3.3 Cost Estimation
| Item | Per Engineer/Month | Team (5)/Month | Note |
| --- | --- | --- | --- |
| Cursor Pro | $20 | $100 | Primary IDE |
| Claude API (Cursor + Claude Code) | ~$80-150 | $400-750 | Heavy agentic usage |
| Claude.ai Pro | $20 | $100 | Browser access for brainstorming |
| ChatGPT Plus | $20 | $100 | Multimodal, frontend |
| CodeRabbit | ~$15/seat | $75 | PR review automation |
| Ollama + local models | $0 | $0 | Free, runs on local hardware |
| **Total** | ~$155-225 | ~$775-1,125 | |
ROI Justification: at the top end, ~$1,125/month of AI spend against an estimated 17 extra engineer-months gained works out to ~$66 per engineer-month gained. With an engineer costing $3-5K/month, AI spend is roughly 2-3% of the equivalent human cost for an estimated 63% more output.
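A quick check of the $66 arithmetic. Note the 17 extra engineer-months and the 63% output gain are this plan's own assumptions, not measurements:

```python
# Re-deriving the ROI figure above from the plan's assumptions.
monthly_cost = 1125          # top-end team AI spend, $/month
extra_engineer_months = 17   # assumed gain over the program (plan input)

cost_per_gained_month = monthly_cost / extra_engineer_months
print(round(cost_per_gained_month))  # → 66 ($ per engineer-month gained)
```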
4. AI Workflow — Day-to-Day Operations
4.1 Developer Daily Flow
Morning Standup:
- Review AI-generated PRs from overnight Claude Code runs
- CodeRabbit-flagged issues → address in the first hour

Task Assignment → AI-First Approach:

```
1. Open task (Jira/Linear)
2. Cursor: @codebase → understand context
3. Think: what approach? what files?          ← HUMAN DECISION
4. Cursor Agent: implement across files       ← AI EXECUTES
5. Cursor: generate tests                     ← AI EXECUTES
6. Review AI output: logic correct?           ← HUMAN VALIDATES
7. Run tests locally
8. Push → CodeRabbit auto-review
9. Fix CodeRabbit issues
10. Human reviewer → merge
```

Batch Migration (Claude Code):

```
1. Prep: identify 10 files to migrate              ← HUMAN SCOPES
2. Claude Code: "migrate these 10 files from
   .NET Framework to .NET 8 following template X"  ← AI BATCH-EXECUTES
3. Review diff: business logic intact?             ← HUMAN VALIDATES
4. Run tests: all green?
5. Push PR
```
4.2 Migration-Specific AI Workflow
Module Migration Flow (AI-Driven):

Step 1: AI Analysis (Gemini 2.5 Pro — full codebase context)
- Input: legacy module source code (the entire module)
- Prompt: "Analyze this .NET module. Output: bounded context, dependencies, database tables, API endpoints, business rules, hidden coupling."
- Output: module analysis document (AI-generated, human-reviewed)

Step 2: AI Scaffolding (Claude Sonnet 4 — Cursor Agent)
- Input: analysis doc + .NET 8 service template
- Prompt: "Generate a .NET 8 microservice scaffold for [Module]. Include: project structure, DI setup, EF Core context, API controllers (from legacy), middleware, health checks."
- Output: working .NET 8 project skeleton

Step 3: AI Business Logic Migration (Claude Sonnet 4 / Opus 4)
- Input: legacy business logic files + scaffold
- Prompt: "Migrate this business logic from .NET Framework to .NET 8. Preserve ALL behavior. Flag any ambiguous logic."
- Output: migrated business logic (REQUIRES HUMAN REVIEW)
- Gate: ⚠️ MANDATORY human review of every business rule

Step 4: AI Test Generation (Claude Sonnet 4)
- Input: migrated service + legacy test files (if any)
- Prompt: "Generate: 1) contract tests (Pact) for all APIs, 2) unit tests for business logic, 3) integration tests for DB operations. Cover: happy path + edge cases from legacy behavior."
- Output: test suite (80%+ coverage)

Step 5: AI Data Migration (Claude Code — batch)
- Input: legacy DB schema + new service schema
- Prompt: "Generate CDC setup: 1) source connector (legacy DB), 2) sink connector (new service DB), 3) transformation rules, 4) verification queries."
- Output: CDC configuration + verification scripts

Step 6: AI Review + Human Review (two gates)
- Gate 1: CodeRabbit auto-review → flag issues
- Gate 2: human reviewer, focused on:
  - business logic correctness
  - security implications
  - performance concerns
  - edge cases the AI might miss
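The "verification queries" from Step 5 boil down to comparing the replicated data against the source. A minimal sketch, comparing result sets by per-row checksum; the sample rows are hypothetical stand-ins for real query results from the legacy and new databases:

```python
# Sketch of CDC verification (Step 5): compare two result sets by
# per-row checksum. Rows here are illustrative; in practice they would
# come from equivalent queries against the legacy and new databases.
import hashlib

def row_checksum(row: tuple) -> str:
    return hashlib.sha256("|".join(map(str, row)).encode()).hexdigest()

def verify(legacy_rows, new_rows):
    """Return (ok, mismatched_checksums) for two result sets."""
    legacy = {row_checksum(r) for r in legacy_rows}
    new = {row_checksum(r) for r in new_rows}
    return legacy == new, legacy.symmetric_difference(new)

legacy_rows = [(1, "booking", "2025-01-01"), (2, "event", "2025-01-02")]
new_rows = [(1, "booking", "2025-01-01"), (2, "event", "2025-01-02")]
ok, diff = verify(legacy_rows, new_rows)
print(ok)  # True only when every row replicated faithfully
```

Set-based comparison ignores ordering but catches missing, extra, or silently transformed rows, which is the failure mode CDC pipelines most need guarding against.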
5. Prompt Library — Core Templates
5.1 Migration Prompts
## prompt: analyze-legacy-module
You are analyzing a legacy .NET Framework module for migration to .NET 8 microservices.
Input: [paste module code or use @codebase reference]
Output the following:
1. **Bounded Context**: What domain does this module own?
2. **Dependencies**: What other modules/services does it call?
3. **Database Tables**: Which tables does it read/write?
4. **API Endpoints**: List all controllers + routes
5. **Business Rules**: Extract every business rule (if-then logic, validations, calculations)
6. **Hidden Coupling**: Any shared state, static variables, global config?
7. **Migration Risks**: What could break during migration?
8. **Estimated Effort**: Simple/Medium/Complex per component
Format as structured markdown.
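Because the prompt pins down eight numbered `**bold**` sections, its output can be parsed mechanically and stored or diffed across modules. A sketch, assuming the model emits the headings verbatim (the sample text is invented):

```python
# Sketch: parse the structured markdown the analyze-legacy-module prompt
# requests into a dict keyed by section name. Assumes the model keeps the
# "N. **Section**:" shape; the sample analysis text is illustrative.
import re

def parse_analysis(md: str) -> dict[str, str]:
    sections: dict[str, str] = {}
    current = None
    for line in md.splitlines():
        m = re.match(r"\d+\.\s+\*\*(.+?)\*\*:?\s*(.*)", line)
        if m:
            current = m.group(1)
            sections[current] = m.group(2)
        elif current and line.strip():
            # continuation line for the current section
            sections[current] += ("\n" if sections[current] else "") + line.strip()
    return sections

sample = """1. **Bounded Context**: Travel booking and approvals
2. **Dependencies**: Calls Finance module for cost centers"""
print(parse_analysis(sample)["Bounded Context"])
```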
## prompt: scaffold-dotnet8-service
Generate a .NET 8 microservice for the [ServiceName] bounded context.
Requirements:
- Clean Architecture (API → Application → Domain → Infrastructure)
- EF Core 8 with [DatabaseType] provider
- Minimal API or Controller-based (follow team convention)
- Azure Service Bus integration for events
- Health checks endpoint
- Structured logging (Serilog)
- OpenTelemetry for tracing
- Docker support (Dockerfile + docker-compose)
Include:
- Project structure with all .csproj files
- DI registration
- appsettings.json template
- Dockerfile
- Initial migration
Do NOT include:
- Authentication (handled by API Gateway)
- Complex business logic (will be migrated separately)
## prompt: migrate-business-logic
Migrate the following .NET Framework business logic to .NET 8.
Source code:
[paste legacy code]
Rules:
1. Preserve ALL existing behavior — this is migration, not refactoring
2. Update syntax: async/await patterns, nullable reference types, records where appropriate
3. Replace deprecated APIs with .NET 8 equivalents
4. Keep the same method signatures unless technically required to change
5. Flag with // TODO: REVIEW any logic that seems ambiguous or potentially incorrect
6. Add XML doc comments for complex business rules
Output: Complete migrated code + list of changes made + list of items flagged for review.
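Rule 5's `// TODO: REVIEW` markers are only useful if reviewers can find them all. A small helper sketch that lists flagged lines from a migrated file; the C# snippet inside is an invented example of flagged output:

```python
# Sketch: collect every line the model flagged with "// TODO: REVIEW"
# so reviewers can jump straight to ambiguous logic. The embedded C#
# snippet is an illustrative example of migrated output.

def review_flags(source: str) -> list[tuple[int, str]]:
    """Return (line_number, line) for each flagged line."""
    return [(n, line.strip())
            for n, line in enumerate(source.splitlines(), start=1)
            if "// TODO: REVIEW" in line]

migrated = """public decimal Allocate(decimal amount)
{
    // TODO: REVIEW - legacy code rounded down here; intentional?
    return Math.Floor(amount * Rate);
}"""
for n, line in review_flags(migrated):
    print(n, line)
```

Run across a PR's changed files, this gives the human reviewer a checklist of exactly the spots the model was unsure about.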
## prompt: generate-contract-tests
Generate Pact contract tests for the following API migration.
Legacy API spec:
[paste OpenAPI/Swagger or controller code]
New API spec:
[paste new service controllers]
Requirements:
1. Consumer-driven contract tests
2. Verify: every legacy endpoint has equivalent new endpoint
3. Verify: response schemas are backward compatible
4. Verify: error codes are preserved
5. Test: pagination, filtering, sorting if applicable
6. Include: provider state setup
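Requirement 2 above (every legacy endpoint has an equivalent new endpoint) is a set difference once both specs are reduced to (method, route) pairs. A sketch with invented endpoint lists; in practice these would be extracted from the OpenAPI specs:

```python
# Sketch of contract-test requirement 2: legacy endpoints missing from
# the new service. Endpoint sets are illustrative; real ones would be
# pulled from the two OpenAPI specs.

def missing_endpoints(legacy: set[tuple[str, str]],
                      new: set[tuple[str, str]]) -> set[tuple[str, str]]:
    """(method, route) pairs present in legacy but absent in new."""
    return legacy - new

legacy = {("GET", "/bookings"), ("POST", "/bookings"), ("GET", "/bookings/{id}")}
new = {("GET", "/bookings"), ("POST", "/bookings")}
print(missing_endpoints(legacy, new))  # → {('GET', '/bookings/{id}')}
```

Making this check a CI step means a dropped endpoint fails the build before any consumer notices.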
5.2 Review Prompts
## prompt: review-ai-generated-code
Review this AI-generated code for a .NET 8 microservice migration.
Code to review:
[paste code]
Original legacy code:
[paste original]
Check for:
1. **Behavior preservation**: Does new code do EXACTLY what old code did?
2. **Security**: SQL injection, XSS, auth bypass, secrets exposure?
3. **Performance**: N+1 queries, missing indexes, unnecessary allocations?
4. **Error handling**: Are all legacy error paths preserved?
5. **Edge cases**: Null handling, boundary values, concurrent access?
6. **AI hallucination**: Any method calls that don't exist? Made-up APIs?
7. **.NET 8 best practices**: Proper async/await, DI usage, configuration?
Output: List of issues (Critical/Warning/Info) with specific line references.
6. Governance & Safety Rules
6.1 Non-Negotiable Rules
🔴 ABSOLUTE RULES (zero exceptions):
1. NEVER commit AI-generated code without running tests
2. NEVER skip human review for business logic changes
3. NEVER send payment/PII data to cloud AI models
4. NEVER use AI output for security decisions without human sign-off
5. NEVER deploy AI-generated IaC without human review of permissions/networking
6. NEVER trust AI-generated SQL migrations without dry-run on staging
7. ALL prompts that generate production code must be versioned in git
6.2 Review Gate Matrix
| Code Category | AI Generate? | AI Review? | Human Review? | Deploy Gate |
| --- | --- | --- | --- | --- |
| Boilerplate / CRUD | ✅ Fully | ✅ CodeRabbit | ✅ Quick scan | CI pass |
| Business logic | ✅ Draft | ✅ CodeRabbit | ✅ Deep review | CI + reviewer approval |
| Payment code | ⚠️ Local AI only | ✅ (no PII in prompt) | ✅ 2 reviewers | CI + 2 approvals + security scan |
| Data migration | ✅ Generate | ✅ CodeRabbit | ✅ Deep review | Staging dry-run + approval |
| IaC / Infrastructure | ✅ Generate | ✅ CodeRabbit | ✅ Deep review | Plan review + approval |
| CI/CD pipeline | ✅ Generate | ✅ CodeRabbit | ✅ Quick scan | Test in feature branch first |
| API contracts | ✅ Draft | ✅ CodeRabbit | ✅ Contract test must pass | Consumer tests green |
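The gate matrix can be enforced mechanically in CI rather than by convention. A sketch covering four of the categories; the category keys and PR field names are assumptions for illustration:

```python
# Sketch: the review-gate matrix as an enforceable merge check.
# Category keys and PR fields are hypothetical; thresholds mirror the table.

GATES = {
    "boilerplate":    {"approvals": 1, "security_scan": False, "staging_dry_run": False},
    "business":       {"approvals": 1, "security_scan": False, "staging_dry_run": False},
    "payment":        {"approvals": 2, "security_scan": True,  "staging_dry_run": False},
    "data_migration": {"approvals": 1, "security_scan": False, "staging_dry_run": True},
}

def may_merge(category: str, pr: dict) -> bool:
    gate = GATES[category]
    return (pr["ci_green"]
            and pr["approvals"] >= gate["approvals"]
            and (pr["security_scan_passed"] or not gate["security_scan"])
            and (pr["staging_dry_run_ok"] or not gate["staging_dry_run"]))

pr = {"ci_green": True, "approvals": 1,
      "security_scan_passed": False, "staging_dry_run_ok": False}
print(may_merge("business", pr))  # one approval + green CI is enough
print(may_merge("payment", pr))   # blocked: needs 2 approvals + security scan
```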
6.3 Weekly AI Health Check
Every week, the Tech Lead runs through:
- AI-generated code bug rate vs human-written: trending up or down?
- CodeRabbit rejection rate: are AI PRs getting better?
- Any security findings in AI-generated code this week?
- API cost: within budget?
- Team sentiment: is AI helping or frustrating?
- Knowledge check: pick a random AI-generated file and ask an engineer to explain it
7. Model Comparison — Quick Reference Card
7.1 Coding Quality Ranking (for .NET 8 / React)
```
Backend (.NET 8 / C#):
  1. Claude Opus 4     ████████████████████  (best .NET understanding)
  2. Claude Sonnet 4   █████████████████     (excellent, slightly less depth)
  3. GPT-4o            ██████████████        (good but less .NET-specific)
  4. o3                █████████████         (good reasoning but slow)
  5. Gemini 2.5 Pro    ████████████          (decent, not .NET-focused)
  6. DeepSeek V3       ███████████           (good general, weak .NET)
  7. Llama 3.3 70B     █████████             (local option, acceptable)
  8. Qwen 2.5 Coder    ████████              (local, good completion)

Frontend (React 18 / TypeScript):
  1. Claude Sonnet 4   ████████████████████  (best React code gen)
  2. GPT-4o            ███████████████████   (great, especially from mockups)
  3. Claude Opus 4     █████████████████     (excellent but overkill for UI)
  4. DeepSeek V3       ████████████████      (strong JS/TS ecosystem)
  5. Gemini 2.5 Pro    ███████████████       (decent)
  6. Qwen 2.5 Coder    ██████████████        (good for components)
  7. Llama 3.3 70B     ████████████          (acceptable)

Long-Context Analysis (legacy codebase scan):
  1. Gemini 2.5 Pro    ████████████████████  (1M tokens, dump entire codebase)
  2. Claude Opus 4     █████████████████     (200K, excellent analysis)
  3. Claude Sonnet 4   ████████████████      (200K, fast analysis)
  4. GPT-4o            ████████████          (128K, decent)
  5. Others            ████████              (limited context)

Agentic / Autonomous Tasks:
  1. Claude Sonnet 4   ████████████████████  (best agentic model — Cursor, Claude Code)
  2. Claude Opus 4     ███████████████████   (powerful but expensive for agentic loops)
  3. GPT-4o            ██████████████        (good tool use, less autonomous)
  4. Others            ██████████            (not recommended for agentic work)
```
7.2 When to Switch Models
```
Start with Claude Sonnet 4 (Cursor default)
│
├── Task too complex → switch to Claude Opus 4
│     (architecture decisions, complex debugging, critical business logic)
│
├── Need to analyze a massive codebase → switch to Gemini 2.5 Pro
│     (dump 50+ files, get a cross-module dependency map)
│
├── Need UI from a mockup → switch to GPT-4o
│     (screenshot → React component)
│
├── Algorithm optimization → switch to o3
│     (allocation algorithms, complex scheduling logic)
│
├── Simple boilerplate → switch to Claude Haiku 3.5
│     (save cost, fast enough)
│
└── Sensitive code (payment) → switch to Llama 3.3 local
      (zero data leak, privacy guaranteed)
```
8. Risks of AI-Heavy Approach & Mitigations
| Risk | Severity | Mitigation |
| --- | --- | --- |
| "Nobody understands the code" — AI writes 70%, humans forget the logic | HIGH | Weekly code walkthrough. Random "explain this" tests. Engineers OWN their services |
| AI model deprecation — provider retires a model | MEDIUM | Multi-provider strategy. No lock-in to a single model. Open-source fallback (Llama, Qwen) |
| AI cost spiral — team uses expensive models for simple tasks | LOW | Default to the cheapest adequate model. Monitor API costs weekly. Model-selection guidelines |
| Security breach via AI — proprietary code leaks to a provider | MEDIUM | Payment code = local models only. Cursor privacy mode. Enterprise agreements with providers |
| AI-generated tech debt — AI writes "works but ugly" code | MEDIUM | Style guides enforced via linter (not AI). Refactoring sprints every 3rd sprint. Code quality metrics |
| Prompt injection via legacy code — malicious strings in legacy DB/code trick the AI | LOW | Sanitize inputs to AI. Don't pipe raw user data into prompts. Review AI output, don't auto-execute |
9. 30-60-90 Day AI Adoption Plan
Days 1-30: Foundation

Week 1:
- Cursor Pro licenses for all 5 engineers
- Claude Code setup (CLI)
- Ollama + Llama 3.3 + Qwen 2.5 Coder installed locally
- CodeRabbit connected to the repo

Week 2:
- Legacy codebase indexed in Cursor (@codebase working)
- First prompt templates shipped (analyze, scaffold, migrate)
- .cursorrules file configured for project conventions
- Team workshop: "Agentic AI workflow for .NET migration"

Week 3:
- MCP connections: database, CI/CD, docs
- Pilot migration: Communications module (AI-driven, measured)
- Metrics dashboard live (AI code %, bug rate, cost)

Week 4:
- Pilot results reviewed; prompts calibrated
- AI migration pipeline validated end-to-end
- Team confident: each engineer has completed ≥ 1 AI-driven task
- Go/No-Go decision for the AI-heavy approach

Days 31-60: Acceleration

Weeks 5-6:
- Travel Booking migration (AI-driven)
- React 18 page generation (GPT-4o from mockups)
- Prompt library v2 (refined from pilot learnings)

Weeks 7-8:
- Event Management migration starts (reusing Travel templates)
- AI-generated contract tests for all extracted services
- First AI-powered monitoring alerts live

Days 61-90: Scale

Weeks 9-10:
- Reporting service (CQRS): AI generates read models from legacy SQL
- Workforce Management extraction starts
- AI cost optimization: switch to cheaper models for proven tasks

Weeks 11-12:
- Event-driven architecture fully wired
- AI anomaly detection live in staging
- Team retrospective: what's working, what's not; adjust strategy