Repo Watch

27 Mar 2026

Repo Watch week one: what I built

A timeline of everything shipped in the first eight days — from initial commit to complexity risk scoring and dark mode.

Eight days. One product. Here is how it went.

The starting point

A few days before the first commit, on 17 March, I was scrolling LinkedIn over coffee and kept landing in the same conversation — just in different forms. Developers debating whether to trust AI-generated code in review. Agilists frustrated that delivery output had doubled but quality signals had not kept up. CTOs asking how to manage the risk of engineers quietly building on tools the organisation had no visibility over. I closed the laptop and caught up with some colleagues over the next week. We talked about agile software delivery, about what it looks like when a large organisation is trying to adapt to AI versus a smaller autonomous group that can just make decisions and move. The gap felt real and growing.

After those conversations and some reading, I sat down with a clear idea: a tool that gives teams immediate visibility into what is actually in their repositories. Not a slow audit, not an expensive consultant or AppSec reporting tool: just a scored report in minutes that does not even require a connected repository.

The first commit landed on 22 March 2026. Not a scaffold — a checkpoint of real working software: auth, scanning, scoring, billing, worker, and the CI pipeline to deploy it all. The goal from day one was to ship something real, not a demo.

Five days from that coffee to the first commit. Nine days to a live product with billing, scanning, and a CI pipeline deployed to GCP. That kind of pace is only possible when AI tooling closes the gaps that would normally slow you down — boilerplate, scaffolding, research, wiring up services you have configured a dozen times before. The experience to know what to build, and in what order, still matters. But the time it takes to actually build it has collapsed. That is genuinely exciting. It is also exactly why Repo Watch exists: when everyone on your team can ship at this speed, what actually got shipped becomes a more important question, not a less important one.

What followed was a week of hardening, closing gaps, and shipping the first two versioned releases.

v0.1.0

As a developer — whether I wrote it line by line or shipped it with AI — I want to connect my repository and get a scored health report in minutes, so I can surface the key areas worth looking at without reading every file.

The foundation: OAuth sign-in, ZIP upload with archive validation, Semgrep static analysis, Gitleaks secret detection, a scoring engine with section breakdowns and explainers, Stripe billing plans, and PDF export.

Scan Results View

Typical v0.1.0 result:

  • Code Quality (score 78): Moderate dependency footprint. One nested-loop signal.
  • Test Confidence (score 65): Structural test coverage present. Happy-path bias.
  • Security Hygiene (score 74): Two medium findings from Semgrep. No secrets detected.
  • AI-Risk Indicators (score 80): No high-confidence AI-risk signals.

Scores and notes are illustrative. Actual output varies by repository.

v0.2.0: complexity risk scoring

As a developer reviewing inherited code, I want to see which patterns will become performance bottlenecks as traffic grows, so I can prioritise refactoring before it becomes a production incident.

The big feature in v0.2.0 was complexity risk as a scored section.

The idea is simple: static analysis can flag code patterns that will slow down as data or traffic grows. A nested loop over a database result set is not a bug today. It becomes a problem at 10× the current load. The complexity risk section surfaces those patterns before that happens.

Each scan now runs a set of Semgrep rules targeting:

  • Nested loops (O(n²) growth risk)
  • Nested functional iteration (.map inside .forEach, and similar)
  • Linear search inside loops (find/filter/includes in loop context)
  • Repeated in-loop sorting
  • Network or database calls inside loops — the classic N+1 pattern
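To make that last pattern concrete, here is a hedged sketch of the N+1 shape next to a batched alternative. The `db.getUser` and `db.getUsers` calls are hypothetical APIs for illustration, not Repo Watch code or a specific client library:

```javascript
// N+1 shape: one round trip per item. Looks fine in review,
// degrades linearly with the number of posts at runtime.
async function loadAuthorsNPlusOne(posts, db) {
  const authors = [];
  for (const post of posts) {
    authors.push(await db.getUser(post.authorId)); // one call per post
  }
  return authors;
}

// Batched alternative: deduplicate ids, then one bulk fetch.
async function loadAuthorsBatched(posts, db) {
  const ids = [...new Set(posts.map((p) => p.authorId))];
  return db.getUsers(ids); // single round trip
}
```

The static rule only needs to spot the `await` inside the loop; it does not have to prove the batched version is available, which is why findings like this are flagged for review rather than auto-fixed.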

These patterns show up disproportionately in AI-assisted codebases — not because AI tools write bad code, but because they have no view of data volumes, traffic expectations, or what indexes already exist. They solve the immediate problem correctly and move on. The result is code that looks fine in review and falls over at 10× load. That is not a criticism of using AI tooling; it is just a consequence of how code generation works today, and it is exactly the kind of gap a static scan can surface before it reaches production.

Findings are charted on a scatter plot where the x-axis represents how performance is likely to scale (from O(1) to O(n³+)) and the y-axis shows estimated risk. Points to the right and up are the ones worth reviewing first.

When a pattern is not flagged, that absence is surfaced as a positive signal — so you can see what is working, not just what isn't.

Complexity risk graph: complexity findings and positive signals (3 findings · 2 positive)

Each point represents a code pattern that may slow down as data or traffic grows. The x-axis shows how performance is likely to scale, and the y-axis shows estimated risk. These are static analysis signals, not measured performance data.

Overall score impact: section score 58 at 10% weight contributes about 6 points to the overall score.
[Scatter plot: x-axis "how performance scales" (const through very fast), y-axis "estimated runtime risk"]

Nested loop over query result · Grows fast · risk 78

A for…of loop iterates over a database result set, and a second for…of loop is nested inside it. At moderate data volumes this causes significant performance degradation.

Improvement

Pre-index the inner collection with a Map or Set before the outer loop. Reduce to a single pass.

for (const repo of repos) {
  for (const scan of scans) {
    if (scan.repoId === repo.id) { ... }
  }
}
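The suggested improvement could look like this single-pass version. The `repos` and `scans` shapes here are illustrative, matching the snippet above, not real scan output:

```javascript
// Pre-index scans by repoId once, then make a single pass over repos.
// Runs in O(repos + scans) instead of O(repos × scans).
function scansByRepo(repos, scans) {
  const index = new Map();
  for (const scan of scans) {
    if (!index.has(scan.repoId)) index.set(scan.repoId, []);
    index.get(scan.repoId).push(scan);
  }
  return repos.map((repo) => ({
    repo,
    scans: index.get(repo.id) ?? [], // empty array when no scans match
  }));
}
```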

The graph and findings above are illustrative examples, not real scan output.

The section contributes 10% to the overall score. Each finding includes plain-English guidance: what the pattern is, why it matters at scale, how to fix it, and the relevant code snippet.
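The arithmetic behind that contribution is a plain score-times-weight sum. This sketch assumes a simple weighted total, which is an assumption about how Repo Watch combines sections; only the complexity section's 10% weight comes from the post:

```javascript
// Each section contributes score × weight to the overall score.
function overallScore(sections) {
  return sections.reduce((total, s) => total + s.score * s.weight, 0);
}

// A complexity section scoring 58 at 10% weight contributes
// 58 × 0.10 = 5.8, i.e. about 6 points, as described above.
const complexity = { name: "Complexity Risk", score: 58, weight: 0.1 };
```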

Dark mode

Oh, and dark mode shipped too. For the vibe coders — you know who you are.


Eight days in. Two releases shipped. The scan pipeline is real, the billing is live, and the highest-impact new section is now scoring production repositories.

If your team is moving fast — and most teams are right now — the answer is not to slow them down. Give them a space to build, let them experiment, and use a tool like this to see what they are actually shipping. You get visibility into the patterns worth reviewing without dismantling the creativity and momentum that is driving your business forward. That is the balance worth finding.

Run your first scan

Sign in for 3 free scans a month. Paid plans unlock more scans, connected repositories, and priority processing.

No credit card required. Connect a GitHub repository or upload a ZIP to start.

Back to blog

AI-risk indicators are signals, not proof.