AI-Generated Code Is Growing Your Attack Surface—Retrofit DAST + API Discovery Gates Without Slowing CI/CD
AI-assisted coding is accelerating merges faster than most teams can validate them—and the result is a quietly expanding attack surface. This post outlines a practical way to add DAST and agent/API discovery gates to an existing CI/CD pipeline so modernization velocity doesn’t become long-lived security debt.
AI-assisted development has changed the shape of risk. Code arrives faster, in larger batches, and often with fewer humans deeply understanding the edges. If your CI/CD pipeline was built for “human-authored diffs,” you’re likely shipping new routes, handlers, and third-party calls that never hit meaningful security validation.
Snyk recently summarized the dynamic well: delivery speed has outpaced validation, and 62% of LLM-generated code tested as insecure. Add AI agents that call undocumented APIs, and you end up with gaps that legacy security tools—and even mature test suites—can miss. The good news: you don’t need to halt delivery to catch up. You can retrofit testing gates that focus on what AI changes tend to break: exposure, integration drift, and unobserved surface area.
Context: why AI-generated changes expand risk in maintenance and modernization work

Maintenance and modernization teams are uniquely exposed. You’re often:
- Updating frameworks and runtime versions
- Refactoring monoliths into services
- Wrapping legacy systems with new APIs
- Merging AI-authored patches into older code that lacks comprehensive tests
That combination makes it easy to ship functionality that “works” in happy paths but quietly introduces:
- New endpoints (sometimes without auth parity)
- Weak input validation or inconsistent encoding
- New dependency chains
- Calls to internal or legacy admin APIs that were never meant for general use
Snyk’s piece, “AI Is Building Your Attack Surface. Are You Testing It?”, frames the core problem: AI increases throughput, but it also increases the volume of changes that can create exploitable behaviors. Worse, AI agents can interact with systems through “tribal knowledge” endpoints—things nobody documented because they were never intended to last.
At the same time, the broader threat landscape isn’t slowing down. Botnets, supply chain attacks, and rapidly weaponized public vulnerabilities (as covered across security reporting from outlets like KrebsOnSecurity and BleepingComputer) thrive on exactly these gaps: exposed services, stale dependencies, and misconfigured interfaces.
Main analysis: where legacy gates fall short with LLMs and agents
Most pipelines already have some combination of unit tests, SAST, dependency scanning, and maybe container scanning. Those are necessary—but AI introduces patterns that can slip through.
1) “It compiles” is not validation
LLM-generated code is often syntactically correct and plausibly structured. That makes it easy to accept changes based on green builds alone. But security failures tend to be semantic:
- Missing authorization checks on a newly added route
- SSRF via a “convenience” URL fetch helper
- IDOR from overly broad resource lookups
- Deserialization hazards in “quick” integrations
These issues frequently don’t show up in unit tests unless you already had security-focused test cases.
2) SAST and dependency scanning don’t see runtime exposure
Static tooling is great at known classes of mistakes and vulnerable packages, but it doesn’t reliably answer:
- What endpoints are now reachable from the internet?
- What parameters can be influenced by untrusted callers?
- What auth flows can be bypassed due to proxy/header behavior?
These are runtime questions, and static analysis alone cannot settle them. That gap is what Snyk points to when it says speed has outpaced validation and cites the insecure rate observed in testing LLM-generated code.
3) AI agents can create “shadow integrations” via undocumented APIs
Modern dev teams increasingly run agents that:
- Open PRs based on issue text
- Call internal services to retrieve context
- Stitch together workflows across SaaS and internal APIs
Agents often “discover” APIs by reading code, logs, or examples—and may call endpoints that aren’t in your OpenAPI specs, Postman collections, or official docs. Those calls can become production dependencies. If your security gates only validate documented surfaces, you’re blind to what the agent actually used.
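To make that blind spot concrete, here is a minimal sketch that compares calls observed in traffic (for example, from a proxy log of an agent run) against the operations your spec documents. The function names and the `(method, path)` tuple shape are assumptions for illustration, not the API of any particular tool, and the template matching only handles simple `{param}` path segments.

```python
import re

def template_to_regex(template: str) -> "re.Pattern[str]":
    """Turn an OpenAPI-style path template such as /users/{id} into a regex."""
    parts = re.split(r"(\{[^/{}]+\})", template)
    pattern = "".join(
        "[^/]+" if part.startswith("{") and part.endswith("}") else re.escape(part)
        for part in parts
    )
    return re.compile(f"^{pattern}$")

def undocumented_calls(observed, documented):
    """Return observed (method, path) pairs that match no documented operation.

    observed:   iterable of (method, concrete_path), e.g. ("GET", "/users/42?x=1")
    documented: iterable of (method, path_template), e.g. ("GET", "/users/{id}")
    """
    compiled = [(method.upper(), template_to_regex(tpl)) for method, tpl in documented]
    missing = set()
    for method, path in observed:
        method = method.upper()
        path = path.split("?", 1)[0]  # ignore the query string when matching
        if not any(m == method and rx.match(path) for m, rx in compiled):
            missing.add((method, path))
    return sorted(missing)
```

Anything this returns is a candidate "shadow integration": either document and own it, or stop calling it.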
The retrofit strategy: DAST + agent/API discovery as CI/CD gates
You don’t need a massive re-platform to fix this. The practical goal is:
- Detect new or changed runtime surface area (API discovery)
- Exercise it like an attacker would (DAST)
- Gate releases based on risk and change scope, not by making every build painfully slow
Below is a pattern that works well for maintenance and modernization teams because it’s additive: you can layer it onto existing CI/CD.
Step 1: Add an ephemeral “scan environment” stage
Why it matters
DAST needs something running. The fastest way to do this without slowing delivery is to standardize a short-lived environment per build or per release candidate.
Implementation pattern
- In CI, build an artifact (container image or deployable package).
- Deploy it to an ephemeral environment (Kubernetes namespace, ephemeral VM, preview environment).
- Seed minimal test data.
- Run smoke tests first to confirm the environment is stable.
This is modernization-friendly: you can do it even if the app is legacy, as long as you can stand up a runnable instance (often via containers, even if production isn’t fully containerized yet).
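The "confirm the environment is stable" step can be sketched as a small readiness poll that runs before any scan starts. This is a rough illustration using only the Python standard library; `wait_until_ready`, `http_probe`, and the idea of a health URL are assumptions for the example, and your deployment tooling may already provide an equivalent.

```python
import time
import urllib.request
from typing import Callable

def wait_until_ready(
    probe: Callable[[], bool],
    timeout_s: float = 60.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll `probe` until it returns True or `timeout_s` elapses."""
    deadline = clock() + timeout_s
    while True:
        try:
            if probe():
                return True
        except Exception:
            pass  # errors such as "connection refused" mean "not ready yet"
        if clock() >= deadline:
            return False
        sleep(interval_s)

def http_probe(url: str) -> Callable[[], bool]:
    """Build a probe that succeeds when `url` answers with HTTP 200."""
    def _probe() -> bool:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    return _probe
```

In a pipeline you might call `wait_until_ready(http_probe(health_url))`, where `health_url` is whatever health endpoint your scan environment exposes, and skip the DAST stage with a clear error if it returns `False`.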
Step 2: Perform API discovery focused on “what changed”
What discovery should do
API discovery is not just inventory for compliance—it’s a change detector for attack surface.
At minimum, you want to answer:
- Did this PR introduce new routes?
- Did it change HTTP methods, auth requirements, or parameter shapes?
- Did it add new outbound calls to internal services?
Practical approaches
- Traffic-based discovery: run your integration tests (or basic scripted flows) while capturing API calls and endpoints reached.
- Spec diffing: if you have OpenAPI, diff generated specs vs baseline. If you don’t, consider generating a “best-effort” spec from routing metadata.
- Agent-aware discovery: if AI agents run workflows that hit internal APIs, capture and track those calls as first-class interfaces.
The key is to treat “undocumented but used” APIs as real. Snyk’s warning about agents using undocumented APIs is exactly the scenario where teams later discover they’ve been depending on—and exposing—interfaces nobody reviews.
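As one way to picture the spec-diffing approach, the sketch below compares two OpenAPI-shaped dictionaries and reports added, removed, and changed operations. It is deliberately simplified: it assumes the specs are already parsed into dicts, looks only at operation-level parameters, and reduces "auth requirements" to whether any `security` requirement applies. The function names are illustrative.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}

def operations(spec: dict) -> dict:
    """Flatten a spec into {(METHOD, path): summary} for easy comparison."""
    ops = {}
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method.lower() not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            # Operation-level security overrides the spec-wide default;
            # an explicit empty list means "no auth required".
            security = op.get("security", spec.get("security", []))
            params = sorted(
                (p.get("in", ""), p.get("name", "")) for p in op.get("parameters", [])
            )
            ops[(method.upper(), path)] = {"secured": bool(security), "params": params}
    return ops

def diff_specs(baseline: dict, candidate: dict) -> dict:
    """Report operations that were added, removed, or changed shape."""
    old, new = operations(baseline), operations(candidate)
    return {
        "added": sorted(k for k in new if k not in old),
        "removed": sorted(k for k in old if k not in new),
        "changed": sorted(k for k in new if k in old and new[k] != old[k]),
    }
```

Attaching the result to the PR gives reviewers a surface-area diff next to the code diff, and the `added` and `changed` lists can serve as the target list for a scoped scan.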
Gate recommendation
Fail (or require approval) when:
- A new public-facing endpoint appears without an associated auth policy
- A new endpoint has no tests and no documented owner
- Undocumented endpoints are called from automation/agent workflows without being registered
This is a lightweight governance layer that doesn’t require a full documentation overhaul on day one.
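The three conditions above translate fairly directly into a check that a CI job could run against the discovery output. The `Endpoint` record and its field names below are hypothetical; in practice you would populate them from whatever your discovery step and ownership registry actually produce.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

@dataclass(frozen=True)
class Endpoint:
    method: str
    path: str
    is_new: bool = False             # first seen in this change
    public: bool = False             # reachable from outside the trust boundary
    auth_policy: Optional[str] = None
    has_tests: bool = False
    owner: Optional[str] = None
    called_by_agent: bool = False    # observed in automation/agent traffic
    registered: bool = True          # present in the API inventory

def gate_violations(endpoints: Iterable[Endpoint]) -> List[str]:
    """Return human-readable reasons to fail the gate or require approval."""
    violations = []
    for ep in endpoints:
        label = f"{ep.method} {ep.path}"
        if ep.is_new and ep.public and not ep.auth_policy:
            violations.append(f"{label}: new public endpoint without an auth policy")
        if ep.is_new and not ep.has_tests and not ep.owner:
            violations.append(f"{label}: new endpoint with no tests and no owner")
        if ep.called_by_agent and not ep.registered:
            violations.append(f"{label}: unregistered endpoint called by automation")
    return violations
```

An empty list means the gate passes; a non-empty list can either fail the build or be posted as a request for approval, depending on how strict you want the first rollout to be.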
Step 3: Run DAST in two tiers (fast PR checks + deep nightly)
DAST has a reputation for being slow. The trick is to scope it.
Tier A: “PR DAST” (fast, scoped)
Run on pull requests that:
- Add routes/controllers
- Touch auth/session code
- Change input validation/serialization
- Modify API gateway/proxy config
Configuration:
- Scan only the endpoints discovered in Step 2 (delta scan)
- Time-box to a small budget (e.g., 5–10 minutes)
- Fail only on high-confidence, high-severity issues (auth bypass, injection, exposed admin endpoints)
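A rough sketch of that configuration: loop over the delta targets, stop starting new scans once the time budget is spent, and block only on findings that are both high severity and high confidence. The `scan_one` callable stands in for whatever DAST tool you invoke, and the finding dictionary keys (`severity`, `confidence`) are assumed for illustration rather than taken from a specific scanner's output format.

```python
import time
from typing import Callable, Iterable, List, Tuple

def time_boxed_scan(
    targets: Iterable[str],
    scan_one: Callable[[str], List[dict]],
    budget_s: float = 600.0,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[List[dict], List[str]]:
    """Scan targets in order until the budget runs out.

    The budget is checked between targets, so a scan that is already
    running is not interrupted. Returns (findings, skipped_targets).
    """
    deadline = clock() + budget_s
    findings: List[dict] = []
    skipped: List[str] = []
    for target in targets:
        if clock() >= deadline:
            skipped.append(target)  # leave these for the deeper scheduled scan
            continue
        findings.extend(scan_one(target))
    return findings, skipped

def blocking(findings: Iterable[dict]) -> List[dict]:
    """Keep only findings that should fail a PR: high severity and high confidence."""
    return [
        f for f in findings
        if f.get("severity") == "high" and f.get("confidence") == "high"
    ]
```

Reporting the skipped targets matters as much as the findings: it tells reviewers which endpoints were not exercised within the budget.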
Tier B: “Release/nightly DAST” (deep, broad)
Run daily or on release candidates:
- Full crawl of the application
- More aggressive tests and longer runtime
- Broader rule set (including medium severity)
This preserves delivery speed while still giving you comprehensive coverage.
Step 4: Add “risk-based gating” rather than “one gate for everything”
To avoid slowing delivery, gates should be proportional to risk.
A practical policy:
- Low-risk changes (docs, UI text, refactors with no route changes): existing unit tests + SAST/deps only
- Medium-risk changes (business logic, data models): add API discovery + time-boxed PR DAST
- High-risk changes (auth, routing, file upload, deserialization, gateway rules): require PR DAST + manual security review and deeper scans
This is especially important in maintenance programs where many PRs are routine (dependency bumps, small compatibility edits) and shouldn’t be treated like brand-new product work.
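One simple way to approximate this policy is to classify a PR by the paths it touches and map each tier to a list of gates. The glob patterns and gate names below are placeholders, not a recommended taxonomy; file paths are only a proxy for risk, so a sketch like this works best combined with the API diff from Step 2 (a "refactor" that changes routes should not stay low-risk).

```python
from fnmatch import fnmatchcase
from typing import List, Sequence

# Hypothetical patterns: adjust to your repository layout.
HIGH_RISK = ("*/auth/*", "*/routes/*", "*upload*", "*gateway*", "*serializ*")
LOW_RISK = ("docs/*", "*.md", "*/locales/*")

BASE_GATES = ["unit-tests", "sast", "dependency-scan"]
TIER_GATES = {
    "low": BASE_GATES,
    "medium": BASE_GATES + ["api-discovery", "pr-dast"],
    "high": BASE_GATES + ["api-discovery", "pr-dast", "manual-security-review", "deep-scan"],
}

def _matches(path: str, patterns: Sequence[str]) -> bool:
    return any(fnmatchcase(path, pattern) for pattern in patterns)

def risk_tier(changed_files: Sequence[str]) -> str:
    """Classify a change set as 'low', 'medium', or 'high' risk."""
    if any(_matches(f, HIGH_RISK) for f in changed_files):
        return "high"  # a single sensitive file escalates the whole PR
    if changed_files and all(_matches(f, LOW_RISK) for f in changed_files):
        return "low"
    return "medium"    # default when nothing marks the change as clearly safe

def required_gates(changed_files: Sequence[str]) -> List[str]:
    return TIER_GATES[risk_tier(changed_files)]
```

Defaulting to "medium" is a deliberate choice in this sketch: unknown paths get scanned rather than waved through.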
Step 5: Make results actionable (and hard to ignore)
Security gates fail when they become noisy. Improve signal with:
- Deduplication and baselining: don’t re-alert on known accepted risks every build; track them explicitly.
- Ownership mapping: every endpoint/service needs an owner (team or system).
- Fix guidance: link findings to specific code lines, commits, or endpoint diffs.
This reduces the “tool fatigue” that causes teams to bypass gates.
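Deduplication and baselining can be sketched with a stable fingerprint per finding. The fields used here (`rule_id`, `method`, `path`, `param`) are assumptions about what your scanner reports; the point is to key on things that identify the issue rather than on volatile details like timestamps or scan IDs.

```python
import hashlib
from typing import Iterable, List, Set

def fingerprint(finding: dict) -> str:
    """Stable identifier for a finding, independent of when it was reported."""
    key = "|".join([
        finding["rule_id"],
        finding["method"].upper(),
        finding["path"],
        finding.get("param", ""),
    ])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]

def new_findings(findings: Iterable[dict], baseline: Set[str]) -> List[dict]:
    """Drop findings already in the accepted baseline and in-run duplicates."""
    seen: Set[str] = set()
    fresh: List[dict] = []
    for finding in findings:
        fp = fingerprint(finding)
        if fp in baseline or fp in seen:
            continue
        seen.add(fp)
        fresh.append(finding)
    return fresh
```

The baseline set is where accepted risks are tracked explicitly: store it in version control so that adding an entry goes through review like any other change.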
Practical implications for engineering teams and CTOs
For developers: shift left without shifting pain
Developers don’t want more process—they want faster feedback.
- Keep PR DAST short and scoped.
- Trigger DAST intelligently (only when surface changes).
- Provide a clear “what changed” API diff in the PR.
This turns security into a tight feedback loop rather than a release-blocking surprise.
For platform/DevOps teams: treat discovery as a first-class artifact
Make “API inventory + diff” a build artifact alongside SBOMs and test reports. Over time, this becomes an operational map of your modernization journey:
- Which legacy modules still expose endpoints?
- Which services are growing in complexity?
- Where do AI agents interact with internal systems?
For CTOs: avoid compounding security debt during modernization
Modernization already carries risk (framework upgrades, architectural changes). If AI-generated code increases throughput without equivalent validation, you effectively finance speed with future incident cost.
Recent real-world reporting on botnets, malicious packages, and quickly exploited vulnerabilities underscores the practical stakes: attackers capitalize on exposure and inconsistency. The Snyk statistic—62% of LLM-generated code testing as insecure—isn’t a reason to ban AI; it’s a reason to ensure your pipeline catches the kinds of mistakes AI commonly introduces.
Conclusion: modernize your gates the same way you modernize your stack
AI will keep accelerating delivery. The question is whether your validation system evolves with it.
Retrofitting DAST plus agent/API discovery into CI/CD is a pragmatic upgrade strategy: you don’t need to rewrite your app or slow teams down, but you do need to start measuring what’s actually being exposed and exercised at runtime. As maintenance and modernization teams merge more AI-authored changes into legacy systems, these gates become the difference between sustainable velocity and security debt that lingers for years.
Next step: start with one service, stand up an ephemeral scan environment, generate an API diff per PR, and add a time-boxed PR DAST scan for changes that alter the exposed surface. Within a sprint or two, you’ll have meaningfully reduced your unknown exposure without giving up speed.