Zero-Day Vulnerabilities: How Penetration Testers Find What Scanners Miss

How experienced testers find unknown bugs in custom code — the business logic flaws, authorization gaps, and chained issues automated scanners cannot reason about.

Author
CyberGuards Security Research Team

What "zero-day" actually means in a pentest report

In the press, "zero-day" usually conjures images of memory-corruption exploits in browsers and operating systems. In a typical pentest report, "zero-day" means something narrower and more useful: a novel vulnerability in your custom code or configuration that no scanner could match against a signature, because no signature for it exists.

The distinction matters because the most damaging issues in modern SaaS environments are almost never the dramatic memory-corruption kind. They are the boring-sounding kind: authorization that does the wrong thing under a specific role combination, a business workflow that lets a user upgrade themselves to admin, an API endpoint that ignores its tenant boundary under a particular request shape.

Why scanners cannot find them

Scanners work by matching observed responses against signatures of known issues. They are good at "this version of this library has CVE-2024-X" and bad at "this endpoint, called by this role, in this tenant, returns data the user should not see". Three reasons:

  • No signature exists. The bug is unique to your code and your business model. No prior advisory describes it.
  • Reasoning is required. The scanner has no concept of "owner", "tenant", "fee", or "subscription tier". It cannot infer that a result is wrong unless it knows what right would look like.
  • The bug emerges from a chain. Two findings, neither serious alone, combine into a critical outcome. Scanners report the items, not the relationship.
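The gap between signature matching and reasoning can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the signature table, the advisory label, the `Response` shape, and the tenant names are assumptions made for illustration, not how any particular scanner works. The point is only that the ownership check needs an input (who is calling, and what belongs to them) that a signature matcher never has.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical signature database: (product, version) -> advisory label.
SIGNATURES = {("examplelib", "1.2.3"): "EXAMPLE-ADVISORY-0001"}


@dataclass
class Response:
    server_banner: str  # e.g. "examplelib/1.2.3"
    body: dict          # parsed JSON body


def signature_scan(resp: Response) -> List[str]:
    """Flag a response only if its banner matches a known signature."""
    product, _, version = resp.server_banner.partition("/")
    hit = SIGNATURES.get((product, version))
    return [hit] if hit else []


def ownership_check(resp: Response, caller_tenant: str) -> List[str]:
    """Flag records that belong to a tenant other than the caller's.

    This needs knowledge the signature scan lacks: who is calling,
    and what "belongs to them" means in this application.
    """
    return [
        f"record {r['id']} belongs to {r['tenant']}"
        for r in resp.body.get("records", [])
        if r["tenant"] != caller_tenant
    ]


# A fully patched server (no signature matches) that still leaks
# another tenant's record to the caller.
leaky = Response(
    server_banner="examplelib/9.9.9",
    body={"records": [{"id": 1, "tenant": "acme"},
                      {"id": 2, "tenant": "globex"}]},
)

print(signature_scan(leaky))            # nothing to report
print(ownership_check(leaky, "acme"))   # the cross-tenant record
```

The signature scan is not wrong here; it simply answers a different question from the one that matters.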

How testers find them — the unromantic version

The honest description of how a tester finds a zero-day in your application is that they read the application carefully, form hypotheses about how trust is enforced, and then test those hypotheses one by one. Three patterns dominate:

1. Trust-boundary hunting

Every application has trust boundaries: where unauthenticated input becomes authenticated, where one tenant's data is handed off to another tenant, where a low-privilege call invokes a high-privilege subsystem. The tester maps these boundaries from the application's behavior, not from documentation, and probes the inconsistencies.

Example pattern: a SaaS reporting feature lets you generate a PDF for "your" data. The tester finds that the PDF generation runs as a service account with broader permissions than the calling user, and that the request parameters can be tampered with to include other tenants' identifiers. The bug is not the PDF; it is the trust boundary between user-controlled input and the privileged worker.
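A minimal Python sketch of this pattern follows. It is an illustration under stated assumptions, not a real product's code: the in-memory `REPORT_ROWS` store, the `Caller` type, the `tenant_ids` parameter, and both handler names are hypothetical. The vulnerable handler passes user-controlled identifiers straight to the privileged worker; the checked handler validates them against the session before crossing the boundary.

```python
from dataclasses import dataclass

# Hypothetical in-memory data store, keyed by tenant ID.
REPORT_ROWS = {
    "tenant-a": ["a-invoice-1", "a-invoice-2"],
    "tenant-b": ["b-invoice-1"],
}


@dataclass(frozen=True)
class Caller:
    user_id: str
    tenant_id: str  # taken from the authenticated session, never the request


class TenantMismatch(Exception):
    pass


def worker_fetch_rows(tenant_ids):
    """Privileged worker: runs as a service account, can read every tenant."""
    return [row for t in tenant_ids for row in REPORT_ROWS.get(t, [])]


def export_pdf_vulnerable(caller, params):
    # Bug: tenant IDs come straight from user-controlled parameters.
    return worker_fetch_rows(params["tenant_ids"])


def export_pdf_checked(caller, params):
    # Fix: validate against the caller's session before the request
    # crosses into the privileged worker.
    requested = params.get("tenant_ids", [caller.tenant_id])
    if any(t != caller.tenant_id for t in requested):
        raise TenantMismatch(f"{caller.user_id} may not export {requested}")
    return worker_fetch_rows(requested)


alice = Caller(user_id="alice", tenant_id="tenant-a")
tampered = {"tenant_ids": ["tenant-a", "tenant-b"]}

print(export_pdf_vulnerable(alice, tampered))  # includes tenant-b's rows
try:
    export_pdf_checked(alice, tampered)
except TenantMismatch as exc:
    print("rejected:", exc)
```

The design choice worth noting is where the check lives: at the boundary itself, using identity from the session, so the worker's broad permissions never meet unvalidated input.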

2. Role-matrix differential

The tester walks every documented role through every protected resource. The interesting findings are the cells where behavior differs from what the role matrix says it should be — and the cells where the role matrix has nothing to say but the application has an opinion anyway.

Example pattern: a "viewer" role is documented as read-only. The tester finds that the viewer role can call an internal API that the SPA never invokes for viewers, and that the backend never re-checks the role on that path. The bug is not in the documented features; it is in the undocumented surface.
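The differential itself is simple to sketch once the observations exist. The Python below assumes the tester has already recorded, for each role and action, whether the backend accepted the call; the role names, endpoint strings, and the `EXPECTED` matrix are hypothetical. It reports two kinds of cells: those that contradict the documented matrix, and those the matrix never mentions but the application allows anyway.

```python
# Hypothetical documented matrix: (role, action) -> should the call succeed?
EXPECTED = {
    ("admin", "GET /reports"): True,
    ("admin", "POST /internal/reindex"): True,
    ("viewer", "GET /reports"): True,
    # No entry for ("viewer", "POST /internal/reindex"): undocumented.
}


def diff_role_matrix(expected, observed):
    """Compare documented permissions with observed behavior.

    `observed` maps (role, action) -> True if the backend accepted the call.
    Returns (role, action, reason) tuples that deserve a manual look.
    """
    findings = []
    for cell, allowed in sorted(observed.items()):
        if cell not in expected:
            # The matrix is silent, but the application has an opinion.
            if allowed:
                findings.append((*cell, "undocumented but allowed"))
        elif expected[cell] != allowed:
            reason = ("allowed but documented as denied" if allowed
                      else "denied but documented as allowed")
            findings.append((*cell, reason))
    return findings


# Observations the tester collected by replaying each call per role.
observed = {
    ("admin", "GET /reports"): True,
    ("admin", "POST /internal/reindex"): True,
    ("viewer", "GET /reports"): True,
    ("viewer", "POST /internal/reindex"): True,
}

for finding in diff_role_matrix(EXPECTED, observed):
    print(finding)
```

Collecting the observations is the hard, manual part; the comparison only makes the interesting cells impossible to overlook.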

3. Abuse-case construction

For each business workflow, the tester models how a determined adversary would abuse it. Can a coupon be applied multiple times? Can a refund be issued to a different account than the one that paid? Can free-trial accounts be created at scale? These are not the injection-style bugs most people associate with OWASP lists; they are business-logic flaws, and they cost real money in real environments.
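The coupon case can be sketched in a few lines of Python. The `Order` class, the `COUPONS` table, and the coupon code are hypothetical, and a real system would persist redemptions in a database rather than in memory. The naive function contains no bug a scanner would recognize; it is only wrong relative to the business rule that a code applies once per order.

```python
class CouponAlreadyUsed(Exception):
    pass


class Order:
    def __init__(self, order_id, total_cents):
        self.order_id = order_id
        self.total_cents = total_cents
        self.applied_coupons = set()


# Hypothetical coupon table: code -> discount in cents.
COUPONS = {"WELCOME10": 1000}


def apply_coupon_naive(order, code):
    # Abuse case: nothing stops the same code being applied repeatedly.
    order.total_cents = max(0, order.total_cents - COUPONS[code])


def apply_coupon_once(order, code):
    # Record each redemption and refuse a repeat on the same order.
    if code in order.applied_coupons:
        raise CouponAlreadyUsed(code)
    order.applied_coupons.add(code)
    order.total_cents = max(0, order.total_cents - COUPONS[code])


abused = Order("order-1", 5000)
for _ in range(5):
    apply_coupon_naive(abused, "WELCOME10")
print(abused.total_cents)  # the order has been discounted to nothing

guarded = Order("order-2", 5000)
apply_coupon_once(guarded, "WELCOME10")
try:
    apply_coupon_once(guarded, "WELCOME10")
except CouponAlreadyUsed:
    print("second redemption refused; total is", guarded.total_cents)
```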

The categories we see most in real engagements

  • Authorization without re-checking. The frontend hides an action; the backend forgets to enforce it. Or middleware enforces a check at one layer but a deeper service-to-service call bypasses it.
  • Tenant boundary holes. Direct object access, exports, shared links, webhooks, search indexes. Each is a separate surface and each needs separate enforcement.
  • State-machine skipping. A workflow that should go A → B → C lets you go A → C and ends up in a state the system cannot reason about. Refunds, KYC steps, account upgrades, and order fulfillment all show this regularly.
  • Race conditions on money paths. Two concurrent withdrawals, two concurrent coupon applications, two concurrent role changes — the kind of issue that does not appear in single-threaded testing.
  • Trust-on-first-use weaknesses. Account-recovery flows, partner-integration handshakes, OAuth registration. Once trust is established, it is rarely re-evaluated.
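State-machine skipping, from the list above, is the easiest of these categories to show in code. The Python sketch below assumes a hypothetical three-step refund workflow; the state names and the `ALLOWED` table are illustrative. The idea is that the server holds an explicit table of legal transitions and rejects anything else, rather than trusting the client to call the steps in order.

```python
from enum import Enum


class State(Enum):
    REQUESTED = "requested"
    APPROVED = "approved"
    PAID_OUT = "paid_out"


# Hypothetical refund workflow: the only legal path is A -> B -> C.
ALLOWED = {
    State.REQUESTED: {State.APPROVED},
    State.APPROVED: {State.PAID_OUT},
    State.PAID_OUT: set(),
}


class IllegalTransition(Exception):
    pass


def transition(current, target):
    """Return the new state, or raise if the move is not in the table."""
    if target not in ALLOWED[current]:
        raise IllegalTransition(f"{current.value} -> {target.value}")
    return target


state = transition(State.REQUESTED, State.APPROVED)  # legal: A -> B
try:
    transition(State.REQUESTED, State.PAID_OUT)      # illegal: A -> C
except IllegalTransition as exc:
    print("rejected:", exc)
```

A tester probing for this flaw does the reverse: calls the step-C endpoint directly from state A and checks whether the backend has any such table at all.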

Why this matters for the buyer

If you are evaluating pentest vendors, the question to ask is not "do they find vulnerabilities". Most vendors do, and most reports list a hundred low-severity items. The question to ask is whether the report contains findings that required reasoning about your specific application. That is the deliverable scanners cannot replace and that determines whether the engagement was worth the cost.

Practical filter: read a redacted sample report and ask whether a competent automated scanner would have found these issues. If the answer is yes for most findings, the engagement was a scanner with a human stamp. If the answer is no, you are looking at the right kind of work.

FAQ

Zero-day pentesting — common questions

What does "zero-day" mean in a pentest context?

In pentest reports, "zero-day" usually means a previously unknown vulnerability discovered in custom application code or configuration — not a known CVE. It is the class of bug that a scanner cannot find because no signature exists for it yet.

How are zero-days different from known CVEs?

Known CVEs have published advisories, patches, and signatures that scanners detect. Zero-days are novel — discovered through manual exploration of an application or environment that no other researcher has examined this way before.

What kinds of zero-days do pentesters actually find?

In practice, most "zero-day" findings in custom apps are authorization flaws, business-logic abuses, and multi-step chains. Memory-corruption zero-days exist too but are less common in modern SaaS targets.

Do you disclose zero-days to vendors?

When findings affect third-party software, we coordinate disclosure with the vendor under a responsible-disclosure timeline. Findings in your own custom code are reported only to you.

Can scanners ever find zero-days?

Some scanners use generic techniques such as taint analysis and fuzzing that occasionally surface novel issues, but the rate of true positives is low, and the tools still lack the business context to judge whether a result is wrong. The bulk of zero-day-class findings still require human exploration.
