VulnEval: A NIST Risk Model for Vulnerability Prioritization

A vulnerability prioritization model that separates impact and likelihood using CVSS subscores, EPSS, and KEV signals.

Nicholas Molina
March 10, 2026

CVSS Is Not a Risk Score

Modern vulnerability management is built around scanners. Scanners find CVEs, and practitioners immediately sort them by CVSS score. Dashboards follow the score. Remediation SLAs follow the score. The highest numbers rise to the top.

The problem is that CVSS measures technical severity, not risk. Even if you use CVSS by the book and factor in the perennially neglected Temporal and Environmental components, the model remains dominated by impact. But risk requires both impact and likelihood. When we prioritize remediation using CVSS, we are prioritizing on only half of the risk equation.

This observation is not obscure or novel. Many practitioners have pointed it out, including NIST in 2014: “CVSS scores should not be the sole factor when determining risk”.

FIRST, the organization that maintains CVSS, addressed the “likelihood gap” publicly in 2021 with EPSS (Exploit Prediction Scoring System), a model estimating the probability that a vulnerability will be exploited in the wild within the next 30 days. CISA published the Known Exploited Vulnerabilities (KEV) catalog that same year, which identifies vulnerabilities confirmed to be actively exploited in the wild.

Unfortunately, most tooling does not surface EPSS or KEV at all, let alone incorporate them into a risk calculation. The result is predictable: CVSS dominates, and high-impact vulnerabilities with a low real-world likelihood of exploitation are escalated aggressively, while lower-impact vulnerabilities that are actively exploited are ignored.

A Better Way

Recognizing that risk is a function of both impact and likelihood, not impact alone, I built a tool that attempts to measure CVE risk more explicitly. It combines CVSS, EPSS, and CISA KEV by mapping them to the inputs of the NIST risk framework: Impact and Overall Likelihood, where Overall Likelihood is itself derived from Likelihood of Adverse Effect and Likelihood of Occurrence.

Separating and Merging Signals

The CVSS score most vulnerability scanners provide is the CVSS Base score, and it is not a single signal. It is calculated from two underlying subscores: Impact and Exploitability. Impact reflects the potential damage to confidentiality, integrity, and availability if the vulnerability is successfully exploited. Exploitability reflects how technically feasible exploitation would be based on factors such as attack vector, attack complexity, required privileges, and user interaction.

From a NIST risk perspective, the CVSS Impact subscore maps directly to Impact, while Exploitability maps to Likelihood of Adverse Effect. The missing piece, Likelihood of Occurrence, is filled by EPSS.

  • Impact — CVSS Impact subscore
  • Technical Exploitability (Likelihood of Adverse Effect) — CVSS Exploitability subscore
  • Exploitation Probability (Likelihood of Occurrence) — EPSS

To make the signals operationally usable, they are normalized into qualitative bins aligned to NIST risk tables.

CVSS Impact Subscore → Impact

Impact Subscore           Label
< 1.179                   Very Low
≥ 1.179 and < 2.359       Low
≥ 2.359 and < 4.173       Moderate
≥ 4.173 and < 5.382       High
≥ 5.382                   Very High

CVSS Exploitability Subscore → Likelihood of Adverse Effect

Exploitability Subscore   Label
< 0.758                   Very Low
≥ 0.758 and < 1.516       Low
≥ 1.516 and < 2.682       Moderate
≥ 2.682 and < 3.459       High
≥ 3.459                   Very High
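
The two tables above can be applied with a simple threshold lookup. The following is a minimal sketch, not VulnEval's actual code: the names `LABELS`, `IMPACT_CUTS`, `EXPLOITABILITY_CUTS`, and `bin_subscore` are hypothetical, and only the threshold values come from the tables.

```python
from bisect import bisect_right

LABELS = ["Very Low", "Low", "Moderate", "High", "Very High"]

# Lower bounds of the Low, Moderate, High, and Very High bins (from the tables).
IMPACT_CUTS = [1.179, 2.359, 4.173, 5.382]
EXPLOITABILITY_CUTS = [0.758, 1.516, 2.682, 3.459]


def bin_subscore(value: float, cuts: list[float]) -> str:
    """Return the qualitative label for a CVSS subscore (hypothetical helper)."""
    # bisect_right counts how many cut points are <= value, so a value that
    # sits exactly on a boundary lands in the higher bin, matching the ">="
    # rows in the tables.
    return LABELS[bisect_right(cuts, value)]


print(bin_subscore(5.9, IMPACT_CUTS))          # Very High
print(bin_subscore(1.6, EXPLOITABILITY_CUTS))  # Moderate
```

Keeping the cut points in plain lists makes the bins easy to audit against the tables and easy to swap out if the thresholds change.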

The CVSS subscore bins are proportionally derived from FIRST’s CVSS Base severity bins and applied to the numeric ranges of the subscores. CVSS Critical severity maps to Very High, while the CVSS Low bin is subdivided to introduce a Very Low category for additional granularity.
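
One reading of "proportionally derived" that is consistent with the published thresholds is sketched below. It rests on assumptions the post does not state: the CVSS v3.1 formulas and metric weights, the upper edge of each Base severity band (3.9, 6.9, 8.9) taken as a fraction of 10, and the Very Low boundary set at half of the Low boundary. The helper `derive_cuts` is hypothetical.

```python
# ASSUMPTION: CVSS v3.1 Base severity bands end at 3.9 (Low), 6.9 (Medium),
# and 8.9 (High) on a 0-10 scale; each upper edge is taken as a fraction of
# 10, and Very Low is the lower half of the Low band.
FRACTIONS = [0.39 / 2, 0.39, 0.69, 0.89]

# Maximum Exploitability under CVSS v3.1:
# 8.22 x AV(Network) x AC(Low) x PR(None) x UI(None).
MAX_EXPLOITABILITY = 8.22 * 0.85 * 0.77 * 0.85 * 0.85

# Maximum Impact under CVSS v3.1: scope changed, C/I/A all High (0.56 each).
iss = 1 - (1 - 0.56) ** 3
MAX_IMPACT = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15


def derive_cuts(max_subscore: float) -> list[float]:
    """Scale the Base-score fractions to a subscore's range (hypothetical helper)."""
    return [round(f * max_subscore, 3) for f in FRACTIONS]


print(derive_cuts(MAX_EXPLOITABILITY))  # [0.758, 1.516, 2.682, 3.459]
print(derive_cuts(MAX_IMPACT))          # [1.179, 2.359, 4.173, 5.382]
```

Under these assumptions the output reproduces the thresholds in both subscore tables, but the tool's actual derivation may differ in detail.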

EPSS → Likelihood of Occurrence

EPSS                      Likelihood
≥ 0.50                    Very High
≥ 0.30 and < 0.50         High
≥ 0.10 and < 0.30         Moderate
≥ 0.03 and < 0.10         Low
< 0.03                    Very Low

The EPSS bin thresholds reflect the heavy-tailed distribution of exploitation probability, where most vulnerabilities cluster near zero. To preserve meaningful discrimination, the thresholds concentrate resolution in the upper ranges of the distribution. In practical terms, this maps a CVE to Very High likelihood when EPSS estimates the probability of exploitation in the next 30 days at 50% or greater, with lower bins scaling downward from there.
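
As a minimal sketch, the EPSS table translates into a top-down threshold check. The function name `epss_to_likelihood` is hypothetical; the thresholds are the ones listed in the table, and the input is assumed to be the EPSS probability on a 0 to 1 scale rather than the EPSS percentile.

```python
def epss_to_likelihood(epss: float) -> str:
    """Map an EPSS probability (0.0-1.0) to a Likelihood of Occurrence label."""
    if not 0.0 <= epss <= 1.0:
        raise ValueError(f"EPSS must be a probability in [0, 1], got {epss}")
    # Thresholds are checked from highest to lowest, mirroring the table.
    thresholds = [
        (0.50, "Very High"),
        (0.30, "High"),
        (0.10, "Moderate"),
        (0.03, "Low"),
    ]
    for floor, label in thresholds:
        if epss >= floor:
            return label
    return "Very Low"


print(epss_to_likelihood(0.62))   # Very High
print(epss_to_likelihood(0.004))  # Very Low
```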

Deriving Overall Likelihood and CISA KEV

Overall Likelihood has two inputs:

  • Probability that exploitation will occur in the next 30 days (EPSS), which serves as a proxy for likelihood of occurrence.
  • How technically feasible exploitation would be (CVSS Exploitability), which serves as a proxy for likelihood of adverse effect.

These are combined using a NIST-style matrix to produce an Overall Likelihood rating.

CISA KEV is used as an override. If a CVE appears in CISA’s Known Exploited Vulnerabilities catalog, its Overall Likelihood is set to Very High. The reasoning is that if exploitation is confirmed in the wild, that signal should take precedence over theoretical modeling.
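
The combination step and the KEV override can be sketched as a table lookup. The post does not publish VulnEval's matrix, so the values below are an assumption: they are transcribed from the overall-likelihood scale in NIST SP 800-30 Rev. 1 (Table G-5) and should be checked against the standard before use. The names `LEVELS`, `OVERALL_LIKELIHOOD`, and `overall_likelihood` are hypothetical.

```python
LEVELS = ["Very Low", "Low", "Moderate", "High", "Very High"]

# ASSUMPTION: values follow NIST SP 800-30 Rev. 1, Table G-5, transcribed for
# illustration; VulnEval's own matrix may differ.
# Rows: Likelihood of Occurrence (EPSS bin), Very Low -> Very High.
# Columns: Likelihood of Adverse Effect (Exploitability bin), Very Low -> Very High.
# Cell values are indices into LEVELS.
OVERALL_LIKELIHOOD = [
    [0, 0, 1, 1, 1],
    [0, 1, 1, 2, 2],
    [1, 1, 2, 2, 3],
    [1, 2, 2, 3, 4],
    [1, 2, 3, 4, 4],
]


def overall_likelihood(occurrence: str, adverse_effect: str, in_kev: bool = False) -> str:
    """Combine the two likelihood labels; a KEV listing overrides the matrix."""
    if in_kev:
        return "Very High"
    row = LEVELS.index(occurrence)
    col = LEVELS.index(adverse_effect)
    return LEVELS[OVERALL_LIKELIHOOD[row][col]]


print(overall_likelihood("Very Low", "Very High"))               # Low
print(overall_likelihood("Very Low", "Very High", in_kev=True))  # Very High
```

The two calls show the override at work: a trivially exploitable CVE with a near-zero EPSS rates Low from the matrix alone, but Very High once it is listed in KEV.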

Calculating the Risk Result

Final technical risk is determined by mapping Impact against Overall Likelihood in a NIST risk table. While the model produces a single qualitative result, each component is presented transparently and the transformation logic is explicit rather than hidden inside an opaque score.
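
The final lookup has the same shape. Again the post does not publish the exact table, so the sketch below assumes the level-of-risk scale from NIST SP 800-30 Rev. 1 (Table I-2), transcribed for illustration and worth verifying against the standard. `RISK` and `risk_level` are hypothetical names.

```python
LEVELS = ["Very Low", "Low", "Moderate", "High", "Very High"]

# ASSUMPTION: values follow NIST SP 800-30 Rev. 1, Table I-2, transcribed for
# illustration; VulnEval's own table may differ.
# Rows: Overall Likelihood, Very Low -> Very High.
# Columns: Impact, Very Low -> Very High.
# Cell values are indices into LEVELS.
RISK = [
    [0, 0, 0, 1, 1],
    [0, 1, 1, 1, 2],
    [0, 1, 2, 2, 3],
    [0, 1, 2, 3, 4],
    [0, 1, 2, 3, 4],
]


def risk_level(likelihood: str, impact: str) -> str:
    """Look up the qualitative risk for an Overall Likelihood and Impact pair."""
    return LEVELS[RISK[LEVELS.index(likelihood)][LEVELS.index(impact)]]


print(risk_level("Low", "Very High"))       # Moderate
print(risk_level("Very High", "Moderate"))  # Moderate
```

Under this table, a Very High impact CVE that is unlikely to be exploited and a Moderate impact CVE that is very likely to be exploited both come out as Moderate, whereas a sort by CVSS would rank the first far above the second.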

What VulnEval Is — and What It Is Not

  • The VulnEval model evaluates technical vulnerability risk signals. It does not account for asset criticality, exposure context, or compensating controls. Those factors are unique to each organization and should be considered when prioritizing.

  • Binning reduces numerical precision in exchange for interpretability. That tradeoff is deliberate.

  • “All models are wrong, some models are useful.” — George Box. This model is useful in that it produces more defensible prioritization decisions than sorting by CVSS alone.

Practitioners reach for CVSS because it is there. VulnEval won’t replace organizational judgment. It stops CVSS from substituting for it.
