The EPSS Model

Motivation

Vulnerability remediation rests on two fundamental truths. First, there are too many vulnerabilities to fix them all immediately: past research has shown that firms are able to fix between 5% and 20% of known vulnerabilities per month.[1] Second, only a small subset (2%-5%) of published vulnerabilities is ever seen to be exploited in the wild.[2] Together, these truths establish both the need and the justification for good prioritization techniques, since firms cannot, and do not need to, fix everything immediately.

Figure: Vulnerability remediation

So what is the optimal prioritization strategy when remediating vulnerabilities? Unfortunately, there is no single answer to that question. Instead, the answer lies in a collection of metrics that help inform and guide prioritization decisions, which is where this research comes in.

The Exploit Prediction Scoring System (EPSS) is a community-driven effort to combine descriptive information about vulnerabilities (CVEs) with evidence of actual exploitation in the wild. By collecting and analyzing these data, EPSS seeks to improve vulnerability prioritization by estimating the likelihood that a vulnerability will be exploited. The EPSS model produces a probability score between 0 and 1 (0% and 100%). The higher the score, the greater the probability that a vulnerability will be exploited.

Data Architecture and Sources

EPSS was developed in the summer of 2019 and first presented at BlackHat that same year. Since then, the SIG has been working hard to build a scalable computing infrastructure to ingest and process an increasing number of data sources. Through community partnerships and the work of EPSS SIG members, EPSS is currently collecting the following kinds of data:

  1. Information that describes a vulnerability (descriptions, products, vendors, etc.)
  2. Information about the vulnerability in the wild (prevalence, complexity, severity)
  3. Information about community reactions (social chatter, depth of discussions, exploits and tool publication)
  4. Ground-truth of exploitation in the wild (exploits in malware, IDS/IPS alerts, honeypot activity)

Model Results

EPSS scores are produced by fitting a logistic regression model, which outputs probabilities between 0 and 1 (0% and 100%). Details about the full model development are available in our research papers (see links below). In the end, the 16 variables shown below were found to be the most significant predictors of exploitation. Notice how some relate to the software vendor and are positively correlated with exploitation (e.g. Microsoft, IBM), while others relate to keywords found within the CVE record (e.g. code execution, denial of service), and still others refer to whether exploit code is publicly available (e.g. on ExploitDB).

An individual score is easily computed using the equation shown above. In addition, a web calculator is available at https://www.kennaresearch.com/tools/epss-calculator/.
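To make the scoring step concrete, here is a minimal Python sketch of how a logistic regression turns a set of binary variables into a probability. The intercept, the feature names, and the weights are all hypothetical placeholders chosen for illustration; they are not the published EPSS coefficients, and only three of the 16 variables are mimicked.

```python
import math

# Hypothetical intercept and weights, NOT the published EPSS coefficients.
INTERCEPT = -6.0
WEIGHTS = {
    "vendor_microsoft": 2.0,        # hypothetical vendor variable
    "keyword_code_execution": 1.5,  # hypothetical CVE keyword variable
    "exploit_code_published": 3.0,  # hypothetical public-exploit variable
}

def logistic_score(features):
    """Map binary feature flags to a probability strictly between 0 and 1."""
    log_odds = INTERCEPT + sum(
        WEIGHTS[name] * float(value) for name, value in features.items()
    )
    # The logistic function squashes any real-valued log-odds into (0, 1).
    return 1.0 / (1.0 + math.exp(-log_odds))

no_signals = logistic_score({name: 0 for name in WEIGHTS})
all_signals = logistic_score({name: 1 for name in WEIGHTS})
print(f"no signals:  {no_signals:.4f}")
print(f"all signals: {all_signals:.4f}")
```

With these made-up weights, a CVE with none of the signals scores close to zero, while one with all three scores well above one half, which is the qualitative behavior the model description implies.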

Model Performance

Consider the sample of CVEs used in EPSS, and the overlap with all CVEs rated CVSS 7 and above.[3] Next, consider a remediation policy that seeks to patch all CVSS 7+ CVEs. To measure the quality of this approach, we need to consider the ground truth, which we do by tracking vulnerabilities we know have been exploited in the wild. This lets us put each vulnerability remediation decision into one of four categories (true positive, false positive, false negative, or true negative), as shown in the figure below.

Figure: Categories

As the figure above shows, the strategy to remediate based on CVSS 7+ produces many false positives and still leaves about half of the exploited vulnerabilities open and waiting to be remediated.

Using these four categories we can derive two more meaningful metrics, which we’ve termed efficiency and coverage (what information retrieval calls precision and recall, respectively). Efficiency considers how efficiently resources were spent by measuring the percent of remediated vulnerabilities that were exploited. Remediating mostly exploited vulnerabilities yields a high efficiency rating (resources were allocated efficiently), while remediating random, mostly non-exploited vulnerabilities yields a low efficiency rating. Efficiency is calculated as the number of exploited vulnerabilities prioritized (TP) divided by the total number of prioritized vulnerabilities (TP + FP). Coverage is the percent of exploited vulnerabilities that were remediated, and is calculated as the number of exploited vulnerabilities prioritized (TP) divided by the total number of exploited vulnerabilities (TP + FN). Low coverage indicates that not many of the exploited vulnerabilities were remediated with the given strategy.
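The two definitions translate directly into code. The sketch below is illustrative only: the function names are our own, and the counts are made up rather than taken from the EPSS data.

```python
def efficiency(tp, fp):
    """Share of remediated vulnerabilities that were exploited: TP / (TP + FP)."""
    remediated = tp + fp
    return tp / remediated if remediated else 0.0

def coverage(tp, fn):
    """Share of exploited vulnerabilities that were remediated: TP / (TP + FN)."""
    exploited = tp + fn
    return tp / exploited if exploited else 0.0

# Made-up counts for illustration, not EPSS results.
tp, fp, fn = 50, 450, 150
eff = efficiency(tp, fp)  # 50 of 500 remediated were exploited
cov = coverage(tp, fn)    # 50 of 200 exploited were remediated
print(f"efficiency={eff:.2f} coverage={cov:.2f}")
```

True negatives do not appear in either formula, which is why both metrics remain informative even though the vast majority of vulnerabilities are never exploited.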

Measuring and understanding both efficiency and coverage allows different firms to approach vulnerability remediation differently based on their risk tolerance and resource constraints. Firms that do not have many resources may wish to emphasize efficiency over coverage, attempting to get the best impact from the limited resources available. For firms where resources are more plentiful and security is critical for success, the emphasis can be on getting high coverage of the highest-risk vulnerabilities. Many companies will probably strike a balance between the two.

For example, before EPSS, resource-constrained firms might opt to remediate only vulnerabilities rated CVSS 9 and above, while firms with more resources might focus on remediating everything rated CVSS 4 and above (or some other lower threshold).

Figure: Precision and recall

We can measure the performance of EPSS and use CVSS version 3.0 (base score) as a reference point. Both standards produce values in a range (EPSS produces a score between 0 and 1 while CVSS produces a score between 0 and 10), so depending on where a cut-off is established, the coverage and efficiency will vary.

The plot shown here marks various CVSS cut-off points as red points, while the blue line represents the sliding cut-off points available to EPSS. The highest cut-off points for both start on the left, and the points and line move to the right as the cut-off point is lowered.
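A curve like the one described can be traced by sweeping a cut-off over the scores and recomputing coverage and efficiency at each step. The following is a minimal sketch under stated assumptions: the `sweep` helper is hypothetical, and the scores and exploitation labels are toy data, not EPSS or CVSS results.

```python
def sweep(scores, exploited, cutoffs):
    """For each cutoff, remediate everything scoring >= cutoff and
    return a list of (cutoff, coverage, efficiency) tuples."""
    total_exploited = sum(exploited)
    points = []
    for cutoff in cutoffs:
        # Labels (1 = exploited, 0 = not) of the vulnerabilities we would remediate.
        flagged = [label for score, label in zip(scores, exploited) if score >= cutoff]
        tp = sum(flagged)
        cov = tp / total_exploited if total_exploited else 0.0
        eff = tp / len(flagged) if flagged else 0.0
        points.append((cutoff, cov, eff))
    return points

# Toy data for illustration only.
scores = [0.95, 0.80, 0.60, 0.40, 0.30, 0.10, 0.05, 0.02]
exploited = [1, 1, 0, 1, 0, 0, 1, 0]

curve = sweep(scores, exploited, cutoffs=[0.9, 0.5, 0.2, 0.0])
for cutoff, cov, eff in curve:
    print(f"cutoff={cutoff:.1f} coverage={cov:.2f} efficiency={eff:.2f}")
```

Coverage can only rise as the cut-off is lowered, because more vulnerabilities are remediated. On this toy data efficiency falls at the same time, since the additional remediations include more vulnerabilities that were never exploited.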

One way to read the plot is this: if we want to match the coverage of CVSS 9+ with EPSS (meaning the same number of exploited vulnerabilities remediated), we start by finding CVSS 9+ on the chart. It sits at around 25% coverage and 9% efficiency, meaning that 25% of exploited vulnerabilities are remediated, 9% of the vulnerabilities we remediated were actually exploited, and the other 91% of remediated vulnerabilities could safely have been delayed. If we go straight up from CVSS 9+ to the line representing EPSS, we see that at 25% coverage the efficiency of EPSS is about 37%. This indicates that a firm can reduce the same amount of exposure as with CVSS 9+, but do so much more efficiently. The following plot shows the difference visually. Notice how much less is remediated with EPSS, which translates into savings in resources.

Figure 4
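The size of that saving can be estimated from the two metrics alone, because the number of vulnerabilities remediated equals TP divided by efficiency, and TP equals coverage times the total number of exploited vulnerabilities. The sketch below plugs in the approximate values read off the plot (25% coverage, 9% versus about 37% efficiency). The total of 1,000 exploited vulnerabilities is a hypothetical figure that cancels out of the ratio.

```python
def remediated_count(total_exploited, coverage, efficiency):
    """Vulnerabilities remediated = TP / efficiency, with TP = coverage * total exploited."""
    return coverage * total_exploited / efficiency

TOTAL_EXPLOITED = 1000  # hypothetical; cancels out of the ratio below

cvss_effort = remediated_count(TOTAL_EXPLOITED, coverage=0.25, efficiency=0.09)
epss_effort = remediated_count(TOTAL_EXPLOITED, coverage=0.25, efficiency=0.37)
ratio = epss_effort / cvss_effort

print(f"CVSS 9+ remediates about {cvss_effort:.0f} vulnerabilities")
print(f"EPSS remediates about {epss_effort:.0f} for the same coverage")
print(f"EPSS effort relative to CVSS 9+: {ratio:.0%}")
```

Under these approximate readings, matching the coverage of CVSS 9+ with EPSS takes roughly a quarter of the remediation effort.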

Now let’s take it the other way with CVSS 9+: keep the same tolerance for inefficiency and look at the improvement in coverage. If we allow the efficiency of EPSS to drop to 9%, we get coverage of roughly 85%. So for about the same efficiency, we are able to close off many more of the vulnerabilities that are known to be exploited.

EPSS References

Peer-Reviewed Papers

Presentations


  1. Kenna Security and P2P reference
  2. Other references here
  3. As rated by the National Vulnerability Database