Challenge difficulty: Difference between revisions

From Project-GC
m Fixed the example, some numbers were swapped.
Refactored (and corrected) the math
 
Line 3: Line 3:


==How is it calculated?==
==How is it calculated?==
Basically the ''Challenge difficulty'' tells us what proportion of [[Geocacher]]s fulfill a challenge or not. The rating is an integer between 0 and 100, but should not be confused with percent. First off it's important to understand that the automatic challenge checker runs aren't automatically executed on the average [[Geocacher]], but more likely on the more hardcore [[Geocacher]]s. It's primarily executed on [[Geocacher]]s with more finds, regular users of [[Project-GC]] and also [[Paid membership|paid members]] of the site. With this in mind we know that the result isn't representing the average geocacher, on the other hand, challenges generally aren't targeting the average geocacher either.


The [[Challenge checker]] results are dividing into three categories:
Basically the ''Challenge difficulty'' tells us what proportion of [[Geocacher]]s fulfill a [[Challenge]] or not. The rating is an integer between 0 and 100, where higher means harder, but should not be confused with percent.
* Region - [[Geocacher]]s that are determined to live in the same '''region''' as the [[Challenge]] itself.
* Country - [[Geocacher]]s that are determined to live in the same '''country''' as the [[Challenge]] itself, but not in the same region.
* World - All others


For each category a percentage of how many fulfill the [[Challenge]] is calculated. Then the results for each group is weighted different in the calculations, basically a list is created and the percentages are added into that list. A fourth factor is also added into the mix, the number of finds per day.
First off it's important to understand that the automatic [[Challenge checker]] runs aren't automatically executed on the average [[Geocacher]], but more likely on the more hardcore [[Geocacher]]s. It's primarily executed on [[Geocacher]]s with more finds, regular users of [[Project-GC]] and also [[Paid membership|paid members]] of the site. With this in mind we know that the result isn't representing the average geocacher, on the other hand, challenges generally aren't targeting the average geocacher either.


The number of finds per day isn't used as the raw number itself, but rather as '(1-($numFinds/$daysOld))*100'.
For each [[Challenge]] we look at the [[Challenge checker]] runs across three nested groups of [[Geocacher]]s, based on where [[Project-GC]] believes each [[Geocacher]] lives (typically inferred from their finds - we don't actually know):
* '''Region''' - [[Geocacher]]s determined to live in the same region as the [[Challenge]].
* '''Country''' - [[Geocacher]]s determined to live in the same country as the [[Challenge]] (this also includes those counted under Region).
* '''World''' - all [[Geocacher]]s (this also includes those counted under Country and Region).


Then the difficulty is calculated as '$difficulty = (3*$worldPercent + 5*$countryPercent + 10*$regionPercent + $findsPerDay) / (3+5+10+1)'.
For each group we calculate the ''fail rate'' - the share of runs that did not fulfill the [[Challenge]]. A high regional fail rate means the [[Challenge]] is hard for locals.


It should also be noted that there usually are more calculated [[challenge checker]] results in the system from more local [[Geocacher]]s than from those living far away.
A fourth factor folds in how rarely the [[Challenge]] is actually found, expressed as '(1 - $numFinds / $daysOld) * 100'. It is 0 when the [[Challenge]] is found daily and approaches 100 when it is rarely found.
 
The four components are combined with weights - more local counts more:
 
  $difficulty = (3*$worldFailPercent + 5*$countryFailPercent + 10*$regionFailPercent + $findsPerDayFactor) / (3+5+10+1)
 
The result is rounded down to the closest integer.


===Example===
===Example===
* The Challenge is placed in Texas, United States.
 
* 80% of the [[Geocacher]]s in Texas fulfill it.
A [[Challenge]] in Texas, United States, was published 365 days ago and has had 10 finds since. The checker results show:
* 40% of the [[Geocacher]]s in the other states of the United States fulfill it.
* 20% of [[Geocacher]]s in Texas fulfill it - regional fail rate 80%.
* 30% of the [[Geocacher]]s in the rest of the world fulfill it.
* 60% of [[Geocacher]]s in the United States fulfill it (Texas geocachers included) - country fail rate 40%.
* The challenge is one year old (365 days) and has had 10 finds during this period.
* 70% of all [[Geocacher]]s with checker runs fulfill it (US geocachers included) - world fail rate 30%.
* Then to weigh these differently using coefficients a sum is created like this: 3*30+5*40+10*80+((1-(10/365))*100).
 
* This sum is then divided by 19 (3+5+10+1, the last being related to findsPerDay).
Plugged in:
* The ''Challenge difficulty'' ends up being 1187.26/19 = 62.487, which is rounded down to 62. The values are always rounded down to the closest integer.
* finds-per-day factor = (1 - 10/365) * 100 = 97.260
* difficulty = (3*30 + 5*40 + 10*80 + 97.260) / 19 = 1187.260 / 19 = 62.487
* Rounded down: '''62'''.


==Related statistics==
==Related statistics==

Latest revision as of 07:55, 19 May 2026

What is it?

For most Challenge checkers in the system, a Challenge difficulty is calculated. The difficulty is based on how many Geocachers fulfill the challenge and how many do not. Project-GC runs Challenge checkers in the background for tens of thousands of Geocachers as part of the Auto-Challenge-Checker System, and the results of these runs are used to calculate the Challenge difficulty.

How is it calculated?

In essence, the Challenge difficulty reflects what proportion of Geocachers do not fulfill a Challenge. The rating is an integer between 0 and 100, where higher means harder, but it should not be read as a percentage.

First, it is important to understand that the automatic Challenge checker runs are not executed on the average Geocacher, but mostly on the more hardcore Geocachers: primarily those with more finds, regular users of Project-GC, and paid members of the site. The result therefore does not represent the average geocacher. On the other hand, challenges generally aren't targeting the average geocacher either.

For each Challenge we look at the Challenge checker runs across three nested groups of Geocachers, based on where Project-GC believes each Geocacher lives (typically inferred from their finds - we don't actually know):

  • Region - Geocachers determined to live in the same region as the Challenge.
  • Country - Geocachers determined to live in the same country as the Challenge (this also includes those counted under Region).
  • World - all Geocachers (this also includes those counted under Country and Region).

For each group we calculate the fail rate - the share of runs that did not fulfill the Challenge. A high regional fail rate means the Challenge is hard for locals.
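
The nested grouping can be sketched in a few lines of Python. This is only an illustration, not Project-GC's actual code: the CheckerRun record, its field names and both helper functions are hypothetical, and the sketch assumes every run flagged as same-region is also flagged as same-country.

```python
from dataclasses import dataclass


@dataclass
class CheckerRun:
    """One checker result for one geocacher (hypothetical record layout)."""
    fulfilled: bool      # did this geocacher fulfill the challenge?
    same_country: bool   # believed to live in the challenge's country
    same_region: bool    # believed to live in the challenge's region


def fail_percent(runs):
    """Share of runs (0-100) that did not fulfill the challenge."""
    if not runs:
        raise ValueError("no checker runs in this group")
    failed = sum(1 for run in runs if not run.fulfilled)
    return 100.0 * failed / len(runs)


def group_fail_percents(runs):
    """Fail rates for the three nested groups: region, country, world."""
    region = [run for run in runs if run.same_region]
    country = [run for run in runs if run.same_country]  # region runs included
    return {
        "region": fail_percent(region),
        "country": fail_percent(country),
        "world": fail_percent(runs),  # every run, locals included
    }
```

Because the groups are nested, a run by a local geocacher contributes to all three fail rates, while a run by someone abroad only contributes to the world rate.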

A fourth factor folds in how rarely the Challenge is actually found, expressed as '(1 - $numFinds / $daysOld) * 100'. It is 0 when the Challenge is found daily and approaches 100 when it is rarely found.

The four components are combined with weights - more local counts more:

 $difficulty = (3*$worldFailPercent + 5*$countryFailPercent + 10*$regionFailPercent + $findsPerDayFactor) / (3+5+10+1)

The result is rounded down to the closest integer.
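
As a minimal sketch, the formula can be written as a small Python function. The function and parameter names are made up for illustration and this is not Project-GC's implementation. The text does not say what happens when a Challenge has more than one find per day, so the sketch applies the formula as written and does no clamping.

```python
import math


def finds_per_day_factor(num_finds, days_old):
    """(1 - numFinds / daysOld) * 100, as given in the text."""
    if days_old <= 0:
        raise ValueError("days_old must be positive")
    return (1 - num_finds / days_old) * 100


def challenge_difficulty(world_fail, country_fail, region_fail,
                         num_finds, days_old):
    """Weighted mean of the three fail rates (0-100) and the finds factor.

    Weights are 3 (world), 5 (country), 10 (region) and 1 (finds per day).
    The result is rounded down to an integer.
    """
    factor = finds_per_day_factor(num_finds, days_old)
    weighted_sum = (3 * world_fail + 5 * country_fail
                    + 10 * region_fail + factor)
    return math.floor(weighted_sum / (3 + 5 + 10 + 1))
```

For a Challenge with fail rates of 30% (world), 40% (country) and 80% (region) and 10 finds in 365 days, challenge_difficulty(30, 40, 80, 10, 365) returns 62.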

Example

A Challenge in Texas, United States, was published 365 days ago and has had 10 finds since. The checker results show:

  • 20% of Geocachers in Texas fulfill it - regional fail rate 80%.
  • 60% of Geocachers in the United States fulfill it (Texas geocachers included) - country fail rate 40%.
  • 70% of all Geocachers with checker runs fulfill it (US geocachers included) - world fail rate 30%.

Plugged in:

  • finds-per-day factor = (1 - 10/365) * 100 = 97.260
  • difficulty = (3*30 + 5*40 + 10*80 + 97.260) / 19 = 1187.260 / 19 = 62.487
  • Rounded down: 62.

Related statistics

Challenge difficulty per rating interval shows the correlation between the difficulty rating of the geocache and the challenge difficulty.

The challenge tab in the profile stats has a module for challenge difficulties.