Re: Checker for highest/lowest caches of countries/regions/county

[Resolved] Checker for highest/lowest caches of countries/regions/county
August 17, 2023 08:54PM
Requesting a checker script that would enable checking all of the following:
- # of countries with the country's highest elevation caches found (e.g. 3 - 6 - 12)
- # of countries with the country's lowest elevation caches found (e.g. 3 - 6 - 12)
- # of regions with the region's highest elevation caches found (e.g. 5 - 10 - 20) (optionally, only from given country / countries)
- # of regions with the region's lowest elevation caches found (e.g. 5 - 10 - 20) (optionally, only from given country / countries)
- # of counties with the county's highest elevation caches found (e.g. 10 - 20 - 50) (optionally, only from given country / countries ++ optionally only from given county+region/regions)
- # of counties with the county's lowest elevation caches found (e.g. 10 - 20 - 50) (optionally, only from given country / countries ++ optionally only from given county+region/regions)

NB: I came across the script "Generic Highest in N Regions" by sloth96 (https://project-gc.com/Tools/Challenges?edit&scriptId=7785&addTag), which I thought I could fork and modify to do the above (my forked and edited version: https://project-gc.com/Tools/Challenges?edit&tagId=80190). I got it to the point where regions or counties, and highest or lowest, can be set by parameter.

However, even after fixing a few bugs (empty/missing regions/counties), I still saw some errors I couldn't track down. Worse, I ran into performance issues (several timeouts) when testing the script with rather simple parameters (one country only). I'm therefore kindly asking whether anyone would be interested in rewriting the whole code, potentially with a better algorithm that avoids the consistent timeouts.

Note: the conditions for the challenges mentioned above have been checked with reviewers and are allowed, and I have GC codes reserved for them.



Edited 2 time(s). Last edit at 08/17/2023 08:56PM by sm07.
Re: Checker for highest/lowest caches of countries/regions/county
August 18, 2023 06:42AM
I am not surprised that you are encountering timeouts. Each call to GetHighestCaches is a database query; an expensive one at that. For example, for "highest caches in N countries", the script is:

(1) Finding all countries the user has found at least one cache in; it's not worth checking the others.
(2) For each one of those countries, querying the database for the highest caches in that country.

The database likely has various indexes on countries and elevations, which allows it to be smart about how it queries for data. It sounds like Project-GC's programming is smart too, since the documentation for GetHighestCaches suggests that it is quickselecting for the `limit`th-highest elevation, and then doing a query for everything higher than that. But even still, it is slow: just now I used Project-GC's Top elevation caches to find the highest elevation caches in various countries and sometimes waited up to 5 or 10 seconds for the query to return -- and that is as a paying member receiving priority in the queue.

The speed will also depend on the number of countries/regions/counties involved in the test: for example, even though you have only found 1 geocache in Albania, the script will query for the highest cache in Albania just in case the one you have found happens to be the highest.

Nonetheless, I will look at your script to see if there are any places where the script itself may be the culprit.
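
To make the cost concrete, here is a minimal Lua sketch of the two steps above. `queryHighest` is a hypothetical stand-in for the real GetHighestCaches call (its actual signature is not shown in this thread), and the find records are made-up sample data. The only point is that the number of expensive queries equals the number of distinct countries with finds.

```lua
-- Hypothetical stand-in for the expensive database call; it only counts calls.
local queryCount = 0
local function queryHighest(country, limit)
  queryCount = queryCount + 1
  return {}  -- a real call would return the top `limit` caches
end

-- Step 1: collect the distinct countries the user has finds in.
local function countriesWithFinds(finds)
  local seen, list = {}, {}
  for _, find in ipairs(finds) do
    if find.country and not seen[find.country] then
      seen[find.country] = true
      list[#list + 1] = find.country
    end
  end
  return list
end

-- Step 2: one query per country, even when it holds a single find.
local function checkAll(finds, limit)
  local results = {}
  for _, country in ipairs(countriesWithFinds(finds)) do
    results[country] = queryHighest(country, limit)
  end
  return results
end

-- Made-up sample finds (the GC codes are placeholders).
local finds = {
  { gccode = "GC_A", country = "Finland" },
  { gccode = "GC_B", country = "Finland" },
  { gccode = "GC_C", country = "Albania" },
}
checkAll(finds, 10)
print("queries issued:", queryCount)
```

So a single find in Albania costs as much query time as fifty finds in Finland.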
Re: Checker for highest/lowest caches of countries/regions/county
August 18, 2023 06:58AM
(1) The best advice will come from sloth96, who wrote the original script. I can roughly follow what it is doing but it is not really how I would write it.

(2) That said, I can tell that the script is being smarter than I thought. It is combining regions together when it can, and trying to do combined queries. Very nice.

However, I think "cachesbuffer = 2000" is excessive. It doesn't change with the size of the queried region, either. This means that even if you are only querying for caches in XXX county, it still asks the database for the top 2000 elevation caches in that county. That is probably nearly all the caches in that county. I would suggest 500 for countries, 200 for regions, and 50 for counties.

(3) It may be worth wrapping each call to GetHighestCaches like this:

-- "end" is a reserved word in Lua, so the timestamps need other names
local startTime = os.time()  -- whole seconds
GetHighestCaches(...)
local endTime = os.time()
PGC.print("Total query time:", endTime - startTime)

Getting the total database query time may help narrow down problems. Of course, this will fluctuate depending on the time of day, mainly on whether Europe/America is awake or not. But it may help determine if my hypothesis in the previous post is correct, or if it is just up to the script to be faster. When I wrote my script "Find oldest/most favourited cache matching N of M filter groups" I did this. Several of the tags query for the oldest cache in 49 separate counties, and it takes about half a second per county.

(4) In the function GetTop you set "param['filter']['excludeArchived'] = true". I would not recommend this, since it means that if a top cache gets archived then all those who found it will no longer qualify. For the similar "oldest" and "top favourited" challenges, Reviewers generally require that any finds on archived caches that are better than the current active best (i.e. older, or more favourites) must be accepted for qualification.

(5) I am out of scriptwriting time at the moment, but may take this up sometime later if no other scriptwriters do.
Re: Checker for highest/lowest caches of countries/regions/county
August 19, 2023 04:57PM
Hi,
thanks for comments and your brief analysis.

> However, I think "cachesbuffer = 2000" is excessive. It doesn't change with the size of the queried region, either. This means that even if you are only querying for caches in XXX county, it still asks the database for the top 2000 elevation caches in that county. That is probably nearly all the caches in that county. I would suggest 500 for countries, 200 for regions, and 50 for counties.

Changed cachesbuffer to 500 for the time being. Will need to check for splitting it as per type later.
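
Splitting it per type could look roughly like the sketch below. The numbers are the ones suggested above; the table name, the function name, and the area-type strings are hypothetical and not taken from the actual script.

```lua
-- Buffer sizes suggested earlier in the thread, keyed by area type.
local BUFFER_BY_TYPE = { country = 500, region = 200, county = 50 }

-- Return the buffer size for an area type.
-- Unknown types fall back to the smallest buffer (an arbitrary choice).
local function cachesBufferFor(areaType)
  return BUFFER_BY_TYPE[areaType] or 50
end

print(cachesBufferFor("region"))
```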

>(3) It may be worth wrapping each call to GetHighestCaches with a:
Did this as well. In a few tests, all the calls seem to run within <=1 (milli?)seconds, which is surprising.

>(4) In the function GetTop you set "param['filter']['excludeArchived'] = true". I would not recommend this, since it means that if a top cache gets archived then all those who found it will no longer qualify.
Agree. However, I'm not sure whether this is some performance optimization by the original script writer: in all the tags I've set (and intend to use) excludeArchived = false, and based on my current understanding the script does take care of checking archived finds later on despite this line.

>(5) I am out of scriptwriting time at the moment, but may take this up sometime later if no other scriptwriters do.
Sure, thanks.
Re: Checker for highest/lowest caches of countries/regions/county
August 24, 2023 07:34PM
(example: ) https://project-gc.com/Challenges//80440

I made a mod to my multi-test checker for a similar checker recently, and it looks like this can be handled by that change, by just adding a function for highest/lowest elevated caches, which I modelled after my oldest/favorite function. This checker checks you've found 5/10 highest elevated caches in at least 2 countries (you pass sm07!). You could customize it to regions or counties, instead of countries easily, as well as change the numbers considered/required.

It calls GetHighestCaches/GetLowestCaches with each "thing" you have a find in. It looks like the database queries for a small number of results (eg. 10 in this case), and a smallish number of countries (I tried somebody with 81 different countries), took 20s, so it's unlikely somebody out there would timeout. Of course, if you tried to do 5/10 in N counties (without limiting it by country/region), you'd get lots of timeouts, because the number would be astronomical for some people. It could potentially be optimized to sort by number of finds before doing the queries, knowing that if you don't have at least the minimum number of finds in an area, then there's no point in doing the query (although, you wouldn't get the highest/lowest caches displayed in the output).
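
A minimal sketch of that pre-filter, assuming each find is a table with the area name stored under a field such as `country` (the field names and the function name are hypothetical). Finds with no value for the area are skipped rather than counted.

```lua
-- Count finds per area, then keep only the areas that could still qualify.
local function areasWorthQuerying(finds, key, minRequired)
  local counts = {}
  for _, find in ipairs(finds) do
    local area = find[key]
    if area then  -- skip finds with no region/county/country set
      counts[area] = (counts[area] or 0) + 1
    end
  end
  local keep = {}
  for area, n in pairs(counts) do
    if n >= minRequired then
      keep[#keep + 1] = area
    end
  end
  table.sort(keep)  -- deterministic order
  return keep
end

-- Made-up sample: five finds in "A", two in "B", one with no country.
local sample = {
  { country = "A" }, { country = "A" }, { country = "A" },
  { country = "A" }, { country = "A" },
  { country = "B" }, { country = "B" },
  { gccode = "GC_NOAREA" },
}
local toQuery = areasWorthQuerying(sample, "country", 5)
print(#toQuery, toQuery[1])
```

With a requirement of 5 finds per area, only "A" would be queried; "B" is dropped before any database call is made.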
Re: Checker for highest/lowest caches of countries/regions/county
August 25, 2023 04:02PM
Hi,
thank you very much for this - I'll give it a look and some tests and get back if necessary
Re: Checker for highest/lowest caches of countries/regions/county
August 25, 2023 04:14PM
As immediate feedback, there seems to be something wrong with the processing of the top_type parameter value.

https://project-gc.com/Challenges//80453 where I tested for "lowest" I still get the list of highest caches.
Re: Checker for highest/lowest caches of countries/regions/county
August 25, 2023 07:27PM
Your tag is disabled, so I can't see it - but I had an error in the tag I posted (which didn't result in anything changing... but)... "top_type", was renamed to just "type". If in the tag I had put "type":"lowest", then it shows the lowest caches. It defaults to "highest", so it didn't match "top_type" to anything, and then just continues to use the highest.
Re: Checker for highest/lowest caches of countries/regions/county
August 29, 2023 05:38PM
Oh, sorry - it was disabled indeed. OK - I changed the "top_type" parameter to "type" and now I get the lowest caches.

However, when I tried to run that for the lowest cache in X counties (not countries) - see tag (now enabled)
https://project-gc.com/Challenges//80562
I get an exception
[string ""]:7495: table index is nil

Would you happen to have any clue what is causing this?

EDIT: perhaps the script doesn't like the 1/1 setting, as a similar tag for X countries - see tag
https://project-gc.com/Challenges//80560
also fails but in another stage (after completing execution, while rendering final html output)



Edited 1 time(s). Last edit at 08/29/2023 05:42PM by sm07.
Re: Checker for highest/lowest caches of countries/regions/county
August 30, 2023 12:24PM
Hi,

"I'll have to take a look - this is why we need a debugger in the production environment :) "
thanks already in advance - and I agree - sort of ;)

"I can say that it's unlikely you'll ever get a lowest/highest *counties* in all of Europe to fly - not sure how many counties in Europe there are, but it's probably really high, and if you've found a cache in a reasonable number of them, it's going to be too many queries. "

That's not exactly what I had in mind; rather, I want a "reasonable" total count from all of them (regardless of country). The idea is somewhat similar to my already published caches
https://www.geocaching.com/geocache/GCA0BDC_alueen-vanhin-regions-oldest-gold-challenge
or
https://www.geocaching.com/geocache/GCA0BCV_europe-regions-traveller-silver-challenge
and to be more specific, I'm actually more interested in regions than counties, but I might end up doing the county challenge as well - that's why I'm testing that as well.

However, testing with regions also ends up in some (similar?) error
https://project-gc.com/Challenges//80561
but this time rendering nothing into its output.

My initial guess, based on my earlier testing (and forking) of that other script, is that there are caches within Europe without a region/county value set, and perhaps this script (like the other one did) cannot process caches with a "null" region/county and crashes instead.
Re: Checker for highest/lowest caches of countries/regions/county
August 30, 2023 03:12PM
When I run: https://project-gc.com/Challenges//80561 for myself (having found not that many regions in Europe), it runs almost instantly... however, I get strange results. For Schleswig-Holstein (Germany), it shows 370 caches in the results, however, there is a cache at -23m and the next is at -22m, and then a whole bunch at -1m and 0m. I'd expect it to only show the lowest one (if there is an elevation "tie", it will show all the ones that are at the lowest/highest). So, it may also be that there are some regions which are returning a ridiculous number of results for GetLowest/Highest, and thus timing out... not sure why that would happen though.

Again, needs some debug time - bear with me :)
Re: Checker for highest/lowest caches of countries/regions/county
September 02, 2023 06:41PM
No hurries. Here is also one more strange thing:

if I run
https://project-gc.com/Challenges//80689
(1 lowest of 10) for countries, I get only two US caches displayed

United States Fail: You found 0 out of a required 1 of the lowest elevation caches 10 cache(s).
# GCCODE elevation Found Date Name
1 GC3895 -4380 N/A HI EPE
2 GCABX9B -276 N/A Resurrection Bay Sea Caves

If I reduce that to 1 of 9
https://project-gc.com/Challenges//80666
I get only 1 US cache listed
United States Fail: You found 0 out of a required 1 of the lowest elevation caches 9 cache(s).
# GCCODE elevation Found Date Name
1 GC3895 -4380 N/A HI EPE

If I reduce that to 1 of <9 (anything between 1 of 1 to 1 of 8)
I get "error in HTML output"
Re: Checker for highest/lowest caches of countries/regions/county
September 25, 2023 06:52PM
Nowadays when testing this

>When I run: https://project-gc.com/Challenges//80561 for myself (having found not that many regions in Europe), it runs almost instantly... however, I get strange results. For Schleswig-Holstein (Germany), it shows 370 caches in the results, however, there is a cache at -23m and the next is at -22m, and then a whole bunch at -1m and 0m. I'd expect it to only show the lowest one (if there is an elevation "tie", it will show all the ones that are at the lowest/highest). So, it may also be that there are some regions which are returning a ridiculous number of results for GetLowest/Highest, and thus timing out... not sure why that would happen though.

>Again, needs some debug time - bear with me :)

I always get this:

[string ""]:7950: attempt to index field '?' (a nil value)
Re: Checker for highest/lowest caches of countries/regions/county
September 26, 2023 08:49PM
Okay - I've fixed that issue. I've also tried to filter out the bogus results, where it returns way too many highest/lowest for a particular area, by filtering it after the query happens.

https://project-gc.com/Challenges//80561 - for you now times out. It looks like it's trying to do the lowest cache in every region in Europe. It looks like it times out for you at about 180 regions, which, is always going to be too many queries. You might get some better execution, if we cut off the execution once you've passed, however, there's no guarantee out there that there isn't a user that has found hundreds of regions in Europe, and not found the lowest in any of them, and then the checker would fail. While doing an all-of-Europe check would be fun, I don't think it's going to ever have enough execution time.

FWIW, for me, the checker lists one cache per region (I only have 12 regions in Europe, so it doesn't time out). Unless there's an elevation tie, in which case there's two - showing the 'too-many-listed' issue is fixed.
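
For illustration, the post-query filtering could be sketched as below: keep the `limit` lowest caches, plus any that tie with the last one kept. This is only a guess at the approach, not the actual code; the function name and the sample GC codes are made up, while the elevations mirror the Schleswig-Holstein example from earlier in the thread.

```lua
-- Keep the `limit` lowest caches, plus any that tie with the last one kept.
-- Note: table.sort reorders the input list in place.
local function trimToLimitWithTies(caches, limit)
  if limit < 1 then return {} end
  table.sort(caches, function(a, b) return a.elevation < b.elevation end)
  local kept = {}
  for i, cache in ipairs(caches) do
    if i <= limit or cache.elevation == kept[#kept].elevation then
      kept[#kept + 1] = cache
    else
      break  -- sorted input: nothing further can tie
    end
  end
  return kept
end

-- Placeholder codes; elevations as reported for Schleswig-Holstein.
local returned = {
  { gccode = "GC_1", elevation = -1 },
  { gccode = "GC_2", elevation = -23 },
  { gccode = "GC_3", elevation = 0 },
  { gccode = "GC_4", elevation = -22 },
  { gccode = "GC_5", elevation = -1 },
}
local lowest = trimToLimitWithTies(returned, 1)
print(#lowest, lowest[1].gccode)
```

For the lowest-1 case this leaves only the cache at -23m; a second cache would be listed only if it were also at -23m.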
Re: Checker for highest/lowest caches of countries/regions/county
August 30, 2023 11:17AM
I'll have to take a look - this is why we need a debugger in the production environment :) It may be that you need to have the filter set on the "same" function *and* the function that it calls.

I can say that it's unlikely you'll ever get a lowest/highest *counties* in all of Europe to fly - not sure how many counties in Europe there are, but it's probably really high, and if you've found a cache in a reasonable number of them, it's going to be too many queries.

https://project-gc.com/Challenges//80441 - this checker is for the lowest in each county in Ontario, Canada (where I'm from). I've found a cache in 30-ish of the 49 (or maybe it's 50 now) counties, the script takes about 12 seconds to run. Rounding up, that's a half a second per county, meaning you've only got about 120 queries before you time out. So, theoretically countries in Europe should be feasible.
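
The arithmetic behind that estimate, as a small sketch. The 60-second limit is an assumption inferred from "about 120 queries" at half a second each; it is not stated anywhere in the thread.

```lua
local observedSeconds = 12   -- run time reported above
local areasQueried = 30      -- "30-ish" counties with finds

local perQuery = observedSeconds / areasQueried  -- 0.4 s per county
local roundedPerQuery = 0.5                      -- rounded up, as in the post

-- Assumption: a 60 s execution limit (120 queries * 0.5 s each).
local timeLimit = 60
local maxQueries = math.floor(timeLimit / roundedPerQuery)

print(perQuery, maxQueries)
```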
Re: Checker for highest/lowest caches of countries/regions/county
September 02, 2023 12:03PM
OK, let's then focus on getting the region query working and skip the counties for now.
Re: Checker for highest/lowest caches of countries/regions/county
February 06, 2024 06:35PM
> not sure how many counties in Europe there are, but it's probably really high,

I don't know how many counties there are in all of Europe, but there's 2151 just in Switzerland, so...
Re: Checker for highest/lowest caches of countries/regions/county
February 09, 2024 05:31AM
(moved reply to main thread)



Edited 2 time(s). Last edit at 02/09/2024 05:36AM by sm07.
Re: Checker for highest/lowest caches of countries/regions/county
February 06, 2024 06:26PM
Circling back here - is anything else required here?

When I run:
https://project-gc.com/Challenges//80561

With your user name, the results now look reasonable. I think there was an error in the 'oldest' function in that script that was fixed in an unrelated request. Also, it looks like with the recent database conversion, these queries are faster than they used to be.
Re: Checker for highest/lowest caches of countries/regions/county
February 09, 2024 05:35AM
Hmm. There is something very wrong with it now - as I no longer qualify


At first I thought this is only because of recent elevation calculation changes
(As for example in Åland regions, the caches I've found are no longer the lowest but second lowest with slight changes to elevation).

But this does not seem to explain all as for example in Monaco, I've found the lowest still listed (pearl of Monaco) but checker does not accept that although it shows it..


Additionally, running script for kapeka (cacher with high amount of finds) throws exception
[string ""]:9037: The elevation query returned that there were 0 available caches, with 1 required.
This challenge is not possible



Edited 1 time(s). Last edit at 02/09/2024 05:37AM by sm07.
Re: Checker for highest/lowest caches of countries/regions/county
February 16, 2024 02:27AM
Okay - I've figured out the "Monaco" problem.

TL;DR : The script queries the N lowest/highest elevation caches, and then queries your finds, filtering based on the highest/lowest of the last cache in that list. So, for GC95MBM, its elevation is 1m. It asks for all your finds with maximum elevation of 1m. BUT ... it's a less-than check, not a less-than-equal. So, your find never gets returned if it has the exact same value as the lowest/highest elevated cache on the list. So, I've fixed that now - it just adds/subtracts one when asking for the min/max elevation.
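
A small sketch of that off-by-one, assuming whole-metre elevations. `findsBelow` is a hypothetical stand-in for a finds query with a strict maximum-elevation filter, and GC_HIGH is a made-up second find.

```lua
-- Mimics a filter that applies a strict "less than" maximum-elevation check.
local function findsBelow(finds, maxElevation)
  local out = {}
  for _, find in ipairs(finds) do
    if find.elevation < maxElevation then
      out[#out + 1] = find
    end
  end
  return out
end

local finds = {
  { gccode = "GC95MBM", elevation = 1 },
  { gccode = "GC_HIGH", elevation = 40 },  -- placeholder find
}
local cutoff = 1  -- elevation of the last cache on the lowest-N list

local strict = findsBelow(finds, cutoff)       -- misses the find at exactly 1 m
local widened = findsBelow(finds, cutoff + 1)  -- the fix: widen the bound by one

print(#strict, #widened)
```

For the highest-elevation case the same idea applies in the other direction, subtracting one from the minimum.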

Still looking at the kapeka problem... it looks like the problem region is Civitas Vaticana in Vatican City State... still haven't figured out why the query doesn't return anything.
Re: Checker for highest/lowest caches of countries/regions/county
February 16, 2024 02:45AM
kapeka mystery solved - you're filtering for Traditional Cache, and there aren't any traditional caches in Vatican City...

So, I've changed it to not error out when the challenge is unfulfillable. Erroring out would make sense in a single-criteria type check, but doesn't make sense in this case, where you're checking every region. Obviously there are regions which might return no possible caches, but it would be difficult to explicitly exclude those. (I mean, in this case, you could just remove Vatican City State from the country list).

Anyhow... even though it doesn't error there anymore, it still doesn't work, because it times out - they've found caches in so many regions that the number of queries is too high to handle. So, I'm not sure how that can be resolved.
Re: Checker for highest/lowest caches of countries/regions/county
February 16, 2024 11:42AM
Strange ...? When I allow archived caches in the checker https://project-gc.com/Challenges//85758 it should find GCRDV0 LA GEO-FRANCIGENA # 0 # - CAPUT MUNDI (30m) but for cacher "no muggle" or "ainars58" no result

There are 3 archived Traditional caches in Civitas Vaticana
GCY25A (70 m ↥)
GCTJBA (35 m ↥)
GCRDV0 (30 m ↥)
Re: Checker for highest/lowest caches of countries/regions/county
February 16, 2024 06:21PM
Okay, I have fixed this problem too.

TL;DR: If there weren't any active caches, it would query for things below 0m, which, obviously these are above 0m and thus they wouldn't show up. It seems that no matter what I ask for in GetOldest, it will not return archived caches... but, you can just insert archived finds from the users' find list into the top list and do it that way. So, it now works the same way the top-favorites script works - if you're including archived caches, then any archived finds you have add to the "denominator", assuming they are lower/higher than the Nth lowest/highest active cache. So, in "no muggle" case, there are no active traditionals in VC, so anything they've found there that is archived counts... and it lists 2/1 found. I also made it print out this includes X archived caches, so it's a little clearer.
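
Roughly, for the "lowest" case, the merge described above could be sketched like this. The function and field names are hypothetical, and which two archived caches the user actually found is a made-up example (the codes and elevations are the ones listed earlier in the thread).

```lua
-- activeTop: the N lowest active caches, sorted ascending by elevation.
-- userFinds: the user's finds in the area, each with an `archived` flag.
-- Returns the merged list and the number of archived finds that were added.
local function mergeArchivedFinds(activeTop, userFinds)
  local merged = {}
  for _, cache in ipairs(activeTop) do
    merged[#merged + 1] = cache
  end
  local last = activeTop[#activeTop]
  -- With no active caches there is no cutoff, so every archived find counts.
  local cutoff = last and last.elevation or math.huge
  local added = 0
  for _, find in ipairs(userFinds) do
    if find.archived and find.elevation <= cutoff then
      merged[#merged + 1] = find
      added = added + 1
    end
  end
  table.sort(merged, function(a, b) return a.elevation < b.elevation end)
  return merged, added
end

-- No active traditional caches in the area; two archived finds (example data).
local merged, added = mergeArchivedFinds({}, {
  { gccode = "GCTJBA", elevation = 35, archived = true },
  { gccode = "GCRDV0", elevation = 30, archived = true },
})
print(#merged, added)
```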
Re: Checker for highest/lowest caches of countries/regions/county
February 16, 2024 06:36PM
Thanks, but ainars58 is now running into an error
[string ""]:9128: attempt to compare number with string
Re: Checker for highest/lowest caches of countries/regions/county
February 18, 2024 01:43AM
Okay, I've fixed this error as well. One day this'll be just perfect :)
Re: Checker for highest/lowest caches of countries/regions/county
February 18, 2024 10:22AM
Thank you for this fix. Almost perfect, but for that there is one small step to go.

To catch all the regions in an area like Europe, the checker needs to recognize that different countries can have regions with the same name: Belgium has a region "Limburg", and the Netherlands also has a region named "Limburg".
see https://project-gc.com/Challenges//85816
Belgium/Limburg, the lowest cache is "GC8Y24Q 18 N/A 10#rondom 't meer "
Netherlands/Limburg, the lowest cache is "GCVKVG 9 N/A Heen en weer "

in https://project-gc.com/Challenges//85815
Limburg, the lowest cache is "GCVKVG 9 N/A Heen en weer "
Re: Checker for highest/lowest caches of countries/regions/county
February 18, 2024 06:19PM
I think this is a general issue with the 'same' function.... (or uh... countries that name their states/provinces the exact same thing? :)). It would be the same for county names, for which there are plenty of duplicates. If you don't explicitly give values, then it looks at all your finds, and looks for all the possible values. Of course, when you say "region", and two regions have the exact same name, then it picks the last find you have as the country it will use for that named region.

I'm trying to work this out by using the country_region transform... but, not working out exactly as planned. I'll keep you updated.
Re: Checker for highest/lowest caches of countries/regions/county
February 18, 2024 06:27PM
The script by jpavlik, "Generic Country/Region/County/State checker Improved" (https://project-gc.com/Tools/Challenges?edit&scriptId=3639&addTag), is picking them up. Maybe it can give you an idea.
Re: Checker for highest/lowest caches of countries/regions/county
February 19, 2024 03:48PM
Okay, I think it's fixed now.

The tag must change to use "country_region", which the "same" function reverses into a filter.
https://project-gc.com/Challenges//85868

It now shows Limburg, Belgium and Limburg, Netherlands separately, and the caches are different for both. If we wanted this to work for countries as well, some analogous changes would be required... but I think I'll wait on that until it's requested.
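
The underlying idea, as a minimal sketch: group by a combined country and region key instead of the region name alone. The "/" separator and the function names are assumptions for illustration; the actual format used by the country_region transform is not shown in this thread.

```lua
-- Group cache codes by region name, optionally qualified by country.
local function groupByRegion(caches, useCountry)
  local groups = {}
  for _, cache in ipairs(caches) do
    local key = cache.region
    if useCountry then
      key = cache.country .. "/" .. cache.region  -- separator is an assumption
    end
    groups[key] = groups[key] or {}
    table.insert(groups[key], cache.gccode)
  end
  return groups
end

-- Helper: number of distinct keys in a table.
local function countKeys(t)
  local n = 0
  for _ in pairs(t) do n = n + 1 end
  return n
end

-- The two Limburg caches mentioned above.
local caches = {
  { gccode = "GC8Y24Q", country = "Belgium", region = "Limburg" },
  { gccode = "GCVKVG", country = "Netherlands", region = "Limburg" },
}
print(countKeys(groupByRegion(caches, false)), countKeys(groupByRegion(caches, true)))
```

Keyed by region name alone, the two Limburgs collapse into one group; keyed by country plus region, they stay separate.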
Re: Checker for highest/lowest caches of countries/regions/county
February 19, 2024 04:28PM
Great , I think it's perfect to use
Re: Checker for highest/lowest caches of countries/regions/county
February 20, 2024 08:07PM
I think there is still one bug with regard to archived handling: at least in my case the archived caches are shown as finds, but twice (so each produces two counts, not one). I think this is a minor issue, but still an annoyance if one would like to define x-out-of-y limits.

See e.g.
Aland Islands OK: You found 2 out of a required 1 of the lowest elevation caches 5 cache(s). This includes 1 finds, which are now archived.
# GCCODE elevation Found Date Name
1 GC9D1ZG -298 N/A Extreme points of Finland – Depth & Altitude
2 GC9G1C8 -22 2023-05-21 Kallas Sandö Bonus
3 GC9G1C8 -22 2023-05-21 Kallas Sandö Bonus
4 GC94A44 -13 N/A Julnötter - Christmas Quiz
5 GC9KF1K -11 N/A Hermas Bonus
6 GC90N1M -10 N/A Enklinge Bonus
Re: Checker for highest/lowest caches of countries/regions/county
February 21, 2024 03:51AM
So, I've fixed (in my debug script), the duplicate listing of archived caches. ... However, something is weird, because GC9G1C8 is not archived. But, it appears when I query the finds in this function, it says every single find is archived... so I need to track that down before merging those changes over. But, it appears that's happening in the live-version too. Not sure if that's a problem with the script, or some weirdness with the database.
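
For reference, removing duplicate rows by GC code can be sketched as below. This is not the actual fix, just one common way to do it; the rows are the Aland Islands listing quoted above, reduced to code and elevation.

```lua
-- Keep only the first row for each GC code, preserving order.
local function dedupeByCode(rows)
  local seen, out = {}, {}
  for _, row in ipairs(rows) do
    if not seen[row.gccode] then
      seen[row.gccode] = true
      out[#out + 1] = row
    end
  end
  return out
end

local rows = {
  { gccode = "GC9D1ZG", elevation = -298 },
  { gccode = "GC9G1C8", elevation = -22 },
  { gccode = "GC9G1C8", elevation = -22 },  -- duplicate listing
  { gccode = "GC94A44", elevation = -13 },
  { gccode = "GC9KF1K", elevation = -11 },
  { gccode = "GC90N1M", elevation = -10 },
}
local unique = dedupeByCode(rows)
print(#rows, #unique)
```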
Re: Checker for highest/lowest caches of countries/regions/county
April 21, 2024 06:18PM
Hi, how does it look - did you find a cause for this? I'm asking because I have now found a combination of parameters that works perfectly well, from which I could make three challenges if this worked.

I'm planning to use this
https://project-gc.com/Challenges//87553
as a basis and vary just the number of regions needed, as it performs well even for a cacher with a huge number of finds (kapeka).

Only two issues remain (AFAIK): (1) everything is shown as archived (and the total count per region is sometimes quite big), and (2) the HTML output probably exceeds the allowed length for kapeka, so it should be limited somehow.

If you still have the interest and time to fix these, please let me know.
Re: Checker for highest/lowest caches of countries/regions/county
April 23, 2024 04:39PM
Actually never mind - I have managed to modify the Highest/Lowest in Region/County scripts into one script retaining original performance and will continue to fix the bugs found so far in it to get it working for this/these challenges. I'll publish the script once ready.