[Resolved] Has something changed w.r.t handling of oldest caches in PGC?
January 22, 2024 03:45PM
Since the start of 2024, I have received error reports from several cachers (Paulaho, wilmaaah, Nortti10) regarding my challenge https://www.geocaching.com/geocache/GCA0BDC (oldest caches per region). In each case the checker fails with a timeout.

The same script that used to work for me now also times out when I run it myself.

Challenge checker tag : https://project-gc.com/Challenges/GCA0BDC/73620
Challenge checker script: The oldest cache from N groups (by arisoft)

As a first-aid solution, I split the original script into two parts, European countries and the rest of the world (links below; curiously, both work just fine), as I thought this might be a temporary glitch in PGC or the script.
1) Haetut alueiden vanhimmat kätköt (vain Euroopan maat) [Oldest caches of the found regions (European countries only)] - https://project-gc.com/Challenges//84069
2) Haetut alueiden vanhimmat kätköt (vain Euroopan ulkopuoliset maat) [Oldest caches of the found regions (countries outside Europe only)] - https://project-gc.com/Challenges//84070

However, as the problem with the original script remains, I decided to raise this here. Would anyone have a clue what is causing this?

PS. Apologies, I just realized that "misc" might have been a more appropriate forum for this. Please move it there if you agree, as apparently the creator cannot do that after the initial posting.



Edited 3 time(s). Last edit at 01/22/2024 03:48PM by sm07.
Re: Has something changed w.r.t handling of oldest caches in PGC?
January 22, 2024 04:03PM
Arisoft made an update on 2023-12-15 19:44:59 UTC, which could have caused this problem.

Please contact him directly, because he is not active in the forum.
Re: Has something changed w.r.t handling of oldest caches in PGC?
January 22, 2024 04:04PM
OK, I will. Thanks.
Re: Has something changed w.r.t handling of oldest caches in PGC?
January 22, 2024 06:38PM
I added debug statistics for database fetch times, and it seems that getting the required data from the database may simply take too long. For example, for me it took 42 seconds to get the oldest caches for 38 required regions. If more regions have been found, it will take more time.

If the first try times out, the second try may succeed! The time needed to get the data can vary by about 50%.
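
A fetch-time statistic of the kind described above can be sketched roughly like this. It is only a Python illustration, not the checker's actual code: `timed_fetch` and `fake_fetch` are hypothetical names, and `fake_fetch` merely stands in for a real database call.

```python
import time

def timed_fetch(fetch, *args, stats=None, **kwargs):
    """Call fetch(*args, **kwargs) and append the elapsed seconds to stats."""
    start = time.perf_counter()
    result = fetch(*args, **kwargs)
    elapsed = time.perf_counter() - start
    if stats is not None:
        stats.append(elapsed)
    return result

def fake_fetch(regions):
    # Hypothetical stand-in for a database call; returns one row per region.
    return [{"region": r, "oldest": None} for r in regions]

stats = []
rows = timed_fetch(fake_fetch, ["Uusimaa", "Lappi"], stats=stats)
print(f"fetched {len(rows)} rows in {stats[0]:.3f} s")
```

Collecting one entry per fetch makes it easy to see both the total time and how much individual fetches vary between runs.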
Re: Has something changed w.r.t handling of oldest caches in PGC?
January 23, 2024 10:40AM
@arisoft I was made aware of this one today. Is it because of today's PGC updates that the script returns 0/48 for everyone I test (including the cache owner)? Or is it due to some changes you have made?

I will try to understand it, but I have tested other scripts that use PGC.GetOldestCaches, and they seem to work. Also, today's update is all about regions (as in the area size between country and county).
Re: Has something changed w.r.t handling of oldest caches in PGC?
January 23, 2024 12:02PM
Found the issue. GetFinds() and GetHides() returned the region in the field region_pgc instead of region as before. I have updated the API callbacks, since the change wasn't intentional.

Today's update of Project-GC was actually triggered by challenge checker scripts that use the GetOldestCaches/GetHighestCaches/GetLowestCaches callbacks. I had some discussions with @Hügh last week about these, and I figured out a way to make them much faster when a region but no county is used as a filter. The problem was that it needed major database changes and, with that, ~1800 lines of code to update as well.

The reason this script became slow is probably that I switched the database backend for this callback a while back. I did this to solve issues with other checkers that use GetOldestCaches(). But depending on which filters are used, one backend will work better than the other. I will try a patch that redirects to one DB cluster if country is a list (of more than one item), and otherwise to the other DB servers. That should solve both of these cases, I hope.
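
The routing idea in the last paragraph could look something like the sketch below. This is an assumption-laden Python illustration, not Project-GC's server code: `pick_cluster` and the labels "primary" and "alternative" are hypothetical, and the post does not say which cluster a multi-country list would be sent to, so the mapping here is a guess.

```python
def pick_cluster(country_filter):
    """Return a (hypothetical) cluster label for a given country filter."""
    # Assumption: a list of more than one country means a lot of data to scan,
    # so it goes to the alternative cluster; everything else stays on primary.
    if isinstance(country_filter, (list, tuple)) and len(country_filter) > 1:
        return "alternative"
    return "primary"

print(pick_cluster(["Finland", "Sweden"]))
print(pick_cluster(["Finland"]))
print(pick_cluster("Finland"))
```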
Re: Has something changed w.r.t handling of oldest caches in PGC?
January 23, 2024 12:25PM
FYI, this script was originally optimized for finding the oldest cache for hundreds of regions at once. To avoid making that many separate database calls, it tries to get as many oldest caches as it can with a single call, which is still far from optimal but faster than the straightforward way.

Because only a few of the oldest caches from any country/region/county are actually used from the result, the database could be preprocessed: for example, into a subset holding the 10 oldest caches from every county.
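
As a rough picture of that preprocessing idea, the Python sketch below reduces a cache list to the n oldest per county. The function name, the field names (`gccode`, `county`, `hidden`) and the sample rows are all made up for illustration; they are not Project-GC's schema or data.

```python
from collections import defaultdict
from datetime import date
import heapq

def oldest_per_county(caches, n=10):
    """Group caches by county and keep the n oldest (by hidden date) in each."""
    by_county = defaultdict(list)
    for cache in caches:
        by_county[cache["county"]].append(cache)
    return {
        county: heapq.nsmallest(n, items, key=lambda c: c["hidden"])
        for county, items in by_county.items()
    }

# Invented sample rows, only to exercise the function.
caches = [
    {"gccode": "GC0001", "county": "A", "hidden": date(2001, 5, 1)},
    {"gccode": "GC0002", "county": "A", "hidden": date(2000, 9, 3)},
    {"gccode": "GC0003", "county": "B", "hidden": date(2003, 1, 7)},
]
table = oldest_per_county(caches, n=1)
print(table["A"][0]["gccode"])
```

A checker would then only need to look things up in the small precomputed table instead of scanning all caches on every run.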
Re: Has something changed w.r.t handling of oldest caches in PGC?
January 23, 2024 12:32PM
It's probably a smart way to do it, because querying one country at a time probably takes ~1 second per country. The number of countries can be reduced from 250 by looking at which unique countries the profile name has logged, but it might still be more than a hundred.
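
Reducing the country list to the ones a profile has actually logged could be sketched like this in Python. The `country` field and the sample finds are hypothetical and only illustrate the deduplication step.

```python
def countries_to_query(finds):
    """Return the distinct countries in a user's finds, in first-seen order."""
    return list(dict.fromkeys(f["country"] for f in finds))

# Invented sample finds.
finds = [{"country": "Finland"}, {"country": "Sweden"}, {"country": "Finland"}]
print(countries_to_query(finds))
```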

Anyway, read my latest post. Using the other DB cluster makes it run fast enough, with plenty of margin: 6-7 seconds for one of the names mentioned above. I tested all 3 + CO.
Re: Has something changed w.r.t handling of oldest caches in PGC?
January 23, 2024 12:45PM
It is terribly fast now. The debug statistics I added originally showed 42 seconds per fetch, and now they show 1 second.
Re: Has something changed w.r.t handling of oldest caches in PGC?
January 23, 2024 12:51PM
Wow. Huge improvement. Thanks to @arisoft for the debug info and for helping, and big thanks to @magma1447 for the database optimisation!

One more question though: will this performance improvement also improve the performance of checkers that depend on the gethighest/lowest methods, or was it only related to the original issue (i.e. getoldest & regions)?
Re: Has something changed w.r.t handling of oldest caches in PGC?
January 23, 2024 12:54PM
It will affect those two as well. It's the same code, just different fields it filters/sorts on. How big the effect is on those, I do not know though.
Re: Has something changed w.r.t handling of oldest caches in PGC?
January 23, 2024 01:01PM
OK, I will need to test that later as well, as I have had a few challenges in mind where the timeout has prevented me from proceeding further with them. Thanks!
Re: Has something changed w.r.t handling of oldest caches in PGC?
January 23, 2024 12:21PM
Actually, I changed that. I added a new optional parameter to GetOldest/Lowest/Highest.
alternativeDBCluster (bool) defaults to false.

I edited this script to use it and set it to true. That should solve this issue.

I can't explain exactly when it's faster to use it or not, since I don't always know myself. But the more data that needs to be processed and the less the DB indexes help, the better the alternative DB is. On the other hand, when indexes work perfectly, the alternative DB will be worse.

But in this case where pretty much all geocaches are considered, it's a lot of data to process, and the alternative DB will perform a lot better.

The downside is that the data in that cluster is usually 0-4 hours older than the primary data.