Advanced

Change History

New API methods will be announced here. Changes in our data that are relevant to checker scripts/tags will also be announced here.

Message: Max length script output

Changed By: magma1447
Change Date: September 20, 2017 02:43PM

Max length script output
For some time now, we have been cutting off the HTML output from checker scripts after 50 kB. This might have caused some scripts to break. There are two main reasons for this change.

1) It makes data harvesting harder. This is actually not our biggest concern, since harvesting is still possible. We also know who has access to writing scripts, and any abuse of the data would be reported to Geocaching HQ.

2) Some scripts output huge, meaningless lists of data. It normally doesn't make sense to output a list of 5000 caches, for example. The problem isn't the huge list in itself, but that all HTML output from the scripts is "purified" to prevent XSS and other exploits, and that purification takes quite a lot of time and CPU power.

I found the issue while adding more proper support for script timeouts. I ran a script on myself; it was a simple challenge and I expected the Lua script to take about 0.5 seconds, yet it timed out. After some analysis I realized that this was due to it returning a huge list of HTML. It took ~100 times more CPU power to handle the HTML output than to validate the user.
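To make the limit concrete, here is a minimal sketch of how a script author could check the size of the generated HTML and summarize a long list instead of printing it in full. It is written in Python purely for illustration (checker scripts are Lua). The exact definition of "50 kB" is an assumption (50 * 1024 bytes of UTF-8 here), and the helper names are hypothetical, not part of any Project-GC API.

```python
import html

# Assumption: the cut-off is 50 * 1024 bytes, measured on the UTF-8 encoded output.
MAX_OUTPUT_BYTES = 50 * 1024


def output_size(markup: str) -> int:
    """Return the size of the markup in bytes when encoded as UTF-8."""
    return len(markup.encode("utf-8"))


def fits_limit(markup: str, limit: int = MAX_OUTPUT_BYTES) -> bool:
    """True if the markup would not be cut off under the assumed limit."""
    return output_size(markup) <= limit


def render_list(rows: list[str]) -> str:
    """Render every row as a list item (the 'huge list' case)."""
    items = "".join(f"<li>{html.escape(row)}</li>" for row in rows)
    return f"<ul>{items}</ul>"


def render_summary(rows: list[str], max_rows: int = 50) -> str:
    """Render at most max_rows rows and mention how many were left out."""
    shown = rows[:max_rows]
    hidden = len(rows) - len(shown)
    note = f"<p>... and {hidden} more</p>" if hidden > 0 else ""
    return render_list(shown) + note


# Example: 5000 cache codes, as in the case described above.
codes = [f"GC{n:05d}" for n in range(5000)]
full = render_list(codes)
short = render_summary(codes)
```

With these helpers, `fits_limit(full)` and `fits_limit(short)` show whether either variant would be truncated under the assumed limit.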

Some scripts do, however, output proper feedback with tables and images, and some of those produce quite big blobs as well. They are large mainly because they have to output things like "/images/gc-icons/traditional_16.gif" a hundred times, each with an img tag around it.

I am not sure about the long-term solution for this. The current workaround is to output "better" HTML. I have created a proof of concept that should shorten repetitive HTML a lot.

[code]
<html>
<body>
<style type="text/css">
#cc_HtmlFeedback table {
border: none;
border-collapse: collapse;
}
#cc_HtmlFeedback td {
border: 1px solid black;
}
#cc_HtmlFeedback .ct-t {
background-image: url('https://project-gc.com/images/gc-icons/traditional_16.gif');
width: 16px;
height: 16px;
margin: 0;
}
#cc_HtmlFeedback .ct-m {
background-image: url('https://project-gc.com/images/gc-icons/multi_16.gif');
width: 16px;
height: 16px;
margin: 0;
}
#cc_HtmlFeedback .ct2-tm {
background-image: url('https://project-gc.com/images/gc-icons/traditional_16.gif'), url('https://project-gc.com/images/gc-icons/multi_16.gif');
width: 16px;
height: 32px;
background-repeat: no-repeat, no-repeat;
background-position: 0 0, 0 16px;
}
#cc_HtmlFeedback .ct2-tmu {
background-image: url('https://project-gc.com/images/gc-icons/traditional_16.gif'), url('https://project-gc.com/images/gc-icons/multi_16.gif'), url('https://project-gc.com/images/gc-icons/unknown_16.gif');
width: 16px;
height: 48px;
background-repeat: no-repeat, no-repeat, no-repeat;
background-position: 0 0, 0 16px, 0 32px;
}
</style>

<div id="cc_HtmlFeedback"> <!-- A div with this id already exists, your html-output will be added to that div. -->
<table>
<tr>
<td>
<p class="ct-t"></p>
<p class="ct-m"></p>
</td>
<td>
<p class="ct-m"></p>
</td>
</tr>
<tr>
<td class="ct2-tm">
</td>
<td class="ct2-tmu">
</td>
</tr>
</table>
</div>
</body>
</html>
[/code]

Note that the html and body tags aren't needed in the checker script's output, since they already exist on the web page. The div with id cc_HtmlFeedback also already exists.

The output will look like [url=http://1447.se/tmp/css-img.html]this[/url].
Note that there are two different examples: the first row is easier, but saves fewer bytes. Also, this is only a good solution when there are a lot of images. I haven't done the math, but it might be worth it with 100 of them.
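The math can be sketched roughly as follows. This Python snippet (illustration only, not checker Lua) compares the byte size of repeating an img tag with the size of one CSS rule plus short p tags, as in the first row of the proof of concept. It assumes a single icon type and a minified CSS rule; every extra class adds its own fixed overhead, so real break-even points will differ. The function names are hypothetical.

```python
ICON_URL = "https://project-gc.com/images/gc-icons/traditional_16.gif"


def img_variant(count: int) -> str:
    """The repetitive form: one full img tag per icon."""
    return f'<img src="{ICON_URL}">' * count


def css_variant(count: int) -> str:
    """The compact form: one CSS rule, then a short p tag per icon."""
    rule = (
        "<style>#cc_HtmlFeedback .ct-t{"
        f"background-image:url('{ICON_URL}');"
        "width:16px;height:16px;margin:0}</style>"
    )
    return rule + '<p class="ct-t"></p>' * count


def size(markup: str) -> int:
    """Size in bytes when encoded as UTF-8."""
    return len(markup.encode("utf-8"))


def break_even(max_count: int = 1000):
    """Smallest icon count where the CSS form is smaller, or None if not found."""
    for n in range(1, max_count + 1):
        if size(css_variant(n)) < size(img_variant(n)):
            return n
    return None


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, size(img_variant(n)), size(css_variant(n)))
    print("break-even at", break_even())
```

The CSS rule is a fixed cost, while each icon after that costs only a short p tag instead of a full URL, so the saving grows with the number of icons.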


As mentioned, I am not sure this will be the long-term, final solution. An alternative could be a predefined CSS for the checker scripts, for example.

Another variant is callbacks that produce trusted HTML, for example RenderTableFromAssociativeArray(data). I bet this will require quite a few different callbacks though, which might make them tiresome to build, and also tiresome every time someone wants something that doesn't exist.

These can of course be combined. We could still allow HTML output. That output could contain for example [i]<div id="CC_1"></div>[/i], and the script could return { trustedHtml: { 1: [ RenderTableFromAssociativeArray, data ] }}
We could then purify the script's own part and insert the trusted HTML blobs.
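A rough sketch of how such a combination might work, in Python for illustration only. Nothing here exists on the site: the purify function is a toy stand-in for the real purifier (it is NOT a real sanitizer), the placeholder is assumed to be an empty div with id CC_n that survives purification, and the callback registry and function names are hypothetical.

```python
import html
import re


def render_table_from_associative_array(data: dict) -> str:
    """Hypothetical trusted callback: a two-column table with all values escaped."""
    rows = "".join(
        f"<tr><td>{html.escape(str(key))}</td><td>{html.escape(str(value))}</td></tr>"
        for key, value in data.items()
    )
    return f"<table>{rows}</table>"


# Only callbacks registered here may produce trusted HTML.
TRUSTED_CALLBACKS = {
    "RenderTableFromAssociativeArray": render_table_from_associative_array,
}

PLACEHOLDER = re.compile(r'<div id="CC_(\d+)"></div>')


def purify(untrusted: str) -> str:
    """Toy stand-in for the real purifier: only strips script elements."""
    return re.sub(
        r"<script\b.*?</script>", "", untrusted, flags=re.IGNORECASE | re.DOTALL
    )


def assemble(script_html: str, trusted_html: dict) -> str:
    """Purify the script's own HTML, then fill placeholders with trusted blobs."""
    purified = purify(script_html)

    def fill(match: re.Match) -> str:
        key = int(match.group(1))
        entry = trusted_html.get(key)
        if entry is None:
            return match.group(0)  # no trusted blob for this placeholder
        name, data = entry
        callback = TRUSTED_CALLBACKS.get(name)
        if callback is None:
            return match.group(0)  # unknown callback: leave the placeholder empty
        return f'<div id="CC_{key}">{callback(data)}</div>'

    return PLACEHOLDER.sub(fill, purified)


page = assemble(
    '<p>Found types:</p><script>alert(1)</script><div id="CC_1"></div>',
    {1: ("RenderTableFromAssociativeArray", {"Traditional": 12, "<b>Multi</b>": 3})},
)
```

The point of the design is that only the small hand-written part goes through the expensive purification, while the large repetitive part is generated by code the site controls and escapes its data itself.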

To be honest I am not sure what the best solution is.


The ultimate goal in my opinion is that all scripts/tags should output usable log examples and script output that is worth looking at.

Original Message

Author: magma1447
Date: September 20, 2017 02:41PM
