Difference between revisions of "Polygon definitions"

From Project-GC
(Updated documentation for new data format)
(Updated the documentation a bit more)

Revision as of 15:54, 10 March 2021

Page description

This is an admin/moderator page to which only specific users have access. The page is used to create the rules applied when importing polygon data from OpenStreetMap into Project-GC. The data is then used to place geocaches in their correct region/county. The data is also used for rendering maps, for example the maps in Profile stats.

Country selector

The country selector lists every country available in the Geocaching LIVE API. Selecting a new entry loads that country automatically; allow 1-2 seconds for its data to be retrieved.

Data shown

The first section shows basic metadata about the country, such as its name and its ID in the API. It then shows who created the latest rule definitions and when. On the right side there is a list of the region names used by Geocaching.com.

The region names in Project-GC must match those of Geocaching.com. Use relevant rule options to make this happen.

Source site

Project-GC uses OSM-Boundaries.com as a source when importing its polygon data. This site is also a great source to use when building the import rules.

Rules

For Project-GC to know where a country (or region/county) is located in the world, it needs the borders. In the simplest cases we can just match a country with an ID in the OpenStreetMap database and it's done. However, it's not always that simple, especially for countries. For example, the polygon for Finland includes Aland Islands, while Aland Islands is its own country according to Geocaching.com. In this case we need more advanced rules that define and calculate new polygons in Project-GC.

The rule-set for countries differs a bit from the one for regions/counties. The reason is that a country should always end up as a single (multi)polygon, while regions and counties need one (multi)polygon per region/county.

All import rules are created as JSON strings. Generally a definition consists of a list of rules for the polygons themselves, and then a list of rules for post-processing names. When the import tool runs, it first runs all rules under polygons, and then the rules under postprocessing. Within each list, the rules are executed in the order they are listed.

You can look at existing definitions to get an overall understanding of how the definitions should look.

Ignore databases

If the OSM data is known to be broken in a specific database version, that version can be blacklisted with this option.

Just write the name of the database in the input field. If multiple database names are needed, separate them with commas. Whitespace can be used freely; it will be ignored.

The input field is currently limited to 128 bytes. There is no technical reason for this; it's more that the string becomes harder to read the longer it gets.
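
As a rough illustration of the format described above, the field could be parsed as in the following sketch. The function name and the database names are hypothetical, and the sketch assumes that all whitespace is dropped (not only whitespace around the commas); the actual import code may behave differently.

```python
MAX_FIELD_BYTES = 128  # limit of the input field, as stated above


def parse_ignored_databases(field: str) -> list:
    """Split the comma-separated field into database names, ignoring whitespace."""
    if len(field.encode("utf-8")) > MAX_FIELD_BYTES:
        raise ValueError("the ignore-databases field is limited to 128 bytes")
    # Assumption: every whitespace character is removed from each name.
    names = ("".join(part.split()) for part in field.split(","))
    # Empty entries (for example from a trailing comma) are skipped.
    return [name for name in names if name]


# Hypothetical database names, for illustration only.
print(parse_ignored_databases("osm20210301, osm20210308 ,osm20210315"))
```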

Country rules

All countries must be defined in Project-GC. The country polygons are primarily used when rendering maps for Profile stats, and those maps would look a bit funny if a country was missing.[1]

The country rules are the simplest. There is no post-processing of names and there are only two types of rules available.

  • includeOsmIds - List of integer IDs from OSM-Boundaries.com. These are IDs of polygons that will be included in the country.
  • subtractOsmIds - List of integer IDs from OSM-Boundaries.com. The polygons listed here will be subtracted from those in includeOsmIds. For example, in the case of Finland, the ID of Finland can be chosen as the country and then Aland Islands can be subtracted.

Prefer one big polygon as the include plus a few subtracts over a long list of includes. For example, the USA requires either including the whole USA and then subtracting 3-5 polygons, or making a list of ~50 polygons for inclusion. There are two good reasons to include the whole country and then make a few subtracts.

  • It's faster during the import phase.
  • It makes a more readable definition.

The country polygons ignore the names in OpenStreetMap; names from the API are used instead. This is also why the post-processing rule-set hasn't been implemented for countries.

Example of how to define Norway:

{ "polygons": [ { "includeOsmIds": [-2978650], "subtractOsmIds": [-2425963, -1337126, -1337397] } ] }
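
The include/subtract logic can be pictured with a small toy model. In the sketch below, polygons are stand-ins made of grid cells rather than real geometries, and the function name, the IDs and the cells are all hypothetical; it only illustrates the idea of joining the includes and then removing the subtracts, not how the import tool is implemented.

```python
def country_area(include_ids, subtract_ids, polygons):
    """Join every included polygon, then remove every subtracted one."""
    included = set().union(*(polygons[osm_id] for osm_id in include_ids))
    subtracted = set().union(*(polygons[osm_id] for osm_id in subtract_ids))
    return included - subtracted


# Toy data: each "polygon" is a set of grid cells, keyed by a made-up OSM ID.
toy_polygons = {
    -1: {(0, 0), (0, 1), (1, 0), (1, 1)},  # a country including its islands
    -2: {(1, 1)},                          # the islands, a separate country
}

# One big include plus one subtract, as recommended above.
print(sorted(country_area([-1], [-2], toy_polygons)))
```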

Region rules

The best way to add rules is by using the Add rule button on the right side of the input form. This will help you keep the syntax correct. All values that are defined as either 0 (zero) or empty strings ("") must be filled in. Keys that have empty lists as values are optional.

There are three types of rules that can be used to import polygons, and they should be listed under the key polygons. The rules are used by setting type to the ID of the rule, and then defining at least the required keys.

  • adminLevel - This is the easiest and most common way to import polygons. In a perfect world this would be the only rule needed. Basically it includes everything with a specified adminLevel under a parentOsmId. Other optional keys are excludeOsmIds and subtractOsmIds. Polygons listed in excludeOsmIds are simply skipped even though their adminLevel matched the rule. subtractOsmIds can be used to actually calculate the difference between the included polygons and the subtracted ones.
  • osmIds - This rule allows adding a list of OSM-IDs as regions. excludeOsmIds wouldn't make sense here; the osmId should then just not be included in the first place. subtractOsmIds is available as an optional key though.
  • union - If there aren't any polygons available that work, but there are other polygons which could be joined together to create what's needed, this rule type can be used. Required keys are name and osmIds, the latter being a list of OSM-IDs. The key subtractOsmIds is also available for optional use.
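
To make the adminLevel rule more concrete, here is a minimal sketch of the selection step. The boundary records and their field names are hypothetical, and the sketch assumes that "under a parentOsmId" can be read from a single parent field on each boundary; the real import works against the OSM-Boundaries data and may resolve the hierarchy differently. subtractOsmIds is left out since it requires real geometry.

```python
def select_admin_level(rule, boundaries):
    """Return the OSM IDs matched by an adminLevel rule, minus excludeOsmIds."""
    excluded = set(rule.get("excludeOsmIds", []))
    return [
        boundary["osmId"]
        for boundary in boundaries
        if boundary["parentOsmId"] == rule["parentOsmId"]
        and boundary["adminLevel"] == rule["adminLevel"]
        and boundary["osmId"] not in excluded
    ]


# Made-up boundary records, for illustration only.
boundaries = [
    {"osmId": -10, "parentOsmId": -1, "adminLevel": 4},
    {"osmId": -11, "parentOsmId": -1, "adminLevel": 4},
    {"osmId": -12, "parentOsmId": -1, "adminLevel": 6},
]
rule = {"type": "adminLevel", "parentOsmId": -1, "adminLevel": 4,
        "excludeOsmIds": [-11], "subtractOsmIds": []}
print(select_admin_level(rule, boundaries))
```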

Then there are three types of rules for post-processing names as well. Since they are executed in the order specified, the order might be important.

  • renameOsmId - Required keys: osmId (OSM-ID) and name (string).
  • removePrefixes - Required key: prefixes (list of strings).
  • removeSuffixes - Required key: suffixes (list of strings).
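
The three post-processing rules could be applied along the lines of the sketch below. The function names and the sample region names are hypothetical, and the sketch assumes that only the first matching prefix or suffix is removed and that leftover whitespace is trimmed; the import scripts may handle these details differently.

```python
def strip_first_match(name, affixes, at_start):
    """Remove the first listed affix that matches, then trim leftover whitespace."""
    for affix in affixes:
        if affix and at_start and name.startswith(affix):
            return name[len(affix):].strip()
        if affix and not at_start and name.endswith(affix):
            return name[:-len(affix)].strip()
    return name


def postprocess_names(names, rules):
    """Apply the rules, in the order listed, to a mapping of OSM-ID to name."""
    result = dict(names)
    for rule in rules:
        if rule["type"] == "renameOsmId":
            if rule["osmId"] in result:
                result[rule["osmId"]] = rule["name"]
        elif rule["type"] == "removePrefixes":
            result = {k: strip_first_match(v, rule["prefixes"], True)
                      for k, v in result.items()}
        elif rule["type"] == "removeSuffixes":
            result = {k: strip_first_match(v, rule["suffixes"], False)
                      for k, v in result.items()}
    return result


# Made-up input names; the rule keys are the ones listed above.
names = {-54409: "Skåne län", -1: "Region Halland"}
rules = [
    {"type": "removeSuffixes", "suffixes": [" län"]},
    {"type": "removePrefixes", "prefixes": ["Region "]},
    {"type": "renameOsmId", "osmId": -54409, "name": "Scania"},
]
print(postprocess_names(names, rules))
```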

Each rule can be repeated as many times as needed. Every rule can also include the optional key comment. It can be used to document why that specific rule exists, or why a specific osmId is subtracted, for example.

Finally, there is an option that can be set on the same level as polygons and postprocessing:

  • internationalNames - The import scripts default to using local region and county names. But we also want to avoid using names that are impossible to write for users who are most familiar with the Latin alphabet. So, for example, when the names are written in Asian or Arabic scripts we can use this parameter to override the default. But again, in cases where Geocaching.com has regions, it's most important that we match those. It's a boolean option, which defaults to false. Example:
{ "internationalNames": true }

When regions exist at Geocaching.com, the region names in Project-GC must match them whenever possible. If they don't, this must be clearly stated in the Notes field, together with the reason.


Full example:

 {
   "internationalNames": true,
   "polygons": [
     { "type": "osmIds", "osmIds": [123], "subtractOsmIds": [] },
     { "type": "osmIds", "osmIds": [456], "subtractOsmIds": [] },
     { "type": "adminLevel", "parentOsmId": 123, "adminLevel": 4, "excludeOsmIds": [], "subtractOsmIds": [] },
     { "type": "adminLevel", "parentOsmId": 123, "adminLevel": 6, "excludeOsmIds": [], "subtractOsmIds": [] },
     { "type": "union", "name": "Malmö", "osmIds": [123, 456], "subtractOsmIds": [] },
     { "type": "union", "name": "Lund", "osmIds": [321, 654], "subtractOsmIds": [] }
   ],
   "postprocessing": [
     { "type": "renameOsmId", "osmId": -54409, "name": "Scania" },
     { "type": "removePrefixes", "prefixes": ["foo"] },
     { "type": "removeSuffixes", "suffixes": ["bar"] }
   ]
 }


County rules

The county rules work exactly like the region rules.

Notes

This is a free-text form where you have the option to leave some notes. If there is nothing to say, just leave it empty. But when subtracting a polygon, for example, it can be worth mentioning what's being subtracted.

Saving errors

There isn't much feedback in the web UI yet. If the input form turns red, the JSON is invalid and therefore couldn't be saved. Upon save, the JSON is also verified in several ways. For example, the check looks for keys that shouldn't exist and confirms that the values are of the correct type (list, integer, string, ...).
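
The kind of verification described above could look roughly like the following sketch for region/county definitions. It is not the actual server-side check: the function name and error messages are hypothetical, the key lists are taken from the rule descriptions on this page, and required keys and list contents are not checked.

```python
import json

TOP_LEVEL_KEYS = {"internationalNames": bool, "polygons": list, "postprocessing": list}
RULE_KEYS = {
    "adminLevel": {"parentOsmId": int, "adminLevel": int,
                   "excludeOsmIds": list, "subtractOsmIds": list},
    "osmIds": {"osmIds": list, "subtractOsmIds": list},
    "union": {"name": str, "osmIds": list, "subtractOsmIds": list},
    "renameOsmId": {"osmId": int, "name": str},
    "removePrefixes": {"prefixes": list},
    "removeSuffixes": {"suffixes": list},
}


def validate_definition(text):
    """Return a list of error messages; an empty list means nothing was found."""
    try:
        definition = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(definition, dict):
        return ["top level must be a JSON object"]
    errors = []
    for key, value in definition.items():
        if key not in TOP_LEVEL_KEYS:
            errors.append(f"unexpected key: {key}")
        elif not isinstance(value, TOP_LEVEL_KEYS[key]):
            errors.append(f"wrong type for key: {key}")
    for section in ("polygons", "postprocessing"):
        rules = definition.get(section, [])
        if not isinstance(rules, list):
            continue  # already reported above
        for rule in rules:
            rule_type = rule.get("type") if isinstance(rule, dict) else None
            allowed = RULE_KEYS.get(rule_type) if isinstance(rule_type, str) else None
            if allowed is None:
                errors.append(f"unknown rule in {section}: {rule!r}")
                continue
            for key, value in rule.items():
                if key in ("type", "comment"):
                    continue  # comment is optional on every rule
                if key not in allowed:
                    errors.append(f"unexpected key in {rule_type} rule: {key}")
                elif not isinstance(value, allowed[key]):
                    errors.append(f"wrong type for {key} in {rule_type} rule")
    return errors


print(validate_definition('{"polygons": [{"type": "osmIds", "osmIds": [123], "foo": 1}]}'))
```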

Preview data

There is not yet any way to preview the data. Depending on how complex the definitions are, it can take hours to calculate new polygons, which makes it troublesome to show the moderator some form of preview. Ultimately we should also calculate the number of geocaches included in the polygons, with both the old and the new definitions. This would be even more expensive.

Historical data

Historical definitions exist in the database but cannot yet be viewed on the web. If they are needed because complex definitions were destroyed by mistake, we can retrieve them fairly easily.

Priority countries

This is a temporary section which should be updated as the work progresses, and finally removed.

On this link you can find some hints of how they have been imported before. It's a similar, but not identical, ruleset.

First priority is every country with region support at Geocaching.com. Second priority is every country which Project-GC already has support for. We cannot start using the new code until we are on par with today's live functionality.

  • Special countries - These countries currently use another source than OSM, so we need to pay extra attention to the differences here.
    • United Kingdom
    • Australia - Unincorporated areas cause problems here.
    • United States
    • New Zealand - Geocaching.com uses regions that don't exist. We are recreating them by making a union of other polygons.
    • Canada - The counties used before are from a census database. OSM doesn't have anything usable.

Notes

  1. Only true for the upcoming maps of Profile stats.