[HN Gopher] S2: Computational geometry and spatial indexing on t... ___________________________________________________________________ S2: Computational geometry and spatial indexing on the sphere Author : ofou Score : 90 points Date : 2022-03-13 14:27 UTC (8 hours ago) (HTM) web link (github.com) (TXT) w3m dump (github.com) | jwuphysics wrote: | Astronomers have developed an indexing strategy using | Hierarchical, Equal Area, and iso-Latitude Pixelisation | (HEALPix): https://arxiv.org/abs/astro-ph/9905275 | durkie wrote: | i was just starting a project and considering the use of s2. i'm | going with geohash instead because I would get different values | for the s2 cell based on implementation. | | ruby: | | S2Cells::S2LatLon.new(34.1234, -84.1234).to_s2_id(30) | | => -8577780315821624425 | | S2Cells::S2LatLon.new(34.1234, -84.1234).to_s2_id(1) | | => -8358680908399640576 | | https://gojekfarm.github.io/s2-calc/ agrees with these values | | python's s2sphere (from sidewalk labs, seems like it'd be more of | a reference implementation) gives very different numbers: | | s2sphere.Cell.from_lat_lng(s2sphere.LatLng.from_degrees(34.1234, | -84.1234)) | | => Cell: face 4, level 30, orientation 0, id 9868963757887927191 | | This discrepancy feels like it'd be some kind of signed/unsigned | integer issue, but I didn't think that would be a thing in Ruby. | It also doesn't seem like the level=1 cell value in Ruby would be | such a large, specific number. I'm missing something. | kevinventullo wrote: | For what it's worth, at least in the case you described, it is | indeed a case of signed vs unsigned representations. That is, | the numbers | | -8577780315821624425 | | and | | 9868963757887927191 | | have the exact same underlying 64-bit representation, which is | all that matters for S2 (you can verify this by checking their | difference is exactly 2^64). The former is just interpreting | the 64-bit string as a signed int, while the latter is | unsigned. | | Separately, the level 1 number you mention | | -8358680908399640576 | | is fairly simple if you look at its binary representation; | almost all of its trailing digits are 0. | | Basically, you really shouldn't think of S2 values as integers, | but more as their underlying length 64 bit string. | durkie wrote: | thank you! | random4726427 wrote: | S2 cell IDs are usually uint64, so the Ruby version looks | wrong. If you cast that int64 to uint64 you get the same value | as the python version. | | Level 1 IDs look like large random numbers in decimal, but when | viewed in binary they have many trailing zeros, that are used | to truncate to make the tokens. | durkie wrote: | thanks! | panarky wrote: | "The reference implementation of the S2 library is written in | C++. Versions in other languages are ported from the C++ code, | and may not have the same robustness, performance, or features | as the C++ version." | | From https://s2geometry.io/about/overview | durkie wrote: | yes, that's the true "reference" implementation, i just | figured a repo from sidewalk labs (a google company) would be | maybe more authoritative. | Kelteseth wrote: | I'm wondering where this is used at Google and how this compares | to GDAL? | monktastic1 wrote: | It's used everywhere in Maps (of course), including where you'd | expect there to be lat/lngs. | mistrial9 wrote: | GDAL binary [0] uses a traditional C/C++ linked library | architecture, and is extensible with plugin drivers; the memory | model is therefore "local". A cloud architecture uses | distributed resources. | | Secondly, GDAL is tightly bound to Proj [1] for reprojecting | spatial coordinates data from one spatial reference system | (SRS) to another. The "sphere" libraries imply no generalized | spatial reference, instead only a single spherical one. | | [0] https://r-spatial.org/images/gis3.png | | [1] https://proj.org/ | jofer wrote: | More generally, gdal is a raster I/O library. S2 is a point | only system. It's not meant to store or work with raster data | in any way. (Sure, you can represent raster as points, but | it's hideously inefficient to do so.) | | Basically, you shouldn't ever be choosing between the two. If | you're thinking of using a representation system like S2 for | raster data (e.g. images or anything else on a regular grid), | rethink things a bit. | PLenz wrote: | As a geographer turned DS it is my duty to remind CS people that | all geo methods are compromises. No system, s2, h3, mgrs, even | lat lon are perfect for every project and you must think | carefully about what you are doing and pick your geo | representation based on your givens and druthers. | [deleted] | monkeybutton wrote: | If you had to represent arbitrary plots of land as fixed sized | vectors for a ML model, what would you use? | s2mcallis wrote: | S2 wouldn't be a bad choice, it lets you compute "coverings" | of arbitrary regions as S2 cells, with variable resolution. | Fixed size is trickier but is probably doable, especially if | you're allowed to null out unused cells. Check out | https://s2.sidewalklabs.com/regioncoverer/ | PLenz wrote: | Depends on where the land is and size and shape of the | polygons are | monkeybutton wrote: | Let's say the land is confined to north America, the size | can vary from an entire state to a single zip code (I know | zips are logical addressing, not geological, but I have | what I have), and the shape is unrestricted so it could be | non-convex. I suppose one could convert various kinds of | areas (states, cities, boroughs, ...) to lists of zipcodes | contained within and OHE them but I feel like that would be | the _worst_ solution. | Jabbles wrote: | > Disclaimer ... This is not an official Google product. | | What _is_ an official Google product? What does it mean to be | one? What should I understand from the presence or absence of | this disclaimer? Probably some sort of SLO, but what precisely? | | Tensorflow doesn't have this disclaimer: | https://github.com/tensorflow/tensorflow | daniel-thompson wrote: | > What does it mean to be one? | | That it will eventually get killed. | | > What should I understand from the presence or absence of this | disclaimer? | | That it will probably last longer. Open-sourcing helps here too | obviously. | rajman187 wrote: | I've made use of Google's S2 library across a couple different | jobs for various use cases. The C++ bindings are well-documented | and very fast, in fact I found writing my own Python bindings | allowed a few faster operations than using the Python library | directly, for example generating a spanning set of cell IDs. | | I find it liberating to not rely on a backend implementation of | geospatial indexing, or rather being agnostic to backend, simply | indexing very efficiently the 64-bit integers. | | The hierarchical nature of the Hilbert curves means that the | child cells contain the ID of the parent, and also fit perfectly | into the parent. This is in contrast with H3 where you | necessarily cannot fit the 7 child hexagons into the parent | without some overlap (see for example the logo on the H3 website | https://eng.uber.com/h3/) | rajman187 wrote: | As a fun experiment, I wanted to use the library in Rust. While | there isn't an official Rust port, I just used the library | along with CXX (https://github.com/dtolnay/cxx) with a simple | geospatial struct. | davidmurdoch wrote: | Pokemon Go used (uses?) S2 for its coordinate system. I used it | in a (now dead) location based mobile game and being able to | locate proximal s2 cells at varying granularities was really | cool. | maxpert wrote: | IDK if people have seen or tried H3. It has worked really well | for me in past https://eng.uber.com/h3/ | secondcoming wrote: | I know of at least one company that uses it. They seem happy | with it. | gregcoombe wrote: | H3 has the significant downside that children nodes do not fit | within parent nodes. So one needs to be very careful working | within any sort of multi-resolution algorithm (the most | effective uses I've seen are with a fixed resolution). S2 does | not suffer from this. | timjulien wrote: | We use S2 extensively in our infrastructure, and open-sourced our | Node.js TypeScript bindings for Google's reference C++ lib: | https://github.com/radarlabs/s2. (More context about how and why | here: https://radar.com/blog/open-source-node-js- | typescript-s2-lib...) | ducktective wrote: | This assumes the earth model to be sphere. GeographicLib [1], | however, works according to WGS84 earth model which is an | ellipsoid. | | Anyone knows why they used a sphere model and why they didn't | just use GeographicLib (MIT license)? | | [1]: https://geographiclib.sourceforge.io/ | tantalor wrote: | > Why not project onto an ellipsoid? (The Earth isn't quite | ellipsoidal either, but it is even closer to being an ellipsoid | than a sphere.) The answer relates to the other goals stated | above, namely performance and robustness. Ellipsoidal | operations are still orders of magnitude slower than the | corresponding operations on a sphere. Furthermore, robust | geometric algorithms require the implementation of exact | geometric predicates that are not subject to numerical errors. | While this is fairly straightforward for planar geometry, and | somewhat harder for spherical geometry, it is not known how to | implement all of the necessary predicates for ellipsoidal | geometry. | | http://s2geometry.io/about/overview | rurban wrote: | (2019) please. | | Last big discussion 2018: | https://news.ycombinator.com/item?id=17849546 | hobofan wrote: | I don't think GitHub projects are generally given a year tag, | as they evolve over time. ___________________________________________________________________ (page generated 2022-03-13 23:01 UTC)