[HN Gopher] How to build a IP geolocation database from scratch? ___________________________________________________________________ How to build a IP geolocation database from scratch? Author : incolumitas Score : 343 points Date : 2023-09-14 11:00 UTC (11 hours ago) (HTM) web link (ipapi.is) (TXT) w3m dump (ipapi.is) | fasteo wrote: | >>> Consider Open Source Geolocation Projects | | Not the definition of "from scratch" in my book | dboreham wrote: | Interesting but this isn't actually how geolocation is done, | right? The ARIN/RIPE data isn't sufficiently accurate to be | useful beyond country. Commercial geolocation involves | correlating client IP vs known physical location e.g. from WiFi | AP or mailing a package to the user. At least that's what I have | been told over the decades. | shortrounddev2 wrote: | I work in adtech and this is how we do geolocation. There's | also device geolocation but if the user doesn't consent to | sharing their GPS data with us, we just use IP address for | targeting. Common provider for this is Maxmind; they ship a | database that you host locally and query | oh_come_on wrote: | [dead] | tiffanyh wrote: | Does Cloudflare have the same data as Maxmind? | | Because Cloudflare and Maxmind geolocate me to the exact same | longitude/latitude. | klaussilveira wrote: | CloudFlare uses Maxmind: https://developers.cloudflare.com/ | support/network/configurin... | dawnerd wrote: | Even the free maxmind db is accurate enough for most | applications. | klaussilveira wrote: | Since you are in adtech: do you buy MaxMind, or roll your | own? Are there any providers for US-only data, and therefore, | cheaper? | shortrounddev2 wrote: | We licensed Maxmind's DB recently (it's like $300 a year or | something). idk if there are US-only databases. Our | customers are all in the US, and we use geo IP to filter | european users for compliance (GDPR and otherwise) | ChopSticksPlz wrote: | This is a very useful .csv, what is the license? 
Is it free for | personal and commercial use? | jl6 wrote: | Is anybody maintaining a historical archive of "IP address | metadata" (which would include geolocation)? | | If I have logs from 10 years ago, can I look up information about | that IP as it was at the time? | sneak wrote: | I feel like a more useful and accurate way would be to buy client | ip and GPS location data in bulk from one of the mobile data | brokers who have their spyware embedded in zillions of popular | apps/games and then group it by /24 or something. | johnklos wrote: | I think it's interesting that the one IP range I decided to check | has correct information on the ipapi.is web site, but | unambiguously incorrect information in the downloadable | geolocationDatabaseIPv4.csv. Somehow Bedford, New Hampshire | (which came straight from WHOIS) became Bedford, Texas. | | How'd that happen? | alberth wrote: | What are common use cases for needing IP geolocation? | kiririn wrote: | A modern version of the ping-based geoip mentioned | | https://github.com/Ne00n/yammdb | JoshGlazebrook wrote: | This just links to a mmdb file that is already compiled, there | isn't anything relevant to show this is a "modern" | implementation of anything if the implementation isn't | available. | mootothemax wrote: | Any suggestions for geolocating datacenter IPs, even very | roughly? I'm analysing traceroute data, and while I have known | start and end locations, it's the bit in the middle I'm | interested in. | | I can infer certain details from airport codes in node hostnames, | for example. | | It would also be possible - I guess - to infer locations based on | average RTT times, presuming a given node's not having a bad day. | | Anyone have any other ideas? | | Edit: A couple of troublesome example IPs are 193.142.125.129, | 129.250.6.113, and 129.250.3.250. They come up in a UK traceroute | - and I believe they're in London - but geolocate all over the | world. 
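The RTT idea in the comment above can be made concrete with a speed-of-light bound. What follows is a minimal sketch, not anything posted in the thread: it assumes light in fibre covers roughly 200 km per millisecond (about two thirds of c) and that the path is symmetric. The constant and the names `maxDistanceKm` and `tightestBound` are illustrative assumptions. Real routes are longer than the great-circle distance and routers add delay, so the result is only an upper bound on how far away the target can be.

```javascript
// Assumption: light in optical fibre covers roughly 200 km per millisecond.
const FIBRE_KM_PER_MS = 200;

// Upper bound on the probe-to-target distance for a given round-trip time.
// The signal travels out and back, so only half the RTT counts one way.
function maxDistanceKm(rttMs) {
  if (!Number.isFinite(rttMs) || rttMs < 0) {
    throw new RangeError("rttMs must be a non-negative number");
  }
  return (rttMs / 2) * FIBRE_KM_PER_MS;
}

// Among measurements taken from probes at known places, the lowest RTT
// yields the smallest radius, i.e. the tightest bound on the target.
function tightestBound(measurements) {
  let best = null;
  for (const m of measurements) {
    const radiusKm = maxDistanceKm(m.rttMs);
    if (best === null || radiusKm < best.radiusKm) {
      best = { probe: m.probe, radiusKm };
    }
  }
  return best;
}
```

A single-digit-millisecond ping from a London host therefore puts the target within a few hundred kilometres of London at most, which is the reasoning used in the reply below.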
| toast0 wrote: | Those IPs are owned by Google and NTT, who both run large | international networks and can redeploy their IPs around the | world when they feel like it. So lookup based geolocation is | going to be iffy, as you've seen. | | Traceroute to those IPs certainly looks like the networking | goes to London. | | The google IP doesn't respond to ping, but the NTT/Verio ones | do. I'd bet if you ping from London based hosting, you'll get | single digit ms ping responses, which sets an upper bound on | the distance from London. Ping from other hosting in the | country and across the channel, and you can confirm the lowest | ping you can get is from London hosting, and there you go. It | could also be that its connectivity is through London, but it's | elsewhere --- you can't really tell. | | Check from other vantage points, just to make sure it's not | anycast; if you ping 8.8.8.8 from most networks around the | world, you'll get something nearby; but these IPs give | traceroutes to london from the Seattle area, so probably not | anycast (at least at the moment, things can change). | | If you don't have hosting around the world, search for public | looking glasses at well connected network that you can use for | pings like this from time to time. | dontdoxxme wrote: | https://ensa.fi/papers/geolocation_imc17.pdf has some ideas. | | Using RIPE atlas probes to get RTT to the IPs from known | locations is close to your idea and probably the best anyway. | tyingq wrote: | This looked promising: | | _" TULIP's purpose is to geolocate a specified target host | (identified by IP name or address) using ping RTT delay | measurements to the target from reference landmark hosts whose | positions are well known (see map or table)."_ | | https://tulip.slac.stanford.edu/ | | But the endpoint it posts to seems dead. | vinay_ys wrote: | > A couple of troublesome example IPs are 193.142.125.129, | 129.250.6.113, and 129.250.3.250. 
They come up in a UK | traceroute - and I believe they're in London - but geolocate | all over the world. | | If I'm running a popular app/web service, I would have my own | AS number, would have purchased a few blocks of IP | addresses under this AS, and would then advertise these | addresses from multiple owned/rented datacenters around the | world. | | These BGP advertisements would be to my different upstream | Internet service providers (ISPs) in different locations. | | For a given advertisement from a particular location, if you | see a regional ISP as upstream, you can make an educated guess | that this particular datacenter is in that region. If these are | Tier 1 ISPs who provide direct connectivity around the world, | then even that guess is not possible. | | You can see the BGP relationships in a looking glass tool like | bgp.tools - | https://bgp.tools/prefix/193.142.125.0/24#connectivity | | If you have the ability to run traceroutes from multiple probes | sprinkled across the globe with known locations, then you could | triangulate by looking at the fixed IPs of the intermediate | router interfaces. | | Even this is defeated if I were to use a CDN like Cloudflare | to advertise my IP blocks to their 200+ PoPs and ride their | private networks across the globe to my datacenters. | mannyv wrote: | [dead] | bullen wrote: | Here is a solution for those that care about speed: | | https://www.miyuru.lk/geoiplegacy | hddqsb wrote: | Somewhat relevant: Google Maps can learn the location of your IP | based on which locations you browse in the map. If you browse a | specific location enough times, it will use that as the default | location when you open Google Maps, even if you clear all | cookies. (I discovered this just from using Google Maps, and I'm | a little concerned by the privacy implications, considering that | multiple people may share an IP address.) | gniv wrote: | I suspect it's the other way around. 
Google just has a very | good IP geolocation db, so it uses that when you browse, absent | any other info. | hddqsb wrote: | Google certainly uses its geolocation DB, but it _also_ | learns based on map browsing patterns. | | To clarify, the scenario I described is as follows: 1. | Initially, when I open Google Maps in a clean browser it | defaults to my real location. 2. I repeatedly browse some | other location. 3. When I open Google Maps in a clean | browser, it defaults to that other location. The _only_ | reason for Google Maps to pick that other location is my map | browsing. | gniv wrote: | Thanks for clarifying. That is indeed surprising and you | are probably right. | netsharc wrote: | Well it has reporting beacons all over the world with GPS | receivers, in the form of Android phones, and perhaps Google | Maps users on iPhone too.. | is_true wrote: | That would explain why it sometimes it thinks I'm in a river I | paddle often and other times where I have my summer house. | overcast wrote: | Step 1: Download Geolocation Database | Aachen wrote: | Scroll down, the article is confusingly below that | nonethewiser wrote: | Step 1: Download Geolocation Data | | Unless you think CSV is a database? | debesyla wrote: | Maybe a dumb question (I have no knowledge), but why wouldn't | we think of .CSV files as databases? It can have columns and | rows filled with information and isn't that what makes a | thing a database? | nobleach wrote: | Best I can guess here, the reply is considering relational | databases as "real databases" and flat files.... not real. | nobleach wrote: | Are we really going to do the mincing of words here? Did you | need the word "dump" or "export" before you understood? | Although I wasn't wild about the original poster's "step 1" | terseness, it's silly to think a normal person wouldn't be | able to parse the sentence well enough to understand | "download the database contents - perhaps stored in CSV | format". 
| tmpX7dMeXU wrote: | If in your mind database implies a type of technology and not | something conceptual, you're really just outing yourself as | someone that needs someone between you and the boardroom. | Certainly not something to show off on Hacker News. | n2dasun wrote: | Step 1. Download Visual Basic | nanmu42 wrote: | Thanks for sharing. | | I have heard there is much effort to use BGP data to build GeoIP | database. | bjornsing wrote: | I expected traceroute to play a bigger part in this. If you know | the route to an IP address and the location of routers, perhaps | even from a few different servers, then you should be able to | locate it fairly well. | T3RMINATED wrote: | [dead] | TZubiri wrote: | "how to scrape an ip geolocation database" | | You know you can just run a whois query per ip you want to | analyze, no point in scraping the whole ipvN space. | incolumitas wrote: | I have to scrape the whole IP address space since I offer | location information as part of my API. | | Also I only need to scrape as many WHOIS records as there are | different networks out there. So for example for the IPv4 | address space, there are much less networks as there are IPv4 | addresses (2^32). | | Also, most RIR's provide their WHOIS databases for download. | | Therefore, "scraping" is not really the correct word, it's an | hybrid approach, but mostly based on publicly available data | from the five RIR's. | notlukesky wrote: | What was the easiest and the most frustrating part? | djbusby wrote: | The whois data for IP is not accurate. | gsich wrote: | whois has no sane format. | louison11 wrote: | If you don't want to do this yourself, you can actually just get | Cloudflare to do it for you for free using a simple Worker since | all Cloudflare requests contain approximate IP location | information. 
| | You can also just send a request to my URL (Cloudflare Worker | operated - so it should have global low latency): | https://www.edenmaps.net/iplocation | | Use it for small applications, I don't mind. Just don't start | sending me 10M requests per day ;-) | oh_come_on wrote: | [dead] | tiffanyh wrote: | This is excellent! | | Would you mind open sourcing the code for that? | louison11 wrote: | This is the code running this endpoint: | export function onRequest(context) { return new Response(JSON.stringify([parseFloat(context.request.cf.longitude), parseFloat(context.request.cf.latitude)]), {headers: {"Content-Type": "application/json;charset=UTF-8"}}) } | | This is a function on Cloudflare Pages (which is just a | different name for Cloudflare Workers). Minor adjustment | needed for Workers (get rid of "context", I believe) | emadda wrote: | Does anyone know how accurate Cloudflare geolocation is (for | workers requests)? | reincoder wrote: | I work for IPinfo and we do ping based geolocation. The best | thing you can do to verify geolocation accuracy is the | following: | | - Download a few free IP databases | - Generate a random list of IP addresses | - Do the IP address lookups across all those databases | - Identify the IP addresses that can be pinged | - Visit a site that can ping an IP address from multiple servers | - Sort the results by lowest avg ping time | | Then check where the geolocation provider is locating the IP | address and what is the nearest server from there. | banana_giraffe wrote: | As accurate as MaxMind[1], since that's what they use [2]. In | my experience, it's reasonably accurate for the US, less so | for other countries. MaxMind publishes some accuracy data | which might be an interesting starting point [3] | | That said, for any analytics use cases of this data, be aware | that MaxMind will group a lot of what should be unknowns in | the middle of a country. 
Or, in the case of the US now, I think | they all end up in the middle of some lake, since some farm | owners in Butler County, Kansas got tired of cops showing up | and sued MaxMind. It can cause odd artifacts unless you | filter the addresses out somehow. | | 1 https://developers.cloudflare.com/support/network/configurin... | | 2 https://www.maxmind.com/en/geoip-demo | | 3 https://www.maxmind.com/en/geoip2-city-accuracy-comparison | matwood wrote: | Yeah, MaxMind is the best I have used, with caveats. You | need to update it frequently, and you need to allow for | overrides. | [deleted] | carstenhag wrote: | I'm in Munich. Cloudflare reports a position that is 730km to the | north in a random forest. | Aachen wrote: | Or you download an IP database rather than sharing with a third | party which IP address is likely connecting to your service | hotgeart wrote: | Located 100km from the Somali coast... I'm in Brussels, | Belgium, thx for protecting my privacy :D | louison11 wrote: | The result is [lon, lat]. You've most likely copied it onto | Google Maps, which works with [lat, lon]. Believe it or not, | the industry still hasn't come up with a standard order. | cstuder wrote: | Question: What's the motivation to put coordinates in one's own | WHOIS record? (geoloc/geofeed) | incolumitas wrote: | Many service providers actually want their clients to be able | to locate them. | dontdoxxme wrote: | geofeed is used by big CDNs; it can actually save the provider | money because the CDN can pick a more optimal network | location. | nonethewiser wrote: | Comments seem fairly dismissive but I actually found this really | interesting. It reminds me of a task I had in my first position | to add PostGIS to our database and a location based search. That | was based off addresses and zipcodes. | mannyv wrote: | That's relatively simple to do, even in MySQL. One trick is to | use a square instead of a circle, which avoids a lot of math. 
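The square-instead-of-circle trick mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the commenter's code: it treats the Earth as a sphere with about 111.32 km per degree of latitude, and the names `boundingBox` and `inBox` are made up for the example. The box is a cheap prefilter that plain range comparisons (and therefore ordinary indexes on latitude and longitude columns) can answer; an exact distance check can then run on the few rows that survive.

```javascript
// Assumption: spherical Earth, roughly 111.32 km per degree of latitude.
const KM_PER_DEG_LAT = 111.32;

// Square (in km) around a centre point, expressed as lat/lon ranges.
// Caveat: this simple version does not handle the poles or boxes that
// cross the antimeridian (longitude +/-180).
function boundingBox(lat, lon, radiusKm) {
  const dLat = radiusKm / KM_PER_DEG_LAT;
  // A degree of longitude shrinks with the cosine of the latitude.
  const dLon = radiusKm / (KM_PER_DEG_LAT * Math.cos((lat * Math.PI) / 180));
  return {
    minLat: lat - dLat,
    maxLat: lat + dLat,
    minLon: lon - dLon,
    maxLon: lon + dLon,
  };
}

// Four comparisons, no trigonometry per candidate row.
function inBox(box, lat, lon) {
  return (
    lat >= box.minLat &&
    lat <= box.maxLat &&
    lon >= box.minLon &&
    lon <= box.maxLon
  );
}
```

The same four bounds map directly onto a `WHERE lat BETWEEN ... AND lon BETWEEN ...` clause. Points in the corners of the square are slightly farther than the radius, which is the accuracy given up in exchange for the simpler query.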
| junto wrote: | As someone that lives in a country where the national language is | not my first language, I hate websites that use IP location to | make assumptions about my choice of language and it being forced | on me based on a lazy assumption, when my browser is sending | language headers quite clearly, and they are ignored. | jwie wrote: | The easiest way to get a geolocation is to ask the user. Maybe | they'll just tell you, and if that's good enough for your | application there's no need for such solutions. | jedberg wrote: | It all depends on what you want to use it for and how accurate it | needs to be. | | The best way to build a geolocation service is to have a billion | devices that report their location to you at the same time they | report their IP to you. That's basically Apple and Google. They | have by far the best geolocation databases in the world, because | they get constant updates of IP and location. | | The trick is basically to make an app where people willingly give | you their location, and then get a lot of people to use it. | That's the best way to build an accurate geo-location database, | and why every app in the world now asks for your location. | | 4-square had the right idea, they were just ahead of their time. | flounder3 wrote: | Even 10 years ago, Apple internal privacy policies prevented | itself from collecting precise lat/long. We had to use HTTP | session telemetry to determine which endpoints were best for a | given IP (or subnet, but not ASN), which informed our own | pseudo-geoIP database so we knew which endpoint to connect to | based on real world conditions. | | Even still, it had to be as ephemeral as possible for the sake | of privacy. We weren't allowed to use or record results from | Apple Maps' reverse geo service outside of the context of a | live user request (finding nearby restaurants, etc). | jedberg wrote: | You don't need precise lat/lon to make a good database. 
Even | a 1km circle would be more than enough. | | > but not ASN | | Why wasn't ASN allowed? That's what Netflix used to make | endpoint routing decisions and worked really well. | flounder3 wrote: | You're not wrong, but privacy concerns were paramount. | | ASNs were allowed but too vague. We needed more | granularity. Corporate proxies, subdelegations, many | providers aggregating announcements below /24, etc. | [deleted] | [deleted] | bagels wrote: | Surely someone is using online shopping shipping addresses for | this? | SirMaster wrote: | These IP geolocation lookups never seen to work for me. | | They are always multiple states off, and checking multiple | different services pretty much never even seem to agree. | reincoder wrote: | First, I am big fan of your articles even before I joined IPinfo, | where we provide IP geolocation data service. | | Our geolocation methodology expands on the methodology you | described. We utilize some of the publicly available datasets | that you are using. However, the core geolocation data comes from | our ping-based operation. | | We ping an IP address from multiple servers across the world and | identify the location of the IP address through a process called | multilateration. Pinging an IP address from one server gives us | one dimension of location information meaning that based on | certain parameters the IP address could be in any place within a | certain radius on the globe. Then as we ping that IP from our | other servers, the location information becomes more precise. | After enough pings, we have a very precise IP location | information that almost reaches zip code level precision with a | high degree of accuracy. Currently, we have more than 600 probe | servers across the world and it is expanding. | | The publicly available information that you are referring to is | sometimes not very reliable in providing IP location data as: | | - They are often stale and not frequently updated. 
| | - They are not precise enough to be generally useful. | | - They provide location context at a large IP range level or | even at organization level scale. | | And last but not least, there is no verification process with | these public datasets. With IPv4 trade and VPN services being | more and more popular we have seen evidence that in some | instances inaccurate information is being injected in these | datasets. We are happy and grateful to anyone who submits IP | location corrections to us but we do verify these correction | submissions for that reason. | | From my experience with our probe network, I can definitely say | that it is far easier and cheaper to buy a server in New York | than in any country in the middle of Africa. Location of an IP | address greatly influences the value it can provide. | | We have a free IP to Country ASN database that you can use in | your project if you like. | | https://ipinfo.io/developers/ip-to-country-asn-database | caribdude wrote: | [dead] | Daviey wrote: | Would you consider no-signup inspection of the data you hold on | the requester's IP address? I would love to see what you have on | MY IP address, and if it is sufficiently accurate it feels that it | would be a good incentive to sign up to use commercially. | | It feels like it couldn't be abused by 'freeloaders', because | I'd guess their use-case is viewing other people's. | reincoder wrote: | We have a very open approach to our data. In fact, our | website is extremely accessible. It is quite useful for | researching IP addresses and does not require signing up. The | data is largely available to view on the website. Although we | display all IP address metadata on the home page, if you | intend to use our website frequently, I recommend utilizing | the IP data pages. 
| | You can enter IP addresses on the right side to look up | information here: https://ipinfo.io/what-is-my-ip | | Additionally, we offer some enjoyable tools that you can use | here: https://ipinfo.io/tools | | The CLI tool is particularly entertaining. | | You can also use our API service without signing up, with a | limit of 1000 requests per day. | | If you do choose to sign up for a free account, you will | receive 50,000 requests per month, free IP databases, a bulk | lookup feature, and more. | kam wrote: | This is literally the most prominent thing on the | https://ipinfo.io home page. | qingcharles wrote: | Huh, that's cool. It got my home IP about 15 miles from | where I am, but still not bad. | | Wait - how does this work for cell IPs? A lot of cellphone | v4 IPs are now shared between hundreds or thousands of | devices, right? | reincoder wrote: | I work there, and I am supposed to know these things, but | I don't exactly :/ | | It probably has something to do with important routers. | What tags do we show when you visit the IP data page? The | IP data page can be accessed by visiting | ipinfo.io/<IP_address>. | | We use the generic term "data experts," but it actually | consists of about 2 dozen engineers, including data | engineers, data scientists, infrastructure engineers, | backend engineers, and a great technical CEO working on | all that. All those folks have gone on a boating trip off | the coast of Spain for a retreat.....except for me. | | I will ask them and try to circle back with some answers. | Daviey wrote: | That's embarrassing for me... I thought that was a static | image of an example. And I did look through the site | looking for a search. Oops. | theogravity wrote: | How does that work with edge servers that use anycast to assume | the same IP across different regions? | SnorkelTan wrote: | Aren't any cast addresses a specific subset of ips and thus | knowable? Iirc, each autonomous system is allocated anycast | ip space? 
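On the anycast question above: one hedged way to at least flag a likely anycast address from ping data is a speed-of-light consistency check. If two probes at known locations both see RTTs so low that no single point on Earth could be that close to both of them, the address is probably served from more than one place. The sketch below assumes roughly 100 km of reach per millisecond of RTT (fibre at about 200 km/ms, halved for the round trip). The probe objects and the names `haversineKm` and `looksAnycast` are hypothetical, and this is not a description of any provider's actual method.

```javascript
const EARTH_RADIUS_KM = 6371;
// Assumption: ~200 km/ms in fibre, halved because RTT covers both directions.
const KM_PER_MS_RTT = 100;

// Great-circle distance between two { lat, lon } points in kilometres.
function haversineKm(a, b) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLon = toRad(b.lon - a.lon);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(h));
}

// Each probe is { lat, lon, rttMs }. If any two probes are farther apart
// than the sum of their maximum reaches, their distance discs cannot
// overlap, so one physical host cannot explain both measurements.
function looksAnycast(probes) {
  for (let i = 0; i < probes.length; i++) {
    for (let j = i + 1; j < probes.length; j++) {
      const reach = (probes[i].rttMs + probes[j].rttMs) * KM_PER_MS_RTT;
      if (haversineKm(probes[i], probes[j]) > reach) return true;
    }
  }
  return false;
}
```

A `false` result does not prove the address is unicast; it only means these particular measurements are consistent with a single location.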
| TheClassic wrote: | Your comment is extremely interesting and what I was hoping to | learn from the article (without an existing source of | information, how do we determine the location of an IP | address). Thank you! | reincoder wrote: | I really appreciate. Thank you. We are very transparent about | our process. If you have any questions, you can always reach | out to us. | | We have a simplified explanation of our probe network here: | https://ipinfo.io/blog/probe-network-how-we-make-sure-our- | da... | | The only update is the number of servers is like 600+ now. | The probe network is growing extremely rapidly. | | Our IP geolocation process is quite complicated, and we have | a team of data engineers, infrastructure engineers, and data | scientists working on various aspects of it. Therefore, our | approach is users can ask us questions, and we will try our | best to answer them. | freedomben wrote: | Just wanted to let you know, it's this transparency that | turned me into a customer! | | I love your company and service, but I hate your pricing. I | work with a lot of small clients/apps that paying for usage | would be a no-brainer, but the defined monthly price | buckets don't make any economical sense at their scale. If | you added a "pay as you go" tier that a small app could | reasonably start by using dollars worth of API calls per | month and grow from there, I'd be spreading your seed all | over the place. I'm not saying this to rag on you, just | trying to provide some constructive feedback as a thank you | for your info sharing! | reincoder wrote: | Thank you very much; I really appreciate your feedback. | This is not the first time I have heard this. The | solution is to try to take as much advantage as you can | from the free tier. 
| | # Check out the free IP databases | | https://ipinfo.io/products/free-ip-database | | The free databases come with commercial usage permission, | and because they are databases, you can make unlimited | lookups from them. The databases provide full accuracy | and are updated daily. They are just a subset of our IP | geolocation database that only provides IP to Country | information. | | # Complement the database with the API service | | When you need city-level information, switch to the | API service. Use the database to look up IP-to-country | information as many times as you want. However, use the | API service only when necessary. | | Additionally, if you include a credit link to us, we will | double your API limit to 100k/month. Visit | https://ipinfo.io/contact/creditlink. | | # Cache data | | All of our API libraries have native caching support. We | strongly recommend that users reduce their number of | requests by caching the response. I highly recommend you | check out our libraries: https://github.com/ipinfo | | --- | | The only challenge with the free IP databases is that you | need to host the database somewhere to look up the IP to | Country information. Having an API service with nearly | unlimited lookups for IP to Country information would be | fantastic. | | If you know someone who has an IP-to-Country API | service, please let me know. We only require an | attribution for using our database. If you have a similar | service that is popular but don't want to maintain it, let | us know as well; we can take over the site and host it | ourselves with the IP to Country data. | freedomben wrote: | Thank you, that's super useful info. I didn't realize you | had an Erlang library! I'm definitely going to be putting | that to use :-) | sambazi wrote: | [flagged] | detourdog wrote: | I just noticed that my wife's iPhone uses the same mycingular IP | address while driving across 3 states over 5 hours while | checking mail. 
| inemesitaffia wrote: | There are several options/techniques for doing it. But just | imagine you have a permanent zero overhead VPN. | | I don't know if that provider terminates long running calls, | but the calls would stay up too regardless of tower. | detourdog wrote: | Yes, I'm sure it is iOS anti-tracking and directly related | to why firewall apps inside SIP may not know what is going | on. | Vendan wrote: | More likely to be just standard Mobile IP | https://en.wikipedia.org/wiki/Mobile_IP. Fairly standard | stuff, can cause some false positives around traveling | (I've seen people get freaked out about stuff like "This | person just logged in from their home state and then less | than an hour later logged in from France!" when it was | just mobile IP treating their phone as still in the US | while they were in France on a trip, but their laptop | connected over normal internet was seen as coming from | France) | detourdog wrote: | this was a consistent IP address, nothing to do with | location, and nobody was freaked out. | matsur wrote: | ICMP response time is not useful for "locating" an anycasted | address, some of which have a logical location associated with | them. See https://blog.cloudflare.com/icloud-private-relay/ for | an example | cuu508 wrote: | Well, at least you can detect it is an anycast address, and | mark it as such. | EwanToo wrote: | Have you considered making your database available for download | in Parquet format so people could just copy the file to S3, | Google Cloud, etc, and query it immediately with various tools? | | I know it can be done with CSV but it's not as smooth. | chaps wrote: | Not gonna lie, this creeps the heck out of me. | fragmede wrote: | Your IP address is LEAKING! | reincoder wrote: | Thousands of people live in a zip code, while hundreds of | thousands of people live in a city. We are literally giving | away that data for free through our API and database. The | creepiness of IP geolocation is mostly a meme. 
| | IP geolocation is mainly used in cybersecurity and marketing | analytics. There are many ways to geolocate someone. I once | came across a project that could estimate the country a user | is from based on their writing style and grammar mistakes. | For example, American people sometimes use "should of" | instead of "should have". Knowing the geolocation of an IP | address isn't super creepy. It's just how things work on the | internet. | chaps wrote: | And you're literally advertising this project as being | helpful for targeted ads. So it's pretty clear from the get | go that what you consider creepy isn't what I consider | creepy. And having done enough reidentification work to | scare myself, "thousands of people" might as well be a | couple dozen or less. I get why you're defensive and why | you think it's not creepy, but calling it a "meme" is | insultingingly dismissive. | | Just because it's "how things work on the internet" doesn't | make its mass collection right. Under the same logic, any | side channel attack is just "how it works", and its abuse | warrants no ethical question. | reincoder wrote: | I grok and understand your concern. I am not being | defensive; I am just trying to provide an explanation. I | really enjoy having conversations like this with | developers as honestly and empathetically possible. | | I apologize if I was rude in any way by saying the word | "meme". I saw a sister comment and thought you were being | sarcastic. There is a popular meme about "I have your IP | address", so I thought you were referencing that. I have | had conversations with many young people who were | concerned about their IP address being leaked through a | game server. Therefore, I try to use humor to alleviate | their stress. However, I now realize that this situation | was different, and I am sorry for not understanding that. 
| | We provide a service that helps users keep their | internet-connected services secure by providing IP | metadata information. Are you being attacked by malicious | actors? Use our free IP database to identify the location | and ASN to block them. Do you want to restrict access to | your service to certain regions? Do that for free with | our services. | | We have the most accurate data available, and yet we | offer the most generous free tier. We provide a full | accuracy IP database for free, without any range | aggregation, and with daily updates and a commercially | permissible license. We have built a community forum | solely dedicated to answering users' questions. We invest | in website tools and open-source tools, all with the goal | of helping users maintain the security and functionality | of their services. | | We do have premium tier services, but if you use our free | data as a foundation, you can always replicate those | premium features to a reliable degree. | | Our IP metadata information is being used in marketing | and sales intelligence. It is the same data that you use | to protect your internet connected devices, used by our | customers to sell you something. | | IP metadata information that we provide is a cornerstone | of keeping the internet safe and accessible for everyone. | That is how things just are. The deepweb is immune to IP | meta data information, and that is why it is such a messy | and chaotic place. | | That is just truth of the internet. We are essential and | we prefer to be open about our process and listen to our | stakeholders (users + customers + non-users). | chaps wrote: | Thank you for the well thought out response. I disagree | with just about everything you say, but I understand | where you're coming from and I appreciate the validation | that the use of a VPN is more important than it's ever | been. 
As a professional courtesy: calling yourself | "essential" is an enormous red flag and you might want to | consider different phrasing. | reincoder wrote: | I should have used a different phrasing. :) I was reading | an article about essential workers today, and that word | popped up in my head when I wrote the comment. | | It's good that you are using a VPN. I advocate for the | usage of VPNs, and many VPN companies actually use our | data to verify their server locations. In the VPN | industry, VPN companies get their VPN servers from | specialized hosting services that cater to dozens of VPN | companies. You can check out the ASNs of the VPN IP | addresses to find them. | | - https://ipinfo.io/AS136787 | | - https://ipinfo.io/AS16247 | | VPN companies use our IP geolocation data to confirm the | actual location of their servers. Let me tell you a fun | story. One VPN company claimed to have a server in the | Bahamas, but upon investigation, we discovered that the | server was actually located in New York. It was a | surprising find. Getting a server in the Bahamas is more | challenging than getting one in NY. Just imagine users | thinking their internet activity is immune to US | jurisdiction because they are using a VPN service based | in the Bahamas when in fact the server is located in NY. So, | we might not be essential, but we are certainly very | useful! | | Thank you for the great conversation, dude. Appreciate | it. | wpietri wrote: | For sure. When people work in any industry long enough, | it's easy to stop thinking about the basics. E.g., a | retail butcher thinks of his work very differently than a | cow or a vegan does. | | When people work in advertising, they mostly forget that | the core of their business is for-profit manipulation of | people with little or no regard for truth or the people | concerned.
But I personally think that's kinda creepy, | and only getting more so as it goes from broad | manipulation of millions via mass media down to | thousands, hundreds, or single individuals. | goodpoint wrote: | Together with the tons of data leaked by browsers it makes it | very easy to track people across places and devices. | giantrobot wrote: | You might want to unplug your router then. A conceit of being | connected to a network is you're connected to the network. If | you can see other nodes they can see you. | welder wrote: | Great comment. I'm a big fan and customer of IPinfo, using your | API in our login notification emails to say "You just logged in | from Berlin, Germany. If this wasn't you click here." To | provide country data for customers in their audit logs. And for | anti-spam and fraud detection. | chankstein38 wrote: | That's pretty neat! You're basically using ping triangulation! | sib wrote: | Trilateration (same technique as used for mobile network | location - in addition to the GPS on the phone) | incolumitas wrote: | Big fan of what articles? On https://incolumitas.com/ or on | https://ipapi.is/? | | Great idea with latency triangulation, I used latency | information for a lot of things, especially VPN and Proxy | detection. | | But I didn't expect you could obtain such an accurate location. I am | honestly impressed. But latency triangulation with 600 servers | gives a very good approximation. Nice man! | | Some questions: | | - ICMP traffic is penalised/degraded by some ISPs. How do you | deal with that? | | - In order to geolocate every IPv4 address, you need to | constantly ping billions of IPv4 addresses, how do you do that? You | only ping an arbitrary IP of each allocated inetnum/NetRange? | | - Most IP addresses do not respond to ICMP packets. Only some | servers do. How do you deal with that? Do you find the router | in front of the target IP and you geolocate the closest router | to the target IP (traceroute)?
| carlhjerpe wrote: | You can guess pretty well how IPs are related by BGP | announcements, so as long as a few per block and if small, | ASN. You can use that logic. | withinboredom wrote: | I'm very curious why you'd do VPN/proxy detection... | | But at a previous company I worked at that ran a very large | chunk of the internet, we did indexing of nearly the entire | internet (even large portions of the dark web) approximately | every two weeks. There were about 500 servers doing that non- | stop. So, I think it is relatively reasonable if you have 600 | servers to do that. | meroje wrote: | In the business of media streaming, rightsholders will | require that you check for VPNs and proxies in addition to | countries when deciding if a given viewer will be able to | stream a given media. | withinboredom wrote: | Does that actually work? That could explain an issue with | a particular streaming service I use. There are currently | some ongoing routing issues in BGP land and my ISP. When | trying to stream, it says I'm using a proxy, so due to | the incredible route my packets are taking, that might be | it. What's funny is that the only way to watch this | service is to use a vpn right now. | vGPU wrote: | They probably just keep a list of known VPN server IPs. | sitzkrieg wrote: | of course it doesn't work but they gotta try clutching | pearls and applying whatever pressure they can think of | on these fronts | wpietri wrote: | Why is this getting downvoted? It seems to me that a lot | of the media-focused anti-piracy tooling is essentially a | performance of toughness to make rightsholder execs | comfortable. Everybody accepts you can't stop piracy | entirely, and nobody's willing to say, "Fuck it, we'll | compete on convenience and strong consumer | relationships," so we all put up with this weird middle | ground of performative DRM and the like.
With only the | rare occasional bit of honesty, as from Weird Al: | https://sfba.social/@williampietri/110906012997848549 | at_a_remove wrote: | This is correct. Imagine in the days of yore, some two | decades and change ago, when I was charged with | implementing putting some music reserves "online" for | streaming ... | | [Harp music, progressive diagonal wave distortions | through the viewport ...] | | We had _two_ layers of passwords (one to get to the | webpage for the class, one when actually streaming via | the client, which was RealPlayer) as well as an IP range | restriction to campus (you live off campus? So sorry) | because our lawyers were worried about what the RIAA's | lawyers would find sufficient in the wake of a bunch of | Napster-baited lawsuits launched at universities. The | material itself was largely limited to snippets. | | I wanted to say, "Calm down, have a martini or something. | College students are just not going to go wild to | download 128 kbps segments of old classical music," but | alas I was not in charge. | reincoder wrote: | https://incolumitas.com/ | | This is my all-time favorite article: | https://incolumitas.com/2021/11/03/so-you-want-to-scrape- | lik... | | I used to do freelance web scraping, and that article felt | like some kind of forbidden knowledge. After reading the | article, I went down the rabbit hole and actually found a | Discord server that provided carrier-grade traffic relay from | a van which contained dozens of phones. | | For the questions..... we have to kinda wait a bit, someone | from our engineering team might come here and reply. | | By the way, as I have you here, have you considered converting | the CSV files to MMDB format? I was planning to do that with | our mmdbctl tool later today. | | https://github.com/ipinfo/mmdbctl | sambazi wrote: | > I used to do freelance web scraping | | "don't sell warez" | voltagex_ wrote: | Can your probes be identified and blocked?
| kube-system wrote: | iptables -A INPUT -p icmp -j DROP | chaps wrote: | This isn't helpful. The comment was specifically asking | about the probes, not ICMP traffic. | kube-system wrote: | Anybody can do this same thing, if you're worried about | this, you probably don't want inbound ICMP. | chaps wrote: | Cool. Thanks. But let's say I do. | kube-system wrote: | Then there's nothing you can do. If you respond to pings, | then others can take note of the responses you send. | chaps wrote: | You're missing the point that the question is effectively | asking for a list of hosts that they can block. | | Edit: they provided a method: | https://news.ycombinator.com/item?id=37510063 | kube-system wrote: | I understand that was the initial question. I am saying | that is a fool's errand. Anyone with a few VPSes, a | calculator, and a map can do this. It isn't just | ipinfo.io doing this. There are a lot of IP geolocation | services. | j16sdiz wrote: | This breaks PMTU and is the source of many mystery download | stalls | eptyc1 wrote: | Indeed. Openwrt for some reason defaults to reply to pings. | I see the value of ICMP for servers, but I don't see the | value for home ISP routers. | | I disabled ICMP reply on my home router. | sambazi wrote: | > Openwrt for some reason defaults to reply to pings. | | it's a bit like greeting-back ppl on the street. | | not doing it will not make you invisible. it will break | somebody's assumption of decency, but most ppl don't care | either way. | voltagex_ wrote: | http://shouldiblockicmp.com/ | | (But the guy running the probes is making a good counter | argument) | reincoder wrote: | It is just ping data. We ping an IP address, get the RTT, | draw a radius on the globe, and say that the IP could be | anywhere inside that radius. Then we do another ping and draw | another radius, and at the cross-section of the two radii | could be your IP address.
Now, if we do it enough times, we | can get an estimate of where the IP address is located. | | The data is not derived from the IP address itself, but | rather from the process itself. And it's just a ping. | Moreover, the majority of the IP addresses are not pingable. | So, we rely on other in house statistical and scientific | models to estimate the location. The probe infrastructure is | extremely complicated and there are billions and billions of | IP addresses, which is why we do not have a robust range | filter mechanism. | | You can implement a dynamic ping blocking mechanism or use | our data to find hosting ASNs and block ranges of those ASNs. | You can download the database for free: | https://ipinfo.io/developers/ip-to-country-asn-database | spacedcowboy wrote: | So, at the risk of outing myself, I wrote http://www.hostip.info | a long time ago* which used a community approach to get ip | address location ("is this guess wrong ? Fix it please"). | | The last time I checked (maybe a decade ago [grin]) it worked | pretty much perfectly for a country, imperfectly for a region, | and better-than-a-coin-toss for city resolution. All the data is | free. | | I don't think they have it on the site any more, but I used to | have a rotating 3D-cube thing (x,y,z were the first 3 octets of | the address) for things like known-addresses, recent lookups, | etc. I used different colours for different groups (country, | continent,...) It was so old it was written as a Java applet. | Yeah. I guess if I were to do it again, it'd be WebGL. | | -- | | *: I sold it a long time ago, with the proviso that the data must | always remain free. I actually didn't believe the offer at first | (it came as an email, and looked like a scam) but it went through | escrow.com just fine, and I think we both walked away happy. That | was almost 2 decades ago now though. ___________________________________________________________________ (page generated 2023-09-14 23:00 UTC)
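Editor's sketch of the ping-radius idea reincoder describes in the thread ("get the RTT, draw a radius on the globe ... do another ping and draw another radius"). This is only an illustration, not IPinfo's actual method, which the thread says also relies on in-house models. Assumptions, none of which come from the thread: signals travel at roughly 200 km per millisecond in fiber (about two thirds of the speed of light), the probe coordinates and RTT values are made up, and the function names max_distance_km, haversine_km and feasible are hypothetical.

```python
import math

# Assumption: ~2/3 of the speed of light, a common rule of thumb for fiber.
FIBER_KM_PER_MS = 200.0
EARTH_RADIUS_KM = 6371.0


def max_distance_km(rtt_ms):
    """Upper bound on probe-to-target distance implied by a round-trip time."""
    # Half the RTT is the one-way time; real paths are slower and longer,
    # so the true distance can only be smaller than this bound.
    return (rtt_ms / 2.0) * FIBER_KM_PER_MS


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def feasible(candidate, measurements):
    """True if the candidate lies inside every probe's RTT radius."""
    return all(haversine_km(candidate, probe) <= max_distance_km(rtt)
               for probe, rtt in measurements)


# Made-up measurements: (probe location, RTT in ms).
MEASUREMENTS = [
    ((51.5074, -0.1278), 6.0),    # probe in London
    ((50.1109, 8.6821), 5.0),     # probe in Frankfurt
    ((40.7128, -74.0060), 80.0),  # probe in New York
]
AMSTERDAM = (52.3676, 4.9041)
MADRID = (40.4168, -3.7038)

if __name__ == "__main__":
    for name, point in (("Amsterdam", AMSTERDAM), ("Madrid", MADRID)):
        print(name, feasible(point, MEASUREMENTS))
```

Each extra probe can only shrink the feasible region, which matches the remark that doing it "enough times" yields an estimate. Note that the bound is one-sided: a high RTT says little, since congestion or indirect routing inflates it, while a low RTT firmly rules out distant locations.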