[HN Gopher] You don't want to be on Cloudflare's naughty list ___________________________________________________________________ You don't want to be on Cloudflare's naughty list Author : merlinscholz Score : 349 points Date : 2022-09-20 14:26 UTC (8 hours ago) (HTM) web link (www.ctrl.blog) (TXT) w3m dump (www.ctrl.blog) | yamtaddle wrote: | Harsh blocking/limiting/challenging is way too valuable to sites | that are actually trying to make money online. It's not going | away short of legislation banning it. Losing 1/10,000 legitimate | customers to cut fraud attempts, spam, exploit attempts, and so | on, by 90% or more, is just too good a trade-off. | | I have bad news about the most-likely fix for it, longer term, so | we can lay off the IP-based reputation stuff and the geo- | blocking: it's tying some form of personal ID to your browsing | activity, so _that_ bears the reputation instead of the address. | | Sorry. Said it was bad news. | Waterluvian wrote: | I think this is true. It also reminds me of one possible | purpose of regulation and government, given the majority will | usually be happy to throw any sort of minority under the bus | for the "greater good." | | This also reminds me of the anxiety of Google deciding to just | ban my account for some reason. They can't be bothered to | commit resources to making sure mistakes can be resolved. They | don't care to lose a fleetingly small percentage of customers. | | Not sure I have an answer. Just a thought. | akira2501 wrote: | > Harsh blocking/limiting/challenging is way too valuable to | sites that are actually trying to make money online. | | I'm not understanding the generalized sentiment here. How | would, for example, a retailer benefit from this strategy? How | does it protect their bottom line? 
| | I can see how a particular kind of "facilitated user economy," | such as games, gambling and promotional companies could | benefit, but it doesn't seem that broadly applicable to what | most people would consider a "mainstream" business. | | > so we can lay off the IP-based reputation stuff and the geo- | blocking: it's tying some form of personal ID to your browsing | activity | | And a new market for identity theft is born. | | Also, as someone who serves content and geo-blocks it, that's | not up to me, that's up to the owner of the content or whoever | happens to be licensing it for them. So, even if you sent me a | picture of your government ID, it changes nothing. | les_diabolique wrote: | > a retailer benefit from this strategy? How does it protect | their bottom line? | | A couple of examples I can think of are blocking bots from | scraping their site for pricing and details, and stopping | resellers from buying up all of the stock (see sneakers, | electronics, etc). The last example doesn't directly impact | their bottom line, but it will make customers go elsewhere. | yamtaddle wrote: | > I'm not understanding the generalized sentiment here. How | would, for example, a retailer benefit from this strategy? | How does it protect their bottom line? | | The amount of automated _and apparently-manual_ attempted | credit card fraud (and exploit attempts, for that matter) any | halfway-prominent site with a CC form is subjected to is hard | to appreciate if you've never seen it. It's _a whole lot_. | They aren't even necessarily trying to buy what you have, | but to validate that their stolen cards work. And they're | quite busy. If too much of that gets through--really, any | more than a _very_ tiny amount of it gets through--you're | gonna have an extremely bad time.
| | Various CC service providers like Stripe do provide tools to | try to block those attempts, but defense in depth is usually | a very good idea, including fairly aggressive firewall-level | blocking. | hot_gril wrote: | The other not-so-great approach is to act like a normal user. | This stuff doesn't tend to happen to the average Joe who | browses the WWW. It's when you're doing unusual (albeit | harmless) things. | jabbany wrote: | An alternative that preserves some privacy also doesn't seem | that hard to imagine... though it probably has its own can of | worms*. | | Basically, the core problem is digital identities (accounts, | IPs, phone #s etc.) are cheap to create (even considering | captchas and all) so fraud is easy. The solution could be just | to make it "costly" to create new digital identities. For | example, you could get a "verified but anonymous" identity | issued by locking some assets (could be real world money, or | maybe something intangible like community reputation) as | collateral with a trusted party (or, for the crypto people, the | blockchain). If you misbehave, you lose your reputation on that | identity (and essentially your collateral) and have to start | over. This lets anyone bootstrap a "minimal" level of trust at | the beginning before they can use time to prove themselves | trustworthy. | | Note: This model might remind some of things like staking in | crypto. However the idea is really not anything new... Putting | money on the line is really how most low-trust bootstrapping | happens. | | *: To name a few:(1) this can result in participation being | gated by wealth, which can be unfair. (2) it makes accounts | more valuable to hack so people need better security practices | [re: twitter checkmark]. (3) one would need some authority to | decide how accounts lose their collateral or maybe the | collateral is just burned to create that initial credibility... 
| mhink wrote: | > Basically, the core problem is digital identities | (accounts, IPs, phone #s etc.) are cheap to create (even | considering captchas and all) so fraud is easy. The solution | could be just to make it "costly" to create new digital | identities. For example, you could get a "verified but | anonymous" identity issued by locking some assets (could be | real world money, or maybe something intangible like | community reputation) as collateral with a trusted party (or, | for the crypto people, the blockchain). If you misbehave, you | lose your reputation on that identity (and essentially your | collateral) and have to start over. This lets anyone | bootstrap a "minimal" level of trust at the beginning before | they can use time to prove themselves trustworthy. | | I've always thought that client certs would be an interesting | solution to this problem. Any given certificate can carry | signatures from multiple signing authorities, right? So we | could imagine a world where there are many different | certificate authorities, each of whom have their own criteria | for signing a particular certificate and each of whom offer | different varieties of assurance regarding the signature- | holder's identity. | | From here, the question of "should I allow the user | identified by this client cert to use my service" simply | becomes a question of 1.) checking the validity of the | signatures of the client cert and 2.) deciding if the CA's | criteria for signing certs aligns with my desired userbase. | | For example, a particular CA might insist that their users go | through some real-world process to renew their certification | every few years, but when they sign a cert it means that the | bearer has been strongly vetted as a real person. | | An interesting side effect of this auth model is that a | service provider accepting certs from a particular CA has | someone to complain to if a user bearing their signature acts | improperly on their platform. 
You could imagine a CA which | has a code of conduct expected of the users whose certs they | sign, and would perhaps revoke a user's certification if too | many websites complain. | unwise-exe wrote: | That's not safe for a lot of sites, though. | | I hear that porn tends to be officially frowned on in a | fair number of places. | | Reading non-approved news is dangerous in some places. | | Honestly _debating_ political topics can be super dangerous | if you're identifiable. | | Sometimes even having a login on a site is dangerous, I | think I heard about this after a non-mainstream discussion | site got hacked like a year and a half ago. | georgyo wrote: | Your idea comes from a good place, but identity theft is | already a thing in the real world. Digital identities would | also be very stealable. This may be more harmful in the long | term. Imagine if your Twitter gets hacked and your digital | identity makes it so your Gmail gets blocked. | | Similarly, the internet is already very difficult for | people with limited means. This would make it even harder. | [deleted] | smsm42 wrote: | They are already testing out digital IDs. Now link that to the | social score... and make the browsers and the sites exchange | these data in the background, and make frontend service | providers refuse connections from non-supporting browsers as | "bots"... | tboyd47 wrote: | How does having a personal ID tied to browsing activity help | with spam? Are spammers not real people with IDs? | les_diabolique wrote: | Spammers typically implement bots to carry out tasks. I mean, | technically at some point a spammer is a real person, but | when you're automating tasks and using bots, it's not at the | same scale. | notsapiensatall wrote: | So what happens when your ID gets hacked and reused for | fraudulent activity? | | Would you have to submit a dispute with the internet credit | agencies?
Maybe join a class action suit against the entity | that leaked your ID so that they're forced to give you a | year of free internet identity monitoring? | jamie_ca wrote: | Then you need to deal with levels of rate-limiting that | are fine for individuals but make it infeasible for | spammers. | | Keeping with the Cloudflare topic, if Cloudflare only | permits you 10 requests per second (HTML + JS/images) | that's still usable for web browsing, but someone running | a cloud of hundreds of bots would be effectively shut | down. Similarly with email, an individual probably | doesn't need to send more than one email per 10 seconds | but email spammers wouldn't find any ROI at that rate - | business needs being different might necessitate a | different registry or something in that case. | smsm42 wrote: | The same thing that happens now when somebody steals your | identity and ruins your credit history. You'll have to | live in a bureaucratic hell for the next couple of years. | And yes, as compensation, you'll get $6.99 worth of | services from the guilty party. If you win the class | action suit, that is. | notsapiensatall wrote: | Exactly. Why on earth would we want to replicate such a | terrible system online? | | We should be reforming our current credit agency system, | not empowering it with a new mandate of judging | somebody's social or political creditworthiness. | mcguire wrote: | Nobody said it wouldn't suck. The only question is | whether it sucks less than the alternatives. | adamckay wrote: | Of course, but the theory is it's restricting 1 real person | to 1 account, versus 1 spammer creating 1,000 accounts via | automation. | | And once your spammer has been identified then that's them | banned/removed, unable to sign up again. | tboyd47 wrote: | What's to stop them from using fake IDs? | thayne wrote: | I haven't experienced it as badly as the author. But I find the | Cloudflare page checking that I am using a "secure browser" very | frustrating.
I seem to get it the most for gitlab pages for some | reason. | marcus_holmes wrote: | I use a VPN, for perfectly legitimate reasons (I travel a lot, | and most internet services assume that your IP address also | indicates your nationality, citizenship, language, bank account | country, etc. Being able to change IP source country is vital). | | Some VPN exit addresses have obviously been flagged as "bad" by | Cloudflare and I get challenged with CAPTCHAs from some | countries. It's an interesting experience, but luckily my VPN | provider has enough exits that I can usually switch to one that | has better reputation with Cloudflare. | | Obviously, none of this is helping the internet be a better place | from my point of view. I get that it's part of the ongoing fight | against bots and spam, but it always feels so arbitrary. IP | addresses are interchangeable, folks - they say nothing about the | nature of the request. Or rather, for a large majority they do, | but there's us minority that don't obey those rules and resent | getting caught up in it. | jasonlotito wrote: | Yeah, this just continues to reinforce my opinion Cloudflare. | It's not something I would ever recommend, and there are numerous | other superior options out there. I see Cloudflare failing | frequently enough that if it were something I was responsible | for, I'd be embarrassed at the very least. | tire-fire wrote: | What superior options would you recommend that are privacy | focused and free? | dedward wrote: | I'm curious if you've had experience with their enterprise | package? | | I can understand people's gripes about things on the free/cheap | packages, where Cloudflare makes decisions for you, sometimes | ones you don't like. | | But as an enterprise customer, I've never found it to be | anything short of fantastic - I can tailor it to behave exactly | how I want, and not interfere with my customers. | johnklos wrote: | Your response seems to ignore the very article being | discussed. 
| | Or are you suggesting that if you're having trouble visiting | sites because of Cloudflare, you should become an enterprise | customer? (slightly sarcastic, but not completely) | dedward wrote: | My response is simply trying to understand where you are | coming from. You've mentioned there are numerous superior | options and you would never recommend it. | | I'm wondering (genuinely!) if you are speaking as an | enterprise customer or a free plan, or what.... both for | the sake of meaningful discussion and potentially learning | about even better options for my own work. | | As to the article - I fully believe the responsibility lies | with site owners to pick and choose how they want to serve | their sites. Nobody is forcing them to use Cloudflare on a | free plan, or to ignore any analytics it provides and make | sure it is serving their customers correctly. Cloudflare is | one piece of a delivery solution, and only works as well as | you configure it. If your decision for your app is "I'll | just use the free plan, and let Cloudflare decide | everything for me" then you get what you pay for. | | If Cloudflare is getting in their way, they can go | somewhere else. | DethNinja wrote: | There is a chance you might've been hacked. | | You would be surprised to see how easy it is to hack domestic | routers. | | 1. Find and disinfect the devices, including the router. If you | don't have enough technical knowledge, then buy a new router. | | 2. Use 30 character long random password on the router. | | 3. Disable UPnP. | | 4. Anything with WI-FI and weak password can be hacked within | minutes, so check your other devices as well, especially IOT | ones. | mh- wrote: | My assumption is also that something on his network is | compromised, and getting his IP into reputation issues. | | Tarpitting (serving content slowly from the edge, in order to | slow down bots) is necessarily one of the most expensive tools | in a WAF/CDN's toolbox. 
| | It's _much_ more likely that something on his network is | sending sketchy traffic to CF-fronted /Google sites, and the | slow loading he's experiencing elsewhere is because his | upstream is being saturated by whatever is happening on his | network. | d2wa wrote: | (Author here.) My router isn't a domestic router. It's a | MikroTik running RouterOS, completely unsupported by the ISP. | Outgoing connections and DNS is logged. UPnP is only allowed | for the Xbox, PS4, and off-most-of-the-time gaming PC. Nothing | out of the ordinary in the logs. | alexforster wrote: | > It's a MikroTik running RouterOS | | https://google.com/search?q=mikrotik+botnet | | These things are the absolute scourge of the internet. | aaronmdjones wrote: | > It's a MikroTik running RouterOS | | It's almost certainly compromised. | malfist wrote: | Why would you disable UPnP? You're gonna break most | collaboration tools/video games/etc. | kunwon1 wrote: | Disabling UPnP doesn't break much. I've used enterprise | firewalls at home for years, none of them have UPnP, I've | never noticed a problem arising from that lack. I don't have | a problem with video games or collaboration tools | | UPnP allows devices inside your network to open ports to the | outside world without your knowledge. I think everyone should | avoid it if they can get by without it | d2wa wrote: | It's absolutely required for most multiplayer games. Many | need random ports and some even refuse to work if UPnP is | blocked even if you manually open a port for them. | aaronmdjones wrote: | I've never had UPnP enabled and I don't have any problems | doing online gaming / flight sim / video chatting / etc. | malfist wrote: | What's your solution for the grandmother who just wants to | make a zoom call to her grandson? Have her log into her | router portal and setup a static ip for her laptop and then | port forwarding routes for zoom? | Karrot_Kream wrote: | STUN servers? 
Also, while I (not GP) do think UPnP is | dangerous, I also think it's only something you disable | if you _know_ you can live without it. | thayne wrote: | I don't think Zoom uses UPnP. If it did, that would cause | problems on corporate networks that typically have UPnP | disabled. | [deleted] | zinekeller wrote: | To be frank, that's exactly the problem with NAT-PMP et al. | assuming that there are no router bugs: the ability to forward | ports has been abused to set up bot relays on hacked IoT | devices. This is why I predict that even in the IPv6 era we would | still have to rely on a TURN-equivalent. | malfist wrote: | That's exactly the problem with NAT-PMP? | | So what's your alternative for peer to peer connections? | Static routing that the common end user can't figure out? | Re-centralize connections? | | UPnP is necessary. | zinekeller wrote: | I'm simply pointing out the problem, a real-world and | realistic problem, and you're acting like it's a non- | issue. Point me to a CGNATted network which has enabled port | forwarding. Does it break a lot of things? Oh, | absolutely. Have the carriers still not activated it? Yes. | Automatic port forwarding is only beautiful when you know | how your device would react. It's ugly when you're a | network administrator who doesn't control all devices. | | There is no "perfect" solution here because the real | world is a messy place with devices that you cannot | personally vouch for. | nuc1e0n wrote: | This story shows Cloudflare is now harming legitimate users, is | an effective monopoly and as such should be broken up. | jabroni_salad wrote: | Do you have an ISP-provided email account that you never check? | You might want to check it to see if you have any botnet | notifications. | simple-thoughts wrote: | There's a real lack of education I've seen in developers for | small projects who go directly to Cloudflare for anything and | everything.
They don't understand that they are immediately | losing a large chunk of their user base that is either from the | third world or privacy literate. Devs working on projects that | are targeting those groups need to understand the tradeoffs from | using Cloudflare. | shiomiru wrote: | If you'd like to experience this treatment first-hand, try | surfing the web using the Tor Browser. | | Spoiler alert: many websites simply refuse to load at all (e.g. | any Google service, and lots of websites "protected" by CF). | Captchas are everywhere: in many cases, you can't even complete | simple GETs of blogs without donating free labor to CF. | | And the most infuriating part, you get CF marketing messages | right in your face while your browser is calculating hashcash (I | guess?)... At this point I can recognize every single one of | them: something about bots making up 40% of all internet traffic, | something about their web scraper protection racket, something | about small businesses (???), etc etc... | | To be fair, Tor exit nodes have an awful reputation for sure. | Nevertheless, I have a hard time forgiving how CF makes browsing | the Internet hell for those who actually need Tor. | yjftsjthsd-h wrote: | > And the most infuriating part, you get CF marketing messages | right in your face while your browser is calculating hashcash | (I guess?)... At this point I can recognize every single one of | them: something about bots making up 40% of all internet | traffic, | | Yeah, there's something amazingly aggravating about CF telling | you how much traffic is bots _while showing that they can't | distinguish you from a bot_. | robocat wrote: | Cloudflare are creating a new division for advertising to | bots. They have projected that in the near future, bots will | be 90% of spending, so the bot demographic is the most | important to target, marketingwise. | | The fact that humans are seeing the traffic meant for bots is | an unfortunate side-effect.
| | I personally welcome our future bot overlords (not only | because being unwelcome might be unhealthy for me -- why | would I publicly disagree with an overlord or not want to be | their friend?). | jasonfarnon wrote: | I routinely use Youtube with Tor. I will occasionally get | kicked off with a "suspicious traffic" message, but it isn't my | experience that it "refuses to load at all". | synthetigram wrote: | Cloudflare has mixed up the definitions of "bot" and "abuse". | Tor users may or may not be bots, but as long as they don't | abuse (spamming or DoS), they ought to be treated the same. | sampa wrote: | If an ordinary user would have to deal with google/CF bs everyday | as I do, they'd burn their computer. | | PS Proud user of Firefox + resistFingerprinting=true PPS Ain't | nothing better than CF guard page constantly-reloading on 20% of | sites if you open some url :( No, fella, you first have to open | the root '/' page so that guard page finally can either pass me | through or show the cloudflare captcha. Ugh. Progress, they say. | mikessoft_gmail wrote: | johnklos wrote: | Imagine all the people in countries deemed less desirable by | Cloudflare that go through this all the time. Cloudflare, whether | it's their stated goal or not, is re-stratifying and re- | centralizing the Internet because of their desire to be a | monopoly, and we'll all suffer as a result. | andrewnyr wrote: | there are multiple other large CDNs out there... 
its a lot more | like 5 market leaders tbh | johnklos wrote: | But how many of them: | | 1) refuse to take responsibility for content they host by | claiming they don't host | | 2) discriminate against huge parts of the Internet with no | publicly known rules, nor methods to change that | discrimination | | 3) make the abuse reporting process intentionally difficult | and time-consuming | | 4) want to aggregate all the DNS data they can by making a | deal with Firefox to turn on DNS-over-https by default | without asking or even informing end users | | 5) want to re-centralize the Internet, in part so they can | mix bad actors with good, in ways that make blocking next to | impossible | | How many of them do the discrimination we're all writing | about here? | easrng wrote: | tbh I think one of the very few positives of having so many | sites going through a few CDNs is that you can make it | impossible to block a protocol or site without significant | collateral damage, which can be a good thing, things like | Tor's meek bridge rely on that. | andrewnyr wrote: | 1) refuse to take responsibility for content they host by | claiming they don't host >CDNs don't host content, they | proxy it | | 2) discriminate against huge parts of the Internet with no | publicly known rules, nor methods to change that | discrimination >Not large parts of the internet, scammy and | attacky parts of the internet. If the rules were public | they wouldn't be effective. 
| | 3) make the abuse reporting process intentionally difficult | and time-consuming >simply untrue, every abuse report i | have filed has had an answer back within 24hrs | | 4) want to aggregate all the DNS data they can by making a | deal with Firefox to turn on DNS-over-https by default | without asking or even informing end users >this is a good | thing as they are audited as having not keeping logs of dns | queries | | 5) want to re-centralize the Internet, in part so they can | mix bad actors with good, in ways that make blocking next | to impossible >again every cdn centralizes the internet, | and many sites need this protection | PaulHoule wrote: | The rise of Cloudflare is the first real threat I've seen to | ordinary people running webcrawlers. | mschuster91 wrote: | Tragedy of the commons, unfortunately. There were a bunch of | cases where web crawlers and scrapers built competitive | services on the back of the services they scraped, some of | these ending up in courts [1]. | | [1] https://www.derstandard.at/story/1389860104020/eu- | gerichtsho... | lorey wrote: | Which is in turn a threat to the open web in general. Could not | agree more. | adamsb6 wrote: | Does the author have a fixed IP? | | If not, figure out how to get a new one and see if the blocking | recurs. If it does, the bad activity is probably coming from | inside the house -- or CloudFlare has a way to identify you | across an IP change. | d2wa wrote: | The author, me, does have a dynamic IP, but it only changes | once every two years or so. | digitailor wrote: | Not saying that this is the case here, but this may be possible | due to having a bad tab open. Especially over cellular. Haven't | looked into it with any depth, but I've had correlations on a | much shorter timeframe. Suddenly, CloudFlare and/or Google start | questioning my humanity, so I close all tabs. Then okay. Sloppy | hypothesis with no evidence: JS gone haywire | rubyist5eva wrote: | Cloudflare is on _my_ naughty list. 
I actively advocate against | people using them. | robjan wrote: | I have two dedicated home internet IPs (one iCable fibre and a | China Mobile 5G fallback/quarantine WiFi) and get these "checking | if your internet connection is secure" interstitials all the time | now. Also see them on my HKBN work connection. | | I'm from Hong Kong and suspect the whole territory is on the | naughty list. | JCWasmx86 wrote: | Couldn't you use e.g. the DSGVO/GDPR in the EU to get all the | information about your IP, everything cloudflare has stored about | it until you find the root cause? | unity1001 wrote: | Its amazing how Cloudflare became another tech monopoly that can | decide the lives of ordinary people in a totally unregulated, | private fashion. | therealmarv wrote: | If you surf on desktop sites from Philippines on a mobile phone | plan (which is often the best Internet connection in that | country) you also get Cloudflare's captchas everywhere. | | I told it before and tell it now again: Cloudflare is dividing | the World between first and second/third World countries with | their captchas. I call it discrimination of second/third World | countries! If you are from US and Europe you will never notice it | but if you travel a little bit more you see these blocking | captchas everywhere. | chrismorgan wrote: | I've had a similar experience in India with wired internet from | a local ISP: CGNAT is used so there are who knows how many | customers on the same IPv4 address, | https://iknowwhatyoudownload.com/ shows at least forty hours of | movies being downloaded every day, the IP address is on half | the blacklists out there because _someone_ is part of an email- | sending botnet, and yeah, Cloudflare hates you. | Jamie9912 wrote: | Maybe your mobile ISPs dont do enough to stop malicious/spam | traffic. That's not Cloudflare's fault | therealmarv wrote: | It only affects Cloudflare hosted sites though. | thewebcount wrote: | I get it browsing from a major ISP in the US. 
I have the gall | to browse in private mode and to block trackers and ads because | of all the malware they contain. (And I don't use a browser | that requires me to login just to browse the web - gasp!) And | apparently, that means I'm worthy of this sort of punishment as | well. | aendruk wrote: | The other side of this story is that PLDT stands out from other | residential networks as a persistent source of web form spam. | I'd love to learn what's going on differently there. | Dma54rhs wrote: | I get these a lot and I'm from EU. But it's "seasonal". | ReptileMan wrote: | I am from Europe and I notice if I use some non residential ip. | The captchas are extremely annoying especially when trying to | access a site I have already been logged into with 2fa. Who is | protected in this case. | cft wrote: | I actually think that Cloudflare is setting up the foundation of | Chinese style (but privately outsourced in the US case) | censorship machinery in the US. Between their AI erroneously | flexing its power, Kiwifarms scandal and similar, they are | emerging as a rival to Google in its censorship effort. One of | the most dangerous companies in the internet. | smsm42 wrote: | So this gets me thinking. We know Cloudflare will boot a site if | they really don't like them. Now, what happens if Cloudflare | doesn't like _you_? I mean, really really doesn 't like. Maybe, | you said something wrong online or participated in a wrong group | activity, or something like that. Is it the case that they have | the power to essentially deny you (provided you have a static IP | and don't use VPN, say) access to a major part of the Internet? | And you can do absolutely nothing about it? | | I know they haven't done anything like that yet. But the | technical capability is there, and we all know how short is the | distance between technical capability and doing it, when the | appropriate pressure is applied. 
So I wonder, how long before | activists start demanding that CF boot people from the | internet, and how long before CF caves in to that... | [deleted] | neurostimulant wrote: | If your ISP is using CGNAT, sooner or later you're going to | experience this problem. When this happened, I had to use a VPN (I | use Mullvad) to reduce the number of Cloudflare challenges I get. | Pretty funny because usually I get more challenges when using a | VPN instead of the other way around. The Privacy Pass extension | also seems to help a bit. | jgrahamc wrote: | _Well into the second day of Cloudflare's blockade of my home | internet connection, Google Search also began blocking requests. | It required me to resolve a CAPTCHA challenge for every other | search. This luckily only lasted a day._ | | _Cloudflare shares IP reputation data with partners like Google, | coordinated through a program called the Bandwidth Alliance. So, | my original offense might not even have been against Cloudflare. | It might have received the reputation data from a partner, and it | just propagated through the Bandwidth Alliance network._ | | That's not what Bandwidth Alliance is at all. It's about reducing | or eliminating egress fees between a cloud provider and | Cloudflare. Not sure where the idea that it's about sharing IP | reputation data comes from. | | https://www.cloudflare.com/bandwidth-alliance/ | | So, if Google Search started showing a CAPTCHA that's not | Cloudflare. | xani_ wrote: | > Not sure where the idea that it's about sharing IP reputation | data comes from. | | Probably from the scam called mail blacklists | [deleted] | plumeria wrote: | It is interesting that the Bandwidth Alliance partners list | shows pretty much every big cloud provider except AWS and | Akamai [0] | | [0] https://www.cloudflare.com/bandwidth-alliance/ | throwawayays wrote: | The tone of this reply is a bit shit from a PR perspective.
| | How about _also_ pointing to a knowledge base article for how | an end user could go about working out what network activity | from their IP might be flagging Cloudflare's systems? | phantom_of_cato wrote: | But that's beside the main point. You guys are essentially the | "single point of failure" for half the internet. [1] Being | competent and smart doesn't really help too much, as | demonstrated by how you guys had to give in to the pressure to | censor recently. | | [1]: https://easydns.com/blog/2020/07/20/turns-out-half-the- | inter... | TakeBlaster16 wrote: | Can you acknowledge the main point of the article? What should | someone do if they find themselves misclassified by | Cloudflare's systems? | mh- wrote: | _(not the parent commenter)_ | | That person should start with the assumption they _haven 't | been misclassified_ and eliminate the possibility that a | device on their network is compromised. | JohnFen wrote: | A task that would be made much easier and less likely to | miss something if the affected person had some indication | as to what the problem was. | buildbot wrote: | Devil's advocate - would it not then be pretty easy to | engineer malicious bots to avoid detection? | d2wa wrote: | (Author here.) That's missing from the article. But I have | logs of the network. There's nothing out of the ordinary. | "I don't know what I did wrong," as I started the article, | means "I've checked logs and such and there's no indication | of anything wrong on my end." | tomxor wrote: | FYI, this guy is far from alone, your "protection" has given me | a lot of grief over the past few years, particularly on highly | NATed mobile networks. | | I've been gradually removing cloudflare based CDNs from | services I develop and control because I don't want my users | being arbitrarily discriminated against. | | There was a good article posted on HN recently titled "The | ideal level of fraud is non-zero" which I think is highly | relevant here... 
In essence, any mechanism employed to prevent | illegitimate use comes at a cost to legitimate | users; if that cost is too high, it defeats the purpose. I.e., | what's the point of a website that is completely immune to a | botnet and also cannot be accessed by anyone else? Unplugging | the ethernet cable also effectively protects against botnets. | More subtly, the cost of outright rejecting some legitimate | users is usually not worth the savings of rejecting 100% of | illegitimate ones. I think Cloudflare's service has it the | wrong way around: it currently accepts blocking legitimate users | far too easily, and that is not an acceptable cost; whereas you | should be letting a higher level of bots through to avoid | pissing off legitimate users - if it's not obviously a DDoS, | it's probably worth the bandwidth cost. | | Consider the bigger picture: if you save a sliver of a penny | by blocking a bot, but also end up blocking or seriously | inconveniencing 10 real users... is it worth it? | dmix wrote: | Cloudflare just isn't worth the tradeoffs: the risks | associated with their centralization, how they made Tor | basically unusable on non-onion sites, the lack of | transparency when content-moderating the internet, etc. | | The space is in need of solid competitors to break the | stranglehold they have on the internet. Whether it's the | right combination of services, documentation, etc. | thaumaturgy wrote: | Tor made Tor unusable on non-onion sites. I feed a | netfilter table with the list of exit node IPs that Tor | publishes (https://check.torproject.org/torbulkexitlist) as | a standard part of server deployment, and it's the single | most effective way to reduce form and login abuse on hosted | sites. I like the idea of Tor, but there's no denying that | it's a huge source of nuisances. | shaky-carrousel wrote: | I live in a country with censored internet. What you are | doing is harmful. I can only hope whatever you provide is | irrelevant enough. 
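For readers following the Tor subthread, here is a minimal sketch of the kind of exit-list check thaumaturgy describes. It assumes the published list is plain text with one IP address per line. The function names and the sample addresses (documentation-only ranges, not real exit nodes) are hypothetical, and the sketch only does the membership test in Python. It does not load anything into netfilter.

```python
import ipaddress


def parse_exit_list(text):
    """Parse one-address-per-line text into a set of IP address objects.

    Blank lines, comment lines, and anything that is not a valid IP
    address are skipped rather than treated as fatal.
    """
    exits = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        try:
            exits.add(ipaddress.ip_address(line))
        except ValueError:
            continue
    return exits


def is_tor_exit(client_ip, exits):
    """Return True if client_ip appears in the parsed exit list."""
    return ipaddress.ip_address(client_ip) in exits


# Hypothetical sample data, using RFC 5737 documentation addresses.
SAMPLE_LIST = "192.0.2.10\n198.51.100.7\n\nnot-an-ip\n203.0.113.99\n"

exits = parse_exit_list(SAMPLE_LIST)
print(is_tor_exit("198.51.100.7", exits))
print(is_tor_exit("192.0.2.11", exits))
```

A set lookup like this is all-or-nothing per address, which is exactly the trade-off being argued about above: it cannot tell an abusive Tor user from someone routing around censorship.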
| thaumaturgy wrote: | I'm sorry. I have a colleague based out of Venezuela. | We've had to work together to get tunnels and VPNs | configured so that he can get uncensored and secure | internet access. | | But Tor is an enormous source of abusive traffic and if I | don't filter it, then that's harmful to site owners. I'm | being forced to choose between the needs of people that I | know, work with, and depend on financially, and the needs | of people in countries with issues that are far outside | my ability to resolve. It's not a hard decision. | justsomehnguy wrote: | > It's not a hard decision. | | Depends on what you mean by 'hard'. | | As an IaaS provider I endured all the hurdles about that | and ten years later - I don't care, at least not until my | outbound bill is bigger than usual. | | Like, some of the clients are on CentOS 6, on | public-facing machines. | parroteal wrote: | I'm a noob, can you give me a pointer? | | What kind of abusive traffic is coming through Tor and | why do they do it? | remus wrote: | Say you're running an account takeover script that spams | login forms with a list of known username and password | combos. If a website owner sees thousands of login | attempts coming from a single IP address they're likely | to block you to prevent abuse on their website. This is | annoying for you as you then need to rotate your IP | address. | | Using Tor hides your IP address from the website and | makes switching exit nodes very straightforward, so you | can run your account takeover script in peace. | thaumaturgy wrote: | Mainly forms -- login forms, comment forms, signup forms. | Bots use Tor pretty heavily because it's anonymous and | hard to block them without blocking the entire network. | Login form abuse is mildly irritating but not a huge deal | if you have other measures in place. Comment spam is | annoying but there are some options that deal with it | pretty well. | | But the signup spam was a headache. 
I didn't want to just | blackhole Tor traffic, and tried to reduce the abuse with | other tools, including some custom stuff. The final straw | was a customer's small business site that had a MailChimp | or Constant Contact signup form. Those vendors want you | to embed their code by default to render the form, so you | have less control over the form itself. There were | workarounds, but they all sucked. | | Tor bots would sign up email addresses through this | newsletter form, and then I'd have to go through and | manually scrub them before newsletters went out, or the | service would penalize my client for too many | bounces/unsubscribes/complaints. Very nearly 100% of the | abuse on that particular form came from Tor IPs. | | I do not want to spend my limited time on this Earth | manually sorting out bots from humans because of one | particular network. Blackholing Tor made that problem | disappear immediately. | | VPNs are dime-a-dozen now, cheap VPSs are available from | lots of vendors, there's Wireguard, there's ssh, a clever | person could even set up Apache or nginx as a forward | proxy with ssl from LetsEncrypt. Tor is well over 90% | abusive traffic (https://blog.cloudflare.com/the-trouble- | with-tor/). This is a Tor problem, not a me problem. | There are better alternatives available. | judge2020 wrote: | https://blog.cloudflare.com/the-trouble-with-tor/ | | > . Based on data across the CloudFlare network, 94% of | requests that we see across the Tor network are per se | malicious. That doesn't mean they are visiting | controversial content, but instead that they are | automated requests designed to harm our customers. A | large percentage of the comment spam, vulnerability | scanning, ad click fraud, content scraping, and login | scanning comes via the Tor network. 
To give you some | sense, based on data from Project Honey Pot, 18% of | global email spam, or approximately 6.5 trillion unwanted | messages per year, begin with an automated bot harvesting | email addresses via the Tor network. | Zak wrote: | There are probably more sophisticated options that would | solve your problems than simply blocking it. | plumeria wrote: | Is using CAPTCHAs one of those? | judge2020 wrote: | Such as? | cowtools wrote: | The answer depends on the type of service you host. I | don't know what you need to do, but I do know that | filtering IP space is merely security-by-obscurity, it is | a cheap and broken solution to the hard problems of sybil | resistance. If you need IP filtering to operate on a day- | to-day basis, then the security of your service is | fundamentally broken. | | Tor users do not have any special properties over clear- | net users besides low accountability for their IP space. | There are other ways to acquire this type of setup that | don't involve broadcasting a public list of known exit | nodes as an act of good faith. Any sophisticated attacker | will be able to easily get ahold of the IP space and | bandwidth they need to do their work, whether it's | through a botnet or simply because they operate out of | some less-accountable country like China or Russia. | | IP filtering: now you have two problems! | thaumaturgy wrote: | This is why I'm strongly against spam filtering for | email. Spam filters are fundamentally security-through- | obscurity. I mean, they don't protect your email from | targeted bombing attacks or phishing. If you need spam | filters to operate your email on a day-to-day basis, then | the security of your email is fundamentally broken. | | /s, obviously, I hope. | | Blocking Tor isn't a security measure, it's a nuisance | reduction measure. | plumeria wrote: | How often is the list of exit nodes updated? | thaumaturgy wrote: | Daily, I believe. I don't have the file git-controlled. 
| That would be a good idea, though. | andrewnyr wrote: | there are many solid competitors: Amazon, Fastly, Akamai, | Imperva to name a few | wahnfrieden wrote: | Bunny | thaumaturgy wrote: | Just 10 minutes ago, I got the following email from a | housemate (I'm not home at the moment): | | > _The past few weeks I 've been getting tons of redirects to | verify my humanity before being allowed to view a webpage. | Usually I just have to click the box that says human, not | find all the ladders in a photo. SoFi is doing it every | single time I log in. Petco, too, along with others who are | more sporadic. This is happening with and without uBlock on. | Same browser I've always used. ..._ | | SoFi and Petco both use Cloudflare. I do exactly zero web | crawling / scraping / abusive anything from my home | connection. | | I'm noticing a recent increase in volume of complaints about | Cloudflare's human verification filter. I'm starting to | wonder if they touched a dial. | | I had already started pulling some infra back from Cloudflare | after their last appearance in the tech news cycle. Now I've | got an additional reason to continue doing that. | patrec wrote: | > I had already started pulling some infra back from | Cloudflare after their last appearance in the tech news | cycle. | | What triggered your reaction? That they terminated a | customer with zero notice? | tarakat wrote: | You're looking at it all wrong. From Cloudflare's point of | view, this kind of blocking is a _feature_. Anyone doing | legitimate web crawling, or offering alternative web services | such as Starlink, now needs Cloudflare 's permission. | | Essentially, for a broad class of web-based businesses, they | have made themselves gatekeepers. I'm sure they'll find a | profitable use for this position. Charging outright would | look bad, but investing in businesses that just happen to not | run into Cloudflare-based trouble, but whose competitors | do... 
| tomxor wrote: | I'm familiar with that perspective, and biased towards | it... Cloudflare is certainly in such a position, but they | are a relatively young company (for their size and reach) | and I've seen good things come from them. | | I'd guess the intent is unlikely to be anti-competitive or | monopolistic, just over-aggressive. However regardless of | intent their position does cause an absence of market | forces to put pressure on fixing such issues - Similar to | how it's become acceptable to have downtime when it's on | AWS, because "everyone is affected". | O__________O wrote: | They do have a threat score | | https://developers.cloudflare.com/firewall/recipes/block-ip-... | | I was surprised to learn Cloudflare was born out of Project | Honeypot, so I am guessing Cloudflare does share data with | them: | | https://www.projecthoneypot.org/cloudflare_beta.html | [deleted] | elcomet wrote: | FYI you're responding to the cloudflare CTO | trasz wrote: | It's naive to assume Cloudflare CTO would not be lying if | beneficial to him or Cloudflare. | elcomet wrote: | I don't assume anything. The previous comment was just | trying to teach something about cloudflare to its CTO | nemothekid wrote: | I wonder if HN posters have ever held a job before. Can | you explain why it's beneficial for Cloudflare to block | legitimate users? Why is the simplest explanation | "Cloudflare just hates this one user in particular?" | lmm wrote: | Well, apparently they scared this user into installing | their browser extension, so it sounds like this incident | was a win for them. | Veen wrote: | It's even more naive to assume Cloudflare's CTO would | tell lies that can be trivially shown to be untrue. | trasz wrote: | How would you show they are untrue? Ask? :-D | pessimizer wrote: | Don't use an assumption of someone's superiority solely | based on their job title as a justification for the | silencing of the disagreement of others. 
| d2wa wrote: | > That's not what Bandwidth Alliance is at all. It's about | reducing or eliminating egress fees between a cloud provider | and Cloudflare. Not sure where the idea that it's about sharing | IP reputation data comes from. | | It comes from the Cloudflare blog. | https://blog.cloudflare.com/cleaning-up-bad-bots/ | | There's a support page about it too. | https://developers.cloudflare.com/bots/get-started/free/ | jgrahamc wrote: | I need to look into that. Thanks for pointing it out. I had | totally forgotten about that post. | | Edit: team tells me this idea never got off the ground. Did | talk with some potential partners (which did NOT include | Google) but didn't happen. So if Google was throwing CAPTCHAs | it wasn't because of our IP reputation. | d2wa wrote: | Dear John. What am I -- as a normal human being/end-user -- | supposed to do in this situation? People can't do anything | without any information about why they're blocked. Who do | you contact? Where do you go? What to do? The challenge | page doesn't help the end user understand why this is | happening to them. It's okay if you only see it for two | seconds. But the page stays on screen for over a minute. | When this happens for every website -- what do you do? | You'd be furious if this had happened to you. I'm just trying | to read my online comics and look up some stuff about some | interests and hobbies. It reduced my quality of life/sanity | for a week. The last two days, I started worrying that this | was going to be the new normal. I even looked into swapping | ISPs to get a new IP address. | | PS: I love all the innovation and engineering stuff you | guys regularly share on the Cloudflare blog. It's [almost] | always an interesting read. Even though I'm no fan of the | massive centralization your company has caused. | JohnFen wrote: | > People can't do anything without any information about | why they're blocked. Who do you contact? Where do you go? | What to do? 
| | This is the most serious problem with all of the major | companies these days. Cloudflare, Google, Apple, etc. | When you get on their "bad side", you're just screwed. | You'll never even know what got them mad at you, and | there's nothing you can do to recover. | | The only reasonable way to deal with this is to avoid | them all to the greatest extent possible. You have no | control over whether or not you deal with Cloudflare, | unfortunately, which makes them the worst of the lot. | adammartinetti wrote: | > It's okay if you only see it for two seconds. But the | page stays on screen for over a minute. | | That doesn't sound right. You shouldn't see a loading | page for over a minute. If you're open to providing more | details privately I'd love to help troubleshoot. You can | drop me an email at amartinetti @ cloudflare. | jgrahamc wrote: | Once upon a time Matthew made us set the IP reputation of | every Cloudflare office to bad so that we experienced the | worst case scenario. Helped a lot. | | I don't understand why you saw one minute block screens. | That's not right. Should be seconds. | | I'm talking with the team about your other points. | tinus_hn wrote: | The main problem of course, and it isn't limited to | Cloudflare and I won't pretend to have the solution, is | that if you are caught in this kind of web, you have no | recourse but go public and hope the spotlight lands on | you. For every problem we see in an upvoted post there's | tons that nobody sees. | northwest65 wrote: | What about answering his actual question? | easrng wrote: | I haven't been getting challenges that last that long, | but I have noticed that the redesigned "security check" | challenge pages with the spinner do seem much slower than | the old design with the loader that was made of 3 orange | dots. | d2wa wrote: | I edited and added a second link to a support page that | mentions it too. | jgrahamc wrote: | Thanks. I'm talking with the team. | | Edit: see comment above. 
| cvwright wrote: | You block this guy from the internet for a week --- for no | apparent reason --- and then you come in here with a nitpick | about how another related system works? | | Really? | judge2020 wrote: | The point is that Cloudflare does not beam IP reputation data | to Google. If Google and CF are blocking this IP separately, | what's the chance there's some malicious device or hacked IoT | device on the network, participating in DDOS attacks or | unauthorized vulnerability scanning of random websites? | zinekeller wrote: | Yeah, if for example Spamhaus (which both Cloudflare and | Google consult) has detected that a subnet is bad then that | could be the cause. | | Still, it doesn't excuse Cloudflare that there's no redress | if you are caught on a block or even a clue on what you can | do to reduce it (especially that Spamhaus do have redress | procedures). | cvwright wrote: | Fair point | pessimizer wrote: | According to another comment, it's a wrong point: | https://blog.cloudflare.com/cleaning-up-bad-bots/ | | > Once enabled, when we detect a bad bot, we will do three | things: (1) we're going to disincentivize the bot maker | economically by tarpitting them, including requiring them | to solve a computationally intensive challenge that will | require more of their bot's CPU; (2) for Bandwidth Alliance | partners, we're going to hand the IP of the bot to the | partner and get the bot kicked offline; and (3) we're going | to plant trees to make up for the bot's carbon cost. | judge2020 wrote: | I'm pretty sure this was for a situation like | Digitalocean themselves hosting a bot, but such IP | sharing very well might be currently (ab)used by | partners, if it's happening here. | jgrahamc wrote: | Yeah. I'm looking into that. | stefan_ wrote: | A wrong nitpick, even! Way to look like the asshole. 
| noasaservice wrote: | Given their business model is "Protect DDoS'ers (booters) so | they can DDoS sites so Cloudflare can sell DDoS-prevention | services", I wouldn't trust them one whit in doing the right | thing. | | And frankly, if you want to dig deeper, just look at who they | have no problems having their free clientele as. | tshtf wrote: | Not sure why this is getting downvoted, it's completely | factual. | cma wrote: | Its like finding the worst videos on youtube and saying | that's their business model. | acdha wrote: | It makes a very broad claim which makes it sound like an | extortion racket but doesn't have anything to back it up. | I would bet that if it included some evidence it would | fare much better. For example, they have a ton of large | organizations which are customers. The very first | question the average reader is going to have is whether | it's really the case that these sites are predominantly | attacked by booter services which use Cloudflare for | hosting? That seems unlikely and as general rule here the | broader the claim the more people are going to expect you | to show that you did your homework first. | [deleted] | gusgus01 wrote: | The claim was discussed in this post: | https://news.ycombinator.com/item?id=32709329 | | Basically DDOS booters use Cloudflare to protect their | websites from competitors, since Cloudflare is one of the | best. The same people Cloudflare is protecting (and | claims to do so on an ethical neutrality basis) is | furthering the need for Cloudflare to exist. | acdha wrote: | Note that I'm not saying whether or not this is true, | only that a comment which links to something like that | will generally fare better than one which begs the | question. | kevingadd wrote: | I'm used to getting assaulted by Cloudflare's browser check | interstitials along with random Cloudflare and Google CAPTCHAs | because (presumably) I run Firefox and an ad-blocker instead of | vanilla Google Chrome. 
It's already tremendously inconvenient to | wait multiple seconds on many page loads and click 20 bicycles, I | can only imagine how infuriating it would be if every page load | started taking 60 seconds because your IP ended up on some random | algorithmic blacklist.... | 20after4 wrote: | I use firefox and an ad blocker and I don't see these CAPTCHAs | ( except for a few rare instances that I can recall). Something | else must be going on to get you flagged. | leonfs wrote: | If you haven't done anything, someone else might have. Check your | router logs for strange devices and activity in your network, | also check your machine/s for malware. | d2wa wrote: | (Author here.) Plenty of logging of outgoing connections and | DNS. Nothing out of the ordinary. | Jamie9912 wrote: | Is your IP address listed on https://www.abuseipdb.com/ or | any other spam blocklists? | NelsonMinar wrote: | Cloudflare is a regular problem for Starlink users. We're on | CGNAT so users share IPv4 addresses. I see CAPTCHAs when using | Starlink ten times as often as on my other ISP. I don't think it | actually breaks things the way this article describes, it seems | like a gentler behavior, but it's annoying. | | A few months ago I got on Akamai's naughty list (with my other | ISP) for some very light automated website downloading. That was | a straight block with HTTP errors and I had to use a proxy to | access the Web. It cleared up after a few days. | | The lack of any user feedback or support for this situation is | really annoying. Reminds you how much power the CDNs have. It'd | be really bad if loading websites got as difficult as sending | email through all the layers of spam filtering. | Syonyk wrote: | > _Cloudflare is a regular problem for Starlink users. We 're | on CGNAT so users share IPv4 addresses. I see CAPTCHAs when | using Starlink ten times as often as on my other ISP. 
I don't | think it actually breaks things the way this article describes, | it seems like a gentler behavior, but it's annoying._ | | I've been noticing this too, and it's why Starlink remains my | secondary ISP/bulk transfer connection. If I had to drop one | connection, I'd drop Starlink for this reason alone. | | There are some sites that I simply can't browse, and it's not | Cloudflare errors, either. Lowes, in particular, simply returns | error pages for anything but the main landing page on a regular | enough basis. Of course, my observed public IP changes so it's | not consistent, but it's genuinely annoying. | somedude895 wrote: | > If I had to drop one connection, I'd drop Starlink for this | reason alone. | | Why are you using Starlink at all if you have other options? | Syonyk wrote: | Because my other connection is a 25/3 WISP link that mostly | doesn't. I generally see about 5/1 in the evenings, if | that. | | I've had several area WISP connections, as there's no wired | infrastructure to my area, and they vary in quality. I work | full time remote, so I need two connections as a general | habit - I can work with one, but when that one is down for | a week straight, I have problems. I like being able to fail | over. | | I typically keep one connection for "interactive" traffic, | and one for "bulk transfer/failover" - things like my local | Ubuntu repo mirror, offsite backup traffic, etc. And I can | fail to it if needed, which I do often enough. | | On a good day, Starlink is far better than my WISP | connection, and I have some machines routed out it | persistently. On a bad day, I can't hit much from it, | because that particular public IP has been blocked from | large parts of the internet. It's very hit and miss, and | overall bandwidth has definitely dropped from the early | days, though reliability of getting packets where they need | to go is drastically improved. 
| cma wrote: | > I've been noticing this too, and it's why Starlink remains | my secondary ISP/bulk transfer connection. If I had to drop | one connection, I'd drop Starlink for this reason alone. | | Could cloudflare legally charge them a bribe to captcha their | users less? It isnt good to have a company in this position | of power if so. | diebeforei485 wrote: | Cloudflare said they're working on this- | https://blog.cloudflare.com/eliminating-captchas-on-iphones-... | ThatPlayer wrote: | I feel like Starlink could at least partially mitigate this by | supporting IPv6. T-mobile US supports IPv6, and I hardly notice | this as an issue on my phone. Or the time my work ran the | business over a 4G mobile while waiting for ISP install. | causi wrote: | What archival tool were you using? I've been looking for a | replacement for HTTRACK forever. | NelsonMinar wrote: | A combination of shotscraper and metascraper; really more web | previews than archives. And in a single thread, to different | hostnames, maybe one every 10 seconds? Honestly surprised | Akamai or anything even noticed. I fake my user agent now, | lesson learned. | justoreply wrote: | But any automated tool won't work. I have a similar problem | with my self hosted feed reader, my vps hosting ip doesn't have | 100% reputation with Cloudflare and I can't download some feeds | | Edit: spelling | btdmaster wrote: | "The data subject shall have the right not to be subject to a | decision based solely on automated processing, including | profiling, which produces legal effects concerning him or her or | similarly significantly affects him or her." | | However, this does not apply if: | | "is necessary for entering into, or performance of, a contract | between the data subject and a data controller;" | | Cloudflare would therefore perhaps claim that this is | "necessary". | grishka wrote: | Here's a handy list of correct uses for IP addresses: | | 1. 
Packet routing | | In other words, I wish services like Cloudflare were made | illegal. | scarface74 wrote: | Notice that he suspects that some of the problems with podcast | RSS feeds and assets that can't be captcha confirmed may be | caused by websites that are on the free tier and that don't have | the ability to specify that some subdomains shouldn't be blocked | by captchas. | | I have absolutely no sympathy for website owners who are | depending on a free service. | ritcgab wrote: | What is Cloudflare? The answer is simple - the biggest MITM on | your Internet traffic. | joshfraser wrote: | If this happened to me, the first thing I would do is switch to | using a VPN. In my experience, Google is far more likely to throw | up CAPTCHA challenges to VPN users. I wonder if this is what | happened to the OP. | superkuh wrote: | Daniel Aleksandersen of ctrl.blog has absolutely no leg to stand | on here. He is a proponent of this kind of algorithmic blocking | for weird browsers and even implemented it on his own site and | argued _for_ it. https://www.ctrl.blog/entry/detect-non-browser- | form-submissi... | | It's only after it happened to _him_ that now he's suddenly | against it. Until he removes the same type of blocks from his own | website I have absolutely no sympathy for him. | bergwerf wrote: | From the link you mentioned: | | > Bots often mimic the User-Agent of a common browser, but the | version numbers used in the bots rarely change. Over time they | drift farther and farther behind until a point (maybe two-year- | old versions) where you can safely block them without | inconveniencing legitimate users. | | This supports the idea that browsers are subject to constant | change and everyone should be forced to come along (rather than | respecting and supporting standards). I have a Chromebook that | stopped receiving updates some years ago (thank you for your | very safe and sustainable product Google!); his heuristic would | literally block me. 
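To make the quoted heuristic concrete, here is a rough sketch of a "stale User-Agent" check. It is not the code from the linked article. The cutoff table, the regular expression, and the function name are hypothetical, and real cutoffs would need regular updating. Unknown or uncommon browsers are deliberately let through, in line with the article's own advice as quoted elsewhere in this thread.

```python
import re

# Hypothetical cutoffs: the oldest major version still treated as current.
MIN_MAJOR = {"Chrome": 95, "Firefox": 93}

# Matches the product token of the two browsers covered by the table.
_UA_RE = re.compile(r"\b(Chrome|Firefox)/(\d+)")


def looks_stale(user_agent, min_major=MIN_MAJOR):
    """Return True if the UA claims a known browser older than the cutoff."""
    match = _UA_RE.search(user_agent)
    if match is None:
        # Unknown or uncommon clients are not blocked by this heuristic.
        return False
    name, major = match.group(1), int(match.group(2))
    return major < min_major[name]


# A Chromebook that stopped receiving updates keeps sending an old version.
old_chromebook = (
    "Mozilla/5.0 (X11; CrOS x86_64 13020.87.0) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/83.0.4103.119 Safari/537.36"
)
recent_firefox = (
    "Mozilla/5.0 (X11; Linux x86_64; rv:104.0) Gecko/20100101 Firefox/104.0"
)

print(looks_stale(old_chromebook))
print(looks_stale(recent_firefox))
print(looks_stale("curl/7.85.0"))
```

The first sample illustrates bergwerf's objection: a legitimate device that no longer gets updates is indistinguishable from a bot with a frozen version string.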
| ReptileMan wrote: | Doesn't your Chrome app update? Never used a Chromebook. Just | asking. | phreack wrote: | Even if that were the case (which we can debate), him being | wrong before does not prevent him from being right now. Being | de facto banned from the common internet due to centralization | is absolutely scary. | ranger_danger wrote: | superkuh wrote: | I completely agree. I am against Cloudflare and the | centralization it implies 100%. I never use it for sites I | develop. | | I just have no sympathy for Daniel since up until just now he | was trying to get everyone to do this. | scarface74 wrote: | CloudFlare allows website hosts to have much finer-grained | control that would have solved many of these problems - _if | they pay for it_. I see no problem with this. | dmix wrote: | The hosts aren't blocking him though, it's Cloudflare. | | > Just about every website I visited from my home | internet connection would result in a challenge page. | scarface74 wrote: | Cloudflare is blocking him because the hosts didn't | configure Cloudflare to not use captcha for subdomains | that host non-browser traffic like podcast RSS feeds. | That was his theory. | | That capability is only available for paid CloudFlare | plans. | bashinator wrote: | It's almost as though sufficiently large communications | providers should be regulated as utilities. | daenney wrote: | Burn the witch! | | Let's read through that page for a second though: | | Drop support for obsolete HTTP versions | | Doesn't seem like that's going to cause much issue for any | legitimate client from the past 10-20 years. He only recommends | blocking HTTP 0.9/1.0, which is fair enough. | | Append a #hash to the form's action URL | | Hah. Clever man. I don't see how this is going to stop any | legitimate user from loading your website or submitting the | form, but I can see how it might frustrate bots. | | Include a hidden prefilled form field | | This is just standard practice to mitigate CSRF. 
| | Verify the Host and Origin request headers | | Yes. You should be doing that. | | Set a test cookie and verify it gets included in the submission | | Another CSRF trick. | | Swap the name attributes in the name and email fields | | This one's a little user hostile to folks who use assistive | devices like screen readers. But still won't prevent you from | accessing the site in the first place. | | Verify the POST/Redirect/GET (PRG) chain | | As noted by the author, might cause some issues but again, | won't stop anyone from loading your website. | | Block ancient versions of common browsers | | Alright please just don't do this. UA blocking is gross and | might prevent access through specialist software. But he also | calls this out himself. | | I strongly discourage you from blocking or discriminating against unknown or uncommon | browser User-Agent request headers | | All in all, with the exception of UA blocking I don't see how | any of these mitigations would result in users not being able | to access said website, or having their loading times | drastically increased. | d2wa wrote: | >> Verify the Host and Origin request headers | > Yes. You | should be doing that. | | (Author here.) If I remember correctly, his browser of choice | predates the Origin header. | daenney wrote: | Alright well fair enough. Looks like that's only been | supported since Fx 70 released somewhere in 2019. So maybe | don't do that depending on what you intend to block. But | then again it's been 3 years also. | | In general though the whole tone of parent of "I am owed | access to someone else's computer system on my terms and my terms | alone" just doesn't jibe with me. It's also not remotely | comparable to Cloudflare's approach of sitting in the | middle and then appropriating end-user compute resources | without their consent to fuel their business. | ceejayoz wrote: | > This one's a little user hostile to folks who use assistive | devices like screen readers. 
| | As long as you're using a <label> or aria-label attribute, | that shouldn't be an issue. | d2wa wrote: | (Author here.) I am. There are plenty of accessibility labels | in place. It's literally just the name attributes. No user | ever sees this, whether they're using assistive | technologies or not. It only confuses bots that assume | that the field named email is for the email address. | nijave wrote: | All that stuff is easily defeated by automated browsers | anyway (i.e. selenium) | mh- wrote: | Yes, but those automated browsers are much more expensive | to operate than simple HTTP clients _pretending_ to be | browsers. | | It's an arms race/defense-in-depth situation. If someone | truly wants to automate your site in a _targeted_ fashion, | and it's profitable for them to do so, you'll have to | invest a lot more in stopping it (and decide how much of it | is _worth_ stopping). | Aperocky wrote: | Even YouTube fails, with yt-dlp going as far as an internal | Python file that parses JavaScript and executes it. | LinuxBender wrote: | He's quite tame compared to me I suppose. I block anything | that is not HTTP/2.0 which _currently_ knocks out all the | bots and all crawlers except Bing. But I just have hobby | sites these days. Nobody would notice or care if my sites | went offline. | | Using nginx as an example: | | if | ($server_protocol != HTTP/2.0) { return 403 'Nope'; } | | Another thing I have found useful to drop some bots is to | become invisible to them. Many of the poorly written scanning | tools do not properly set MSS for reasons I still don't | understand. I use this to my advantage. | | Using iptables as an example: | | /sbin/iptables -t raw -I PREROUTING -i eth0 -p tcp -m tcp | --tcp-flags FIN,SYN,RST,ACK SYN -m tcpmss ! --mss 420:16384 | -j DROP | | Any TCP SYN packets setting a very low or high MSS, or missing MSS, | will be silently dropped. I drop about 35K packets per host | per day on average. This also drops hping3 floods.
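For readers who want the logic of that iptables rule spelled out, here is a minimal Python sketch of the same decision, assuming you already hold the raw TCP options bytes of a SYN packet. The function names `extract_mss` and `syn_allowed` are hypothetical and only for illustration; the 420:16384 bounds are copied from the rule above, and, as in the comment, a SYN with no MSS option at all is treated as droppable.

```python
from typing import Optional

# Bounds copied from the iptables example above (--mss 420:16384).
MSS_MIN = 420
MSS_MAX = 16384

TCP_OPT_END = 0  # end of option list
TCP_OPT_NOP = 1  # single-byte padding
TCP_OPT_MSS = 2  # maximum segment size: kind=2, length=4, 16-bit value


def extract_mss(options: bytes) -> Optional[int]:
    """Return the MSS from raw TCP options, or None if absent or malformed."""
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == TCP_OPT_END:
            return None
        if kind == TCP_OPT_NOP:
            i += 1
            continue
        # Every other option is: kind, length, then (length - 2) data bytes.
        if i + 1 >= len(options):
            return None
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            return None
        if kind == TCP_OPT_MSS:
            if length != 4:
                return None
            return int.from_bytes(options[i + 2:i + 4], "big")
        i += length
    return None


def syn_allowed(options: bytes) -> bool:
    """True if the SYN carries an MSS inside the accepted range."""
    mss = extract_mss(options)
    return mss is not None and MSS_MIN <= mss <= MSS_MAX
```

For example, `syn_allowed(bytes([2, 4, 0x05, 0xB4]))` accepts a typical Ethernet MSS of 1460, while `syn_allowed(b"")` rejects a SYN that sets no options at all, which is the case the comment says catches most scanners.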
| toast0 wrote: | > Many of the poorly written scanning tools do not properly | set MSS for reasons I still don't understand. | | MSS issues attract me like a moth to flame [1], so let me | ask some questions. | | It looks like this is dropping SYNs with MSS over 16384??? | That is indeed a pretty crazy high number. 9000ish seems | reasonable for someone on a jumbo network without an MSS | clamping router, but above that is someone weird for sure. | | Under 420 seems unlikely too, but technically acceptable, | but sure, I'd drop it. In theory, a proper OS will send | several SYNs with MSS, then assume your server doesn't | support TCP options and send you a SYN with no options. | Going to take a while, but if someone legitimately has an | MSS less than 536, their internet is probably pretty junky | anyway, so ok, seems fine. | | [1] I just built a browser based pmtud test site, | http://pmtud.enslaves.us/ | LinuxBender wrote: | You are right. I just happen to use a very safe range. If | I didn't care about anyone using jumbo frames I could set | the range to 1220:1536 and nearly all legit traffic would | pass just fine. 1220 (to 13xx) for the people using VPNs | and ip6-ip4 gateways. I just try to give really | conservative examples so that it is less likely I break | someone's unusual setup. Anything just over 9k is fine for | most jumbo-frame setups. | | All of this said, I could set the range to 1:65535 and it | would still drop most bots as they don't even bother to | set MSS at all in their scans. I'm not sure which tool | they are using. | judge2020 wrote: | IMO blocking bots isn't too big of a concern, the problem | is when a dedicated attacker realizes you serve valuable | data (in your HTML). Next thing you know, they're running | puppeteer or a similar remote controlled browser to scrape | your site, which is both undesirable in itself and the | scraper might overload your site/database by scraping with | no internal parallel request limit.
If you're not a startup | with an unlimited early cloud budget, it can be costly if | you want to handle both bot usage (including official API- | based or scraping bots) and regular users. | ceejayoz wrote: | The techniques described in that article are pretty reasonable | and shouldn't significantly impact users - swapping name/email | fields' names won't do a thing to you. There's also a | difference between "this one website doesn't work for me" and | "I've been blocked from half the Internet". | IshKebab wrote: | To be fair there's a difference between doing it for one site, | and doing it for a significant portion of the internet. | phantom_of_cato wrote: | Throwing an ad hominem is not cool. | [deleted] | jamespo wrote: | None of those techniques affect normal browsing | ufmace wrote: | I just read it, and I don't see any contradiction here. IMO, | he's recommending simple and direct anti-bot methods to web | admins specifically because it's better than relying solely on | Cloudflare etc for all bot blocking. He never recommends making | un-appealable access control decisions based on third-party | lists, and specifically recommends caution on methods that | might potentially impact innocent users. Seems perfectly | consistent to me. | ldoughty wrote: | I don't know the author or his reputation, but his suggestions | that you linked are (in my opinion) standard actions for any | dev/server admin getting spammed by their forms... And the | suggestions really only impact malicious actors accessing your | website from a script... Virtually none of those would be an | issue for any browser made in the last 15-20 years, or headless | browsers, but would break rudimentary scripts like entry level | hackers/spammers might use. | | He also specifically called out CAPTCHA as user-hostile. | superkuh wrote: | I guess like ctrl.blog you can't grasp the significance of | the issue until it happens to you. 
My Firefox fork is | definitely blocked by his algorithmic "bot" detector. Just | because your browser isn't doesn't mean it only blocks bots. | | False positives happen. They happen a lot more than you | think. And they are a serious problem. Even more serious when | it's Cloudflare, but arguing for everyone to implement these | algorithmic blocks "that won't inconvenience users" | individually, taken to its logical end, does the same. | [deleted] | ldoughty wrote: | I don't see the reason for the personal attack. | | The blog post also calls out that you should not block | based on user agent. | | If a form post didn't respect the action property having a | #, or couldn't cope with the name/email HTML names being reversed (while the | type is correct, and the user-displayed values are | correct), or didn't include hidden HTML form fields that have been | standard since ~97 (back when I made my first few websites), | I certainly would agree that it is likely a bot. | | Again, apparently this person has some hateful following, | but I don't appreciate you lumping me into this hatred for | agreeing with his statements on this one particular issue. | superkuh wrote: | You said, "And the suggestions really only impact | malicious actors accessing your website from a script." | and that was false. Since you didn't have experience | being blocked you couldn't know. Not till it happens to | you. I don't think pointing this out is a personal | attack. It's just the way people work. People don't | believe things are a problem until they become a problem | for them. | | You and others can keep quoting the legit and clever ways | to mitigate bot spam but if you ignore the false | positives the other checks create it kind of defeats the | point. | scarface74 wrote: | > I strongly discourage you from blocking or discriminating | against unknown or uncommon browser User-Agent request headers. | The web is weird and we as developers shouldn't discourage it.
| [deleted] | Melatonic wrote: | As much as I like Cloudflare now this is why long term monopolies | (not saying they are now) are bad | bastardoperator wrote: | Has he tried unplugging the router for 15 minutes and plugging it | back in? I jest but I know Comcast and Spectrum will both issue a | new IP address in that timeframe. | d2wa wrote: | (Author.) My ISP only rotates IPs when they reboot their | central equipment. Not enough to do it on my end. | bornfreddy wrote: | With some ISPs, they will issue a new IP if you change the | router's (WAN) MAC address. Might be worth a try next time | (crossing fingers you don't need it). | bastardoperator wrote: | This is what I've always seen too. I've never seen a | residential ISP that allocates static DHCP addresses, they | typically allocate in days which is why many people can | maintain a leased address for months on end. Once you go | offline though, all bets are off. Every ISP can determine | if the subscriber is disconnected and if they are, they're | going to reallocate your address. To your point, once the | MAC address is changed, they have to issue a new IP address | because using the logic posted above, the other address is | allocated to a different MAC. | dmix wrote: | IP bans by modern services like CF can't be solved that easily | in my experience. | bastardoperator wrote: | Clearly CF has a crystal ball /s. | | Once the IP address I don't own is released and assigned to | some other router how do you think CF determines the new IP | address for the individual/home? Unless this person is | running the CF Dynamic DNS service which gives CF the IP | address, I'm not sure CF would have any reasonable validation | techniques to determine who is what given the size of | residential networks. | aendruk wrote: | Cookie on their validation page? Browser fingerprint | hopping IPs in the same block? 
| dmix wrote: | Bingo | bastardoperator wrote: | So I've turned cookies off and switched to my iPad to | browse the internet for the evening, they have no | fingerprint, and no cookie... now what? | dmix wrote: | Are you on a different IP block? ISPs sometimes just | switch the last number. | | I had to use a VPN (a whole new IP) and a clean Chrome | install to bypass one of those "IP blocks" which was | combined with fingerprinting. | bbu wrote: | I think Cloudflare updated their bot detection algorithms because | we had multiple customers who complained that they get | challenged. I verified that they got a bot score of 1. As usual, | CF support is not that helpful... | synthetigram wrote: | Reputation systems should be based on /abuse/, not on automation. | I also ended up on the naughty list for running an archival | scraping program. Trying to preserve part of the Internet is | apparently against the rules. It's really a shame because my code | honors rate limits, doesn't spam, and is completely docile. | socialismisok wrote: | Is it plausible some ISP shared some IP address that was on | Cloudflare's list of suspicious IPs, or that some IoT device on | this person's network created a burst of suspicious traffic? | | I get that this sucks for the end user, but I wonder how much we | should blame Cloudflare vs the wider systemic challenges of | managing DDoS protection on the web. | laxis96 wrote: | I believe that might happen, but then I also believe it's the | ISP's responsibility to ensure that its IP addresses are kept | clean | socialismisok wrote: | For sure, the point I'm making is that there's a multi-party | transaction here, with systemic complexity. Makes it hard to | pin responsibility on just Cloudflare (or just the user or | just the ISP, etc). | yjftsjthsd-h wrote: | Cloudflare is the one blocking a user based on things that | aren't their fault; I'm happy to blame them.
| socialismisok wrote: | That's fine, but you are ignoring the broader picture if | you do. You've correctly identified a detail, but haven't | placed that detail in context. | yjftsjthsd-h wrote: | I'm not ignoring the context, I'm saying that it's | irrelevant. Cloudflare made the choice to block real | people based on factors outside of their control, and | then to market that product as a panacea; they don't get | to pass the buck, doubly so when they don't expose enough | information to let other people fix the things they | broke. | kazinator wrote: | > _For whatever reason, I must have done something that angered | Cloudflare_ | | I'm guessing: having an IP address close to (or outright reused | from and thus identical to) someone malicious, whom you know | nothing about. ___________________________________________________________________ (page generated 2022-09-20 23:01 UTC)