hngopher.com

       [HN Gopher] You don't want to be on Cloudflare's naughty list
       ___________________________________________________________________
        
       Author : merlinscholz
       Score  : 349 points
       Date   : 2022-09-20 14:26 UTC (8 hours ago)
        
 (HTM) web link (www.ctrl.blog)
 (TXT) w3m dump (www.ctrl.blog)
        
       | yamtaddle wrote:
       | Harsh blocking/limiting/challenging is way too valuable to sites
       | that are actually trying to make money online. It's not going
       | away short of legislation banning it. Losing 1/10,000 legitimate
       | customers to cut fraud attempts, spam, exploit attempts, and so
       | on, by 90% or more, is just too good a trade-off.
       | 
       | I have bad news about the most-likely fix for it, longer term, so
       | we can lay off the IP-based reputation stuff and the geo-
       | blocking: it's tying some form of personal ID to your browsing
       | activity, so _that_ bears the reputation instead of the address.
       | 
       | Sorry. Said it was bad news.
        
         | Waterluvian wrote:
         | I think this is true. It also reminds me of one possible
         | purpose of regulation and government, given the majority will
         | usually be happy to throw any sort of minority under the bus
         | for the "greater good."
         | 
         | This also reminds me of the anxiety of Google deciding to just
         | ban my account for some reason. They can't be bothered to
         | commit resources to making sure mistakes can be resolved. They
         | don't care to lose a fleetingly small percentage of customers.
         | 
         | Not sure I have an answer. Just a thought.
        
         | akira2501 wrote:
         | > Harsh blocking/limiting/challenging is way too valuable to
         | sites that are actually trying to make money online.
         | 
         | I'm not understanding the generalized sentiment here. How
         | would, for example, a retailer benefit from this strategy? How
         | does it protect their bottom line?
         | 
         | I can see how a particular kind of "facilitated user economy,"
         | such as games, gambling and promotional companies could
         | benefit, but it doesn't seem that broadly applicable to what
         | most people would consider a "mainstream" business.
         | 
         | > so we can lay off the IP-based reputation stuff and the geo-
         | blocking: it's tying some form of personal ID to your browsing
         | activity
         | 
         | And a new market for identity theft is born.
         | 
         | Also, as someone who serves content and geo blocks it, that's
         | not up to me, that's up to the owner of the content or whoever
         | happens to be licensing it for them. So, even if you sent me a
         | picture of your government ID, it changes nothing.
        
           | les_diabolique wrote:
           | > a retailer benefit from this strategy? How does it protect
           | their bottom line?
           | 
           | A couple of examples I can think of are blocking bots from
           | scraping their site for pricing and details, and stopping
           | resellers from buying up all of the stock (see sneakers,
           | electronics, etc). The last example doesn't directly impact
           | their bottom line, but it will make customers go elsewhere.
        
           | yamtaddle wrote:
           | > I'm not understanding the generalized sentiment here. How
           | would, for example, a retailer benefit from this strategy?
           | How does it protect their bottom line?
           | 
           | The amount of automated _and apparently-manual_ attempted
           | credit card fraud (and exploit attempts, for that matter) any
           | halfway-prominent site with a CC form is subjected to is hard
           | to appreciate if you've never seen it. It's _a whole lot_.
           | They aren't even necessarily trying to buy what you have,
           | but to validate that their stolen cards work. And they're
           | quite busy. If too much of that gets through--really, any
           | more than a _very_ tiny amount of it gets through--you're
           | gonna have an extremely bad time.
           | 
           | Various CC service providers like Stripe do provide tools to
           | try to block those attempts, but defense in depth is usually
           | a very good idea, including fairly aggressive firewall-level
           | blocking.
        
         | hot_gril wrote:
         | The other not-so-great approach is to act like a normal user.
         | This stuff doesn't tend to happen to the average Joe who
         | browses the WWW. It's when you're doing unusual (albeit
         | harmless) things.
        
         | jabbany wrote:
         | An alternative that preserves some privacy also doesn't seem
         | that hard to imagine... though it probably has its own can of
         | worms*.
         | 
         | Basically, the core problem is digital identities (accounts,
         | IPs, phone #s etc.) are cheap to create (even considering
         | captchas and all) so fraud is easy. The solution could be just
         | to make it "costly" to create new digital identities. For
         | example, you could get a "verified but anonymous" identity
         | issued by locking some assets (could be real world money, or
         | maybe something intangible like community reputation) as
         | collateral with a trusted party (or, for the crypto people, the
         | blockchain). If you misbehave, you lose your reputation on that
         | identity (and essentially your collateral) and have to start
         | over. This lets anyone bootstrap a "minimal" level of trust at
         | the beginning before they can use time to prove themselves
         | trustworthy.
         | 
         | Note: This model might remind some of things like staking in
         | crypto. However the idea is really not anything new... Putting
         | money on the line is really how most low-trust bootstrapping
         | happens.
         | 
         | *: To name a few: (1) this can result in participation being
         | gated by wealth, which can be unfair. (2) it makes accounts
         | more valuable to hack so people need better security practices
         | [re: twitter checkmark]. (3) one would need some authority to
         | decide how accounts lose their collateral or maybe the
         | collateral is just burned to create that initial credibility...
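
A minimal sketch of the collateral idea described above, assuming a single trusted registry that holds deposits and burns them on misbehavior. The names `StakeRegistry`, `minimum_stake`, `slash`, and `cost_of_sybil_attack` are hypothetical and do not refer to any real service or protocol.

```python
class StakeRegistry:
    """Hypothetical trusted party that holds collateral per identity."""

    def __init__(self, minimum_stake):
        self.minimum_stake = minimum_stake
        self.stakes = {}   # identity -> locked collateral
        self.burned = 0    # collateral forfeited by misbehaving identities

    def register(self, identity, deposit):
        # Creating an identity is only possible by locking enough collateral.
        if deposit < self.minimum_stake:
            raise ValueError("deposit below minimum stake")
        if identity in self.stakes:
            raise ValueError("identity already registered")
        self.stakes[identity] = deposit

    def is_trusted(self, identity):
        # Minimal bootstrap trust: the identity still has its stake locked.
        return self.stakes.get(identity, 0) >= self.minimum_stake

    def slash(self, identity):
        # Misbehavior forfeits the collateral; the owner has to start over.
        lost = self.stakes.pop(identity, 0)
        self.burned += lost
        return lost


def cost_of_sybil_attack(minimum_stake, identities):
    # Each throwaway identity costs one full stake, so cost scales linearly.
    return minimum_stake * identities
```

With a stake of 20 units, one honest identity costs 20, while a spammer cycling through 1,000 banned identities would forfeit 20,000. The sketch leaves open who decides when to slash, which is caveat (3) in the comment.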
        
           | mhink wrote:
           | > Basically, the core problem is digital identities
           | (accounts, IPs, phone #s etc.) are cheap to create (even
           | considering captchas and all) so fraud is easy. The solution
           | could be just to make it "costly" to create new digital
           | identities. For example, you could get a "verified but
           | anonymous" identity issued by locking some assets (could be
           | real world money, or maybe something intangible like
           | community reputation) as collateral with a trusted party (or,
           | for the crypto people, the blockchain). If you misbehave, you
           | lose your reputation on that identity (and essentially your
           | collateral) and have to start over. This lets anyone
           | bootstrap a "minimal" level of trust at the beginning before
           | they can use time to prove themselves trustworthy.
           | 
           | I've always thought that client certs would be an interesting
           | solution to this problem. Any given certificate can carry
           | signatures from multiple signing authorities, right? So we
           | could imagine a world where there are many different
           | certificate authorities, each of whom have their own criteria
           | for signing a particular certificate and each of whom offer
           | different varieties of assurance regarding the signature-
           | holder's identity.
           | 
           | From here, the question of "should I allow the user
           | identified by this client cert to use my service" simply
           | becomes a question of 1.) checking the validity of the
           | signatures of the client cert and 2.) deciding if the CA's
           | criteria for signing certs aligns with my desired userbase.
           | 
           | For example, a particular CA might insist that their users go
           | through some real-world process to renew their certification
           | every few years, but when they sign a cert it means that the
           | bearer has been strongly vetted as a real person.
           | 
           | An interesting side effect of this auth model is that a
           | service provider accepting certs from a particular CA has
           | someone to complain to if a user bearing their signature acts
           | improperly on their platform. You could imagine a CA which
           | has a code of conduct expected of the users whose certs they
           | sign, and would perhaps revoke a user's certification if too
           | many websites complain.
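
A rough sketch of the two-step admission check described above, under loud assumptions. Standard X.509 certificates carry a single issuer signature, so "multiple signatures" is modeled here as a plain dictionary of attestations. HMAC with a per-CA secret is only a standard-library stand-in for real asymmetric signatures, and the CA names, keys, and the `vets_real_person` policy flag are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical CA registry. Real CAs sign with asymmetric keys; HMAC is used
# here only so the sketch runs with the standard library.
TRUSTED_CAS = {
    "strict-ca": {"key": b"strict-secret", "vets_real_person": True},
    "lenient-ca": {"key": b"lenient-secret", "vets_real_person": False},
}


def sign(ca_name, subject):
    """Produce the stand-in signature a CA would issue for `subject`."""
    key = TRUSTED_CAS[ca_name]["key"]
    return hmac.new(key, subject.encode(), hashlib.sha256).hexdigest()


def valid_attestations(subject, attestations, revoked=frozenset()):
    """Step 1: keep only known, unrevoked CAs whose signature checks out."""
    good = []
    for ca_name, signature in attestations.items():
        if ca_name not in TRUSTED_CAS or (ca_name, subject) in revoked:
            continue
        if hmac.compare_digest(sign(ca_name, subject), signature):
            good.append(ca_name)
    return good


def should_admit(subject, attestations, require_real_person, revoked=frozenset()):
    """Step 2: admit if some valid CA's signing policy fits this service."""
    for ca_name in valid_attestations(subject, attestations, revoked):
        if TRUSTED_CAS[ca_name]["vets_real_person"] or not require_real_person:
            return True
    return False
```

The `revoked` set stands in for the complaint path mentioned above: a CA that pulls a user's certification after too many reports would add that pair, and the user would stop being admitted.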
        
             | unwise-exe wrote:
             | That's not safe for a lot of sites, though.
             | 
             | I hear that porn tends to be officially frowned on in a
             | fair number of places.
             | 
             | Reading non-approved news is dangerous in some places.
             | 
             | Honestly _debating_ political topics can be super dangerous
             | if you're identifiable.
             | 
             | Sometimes even having a login on a site is dangerous, I
             | think I heard about this after a non-mainstream discussion
             | site got hacked like a year and a half ago.
        
           | georgyo wrote:
           | Your idea comes from a good place, but identity theft is
           | already a thing in the real world. Digital identities would
           | also be very stealable. This would make malware more harmful
           | in the long term. Imagine if your Twitter gets hacked and your
           | digital identity makes it so your Gmail gets blocked.
           | 
           | Similarly, the internet is already very difficult for
           | people with limited means. This would make it even harder.
        
         | [deleted]
        
         | smsm42 wrote:
         | They are already testing out digital IDs. Now link that to the
         | social score... and make the browsers and the sites exchange
         | these data on the background, and make frontend services
         | providers refuse connections from non-supporting browsers as
         | "bots"...
        
         | tboyd47 wrote:
         | How does having a personal ID tied to browsing activity help
         | with spam? Are spammers not real people with IDs?
        
           | les_diabolique wrote:
           | Spammers typically implement bots to carry out tasks. I mean,
           | technically at some point a spammer is a real person, but
           | when you're automating tasks and using bots, it's not at the
           | same scale.
        
             | notsapiensatall wrote:
             | So what happens when your ID gets hacked and reused for
             | fraudulent activity?
             | 
             | Would you have to submit a dispute with the internet credit
             | agencies? Maybe join a class action suit against the entity
             | that leaked your ID so that they're forced to give you a
             | year of free internet identity monitoring?
        
               | jamie_ca wrote:
               | Then you need to deal with levels of rate-limiting that
               | are fine for individuals but make it not feasible for
               | spammers.
               | 
               | Keeping with the cloudflare topic, if Cloudflare only
               | permits you 10 requests per second (HTML + JS/images)
               | that's still usable for web browsing, but someone running
               | a cloud of hundreds of bots would be effectively shut
               | down. Similarly with email, an individual probably
               | doesn't need to send more than one email per 10 seconds
               | but email spammers wouldn't find any ROI at that rate -
               | business needs being different might necessitate a
               | different registry or something in that case.
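
The per-identity rate limit described above could be sketched as a token bucket. This is a minimal illustration and not how Cloudflare implements anything. The 10 requests per second figure comes from the comment; the class name, the `simulate` helper, and the burst capacity are assumptions.

```python
import time


class TokenBucket:
    """Allow about `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, now=None):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self.tokens = float(capacity)  # start with a full bucket
        self.last = time.monotonic() if now is None else now

    def allow(self, now=None):
        # `now` can be injected so the behavior is testable without sleeping.
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.last)
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def simulate(bucket, requests_per_second, seconds):
    """Count how many evenly spaced requests the bucket lets through."""
    allowed = 0
    for i in range(requests_per_second * seconds):
        if bucket.allow(now=i / requests_per_second):
            allowed += 1
    return allowed
```

Someone making 2 requests per second never notices a 10-per-second bucket, while a client pushing 1,000 per second through the same identity gets only a small fraction through. The limit only bites if each bucket is keyed to something that is costly to multiply, which is the point of the ID discussion.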
        
               | smsm42 wrote:
               | The same thing that happens now when somebody steals your
               | identity and ruins your credit history. You'll have to
               | live in a bureaucratic hell for the next couple of years.
               | And yes, as compensation, you'll get $6.99 worth of
               | services from the guilty party. If you win the class
               | action suit, that is.
        
               | notsapiensatall wrote:
               | Exactly. Why on earth would we want to replicate such a
               | terrible system online?
               | 
               | We should be reforming our current credit agency system,
               | not empowering it with a new mandate of judging
               | somebody's social or political creditworthiness.
        
               | mcguire wrote:
               | Nobody said it wouldn't suck. The only question is
               | whether it sucks less than the alternatives.
        
           | adamckay wrote:
           | Of course, but the theory is it's restricting 1 real person
           | to 1 account, versus 1 spammer creating 1,000 accounts via
           | automation.
           | 
           | And once your spammer has been identified then that's them
           | banned/removed, unable to sign up again.
        
             | tboyd47 wrote:
             | What's to stop them from using fake IDs?
        
       | thayne wrote:
       | I haven't experienced it as badly as the author. But I find the
       | cloudflare page checking that I am using a "secure browser" very
       | frustrating. I seem to get it the most for gitlab pages for some
       | reason.
        
       | marcus_holmes wrote:
       | I use a VPN, for perfectly legitimate reasons (I travel a lot,
       | and most internet services assume that your IP address also
       | indicates your nationality, citizenship, language, bank account
       | country, etc. Being able to change IP source country is vital).
       | 
       | Some VPN exit addresses have obviously been flagged as "bad" by
       | Cloudflare and I get challenged with CAPTCHAs from some
       | countries. It's an interesting experience, but luckily my VPN
       | provider has enough exits that I can usually switch to one that
       | has better reputation with Cloudflare.
       | 
       | Obviously, none of this is helping the internet be a better place
       | from my point of view. I get that it's part of the ongoing fight
       | against bots and spam, but it always feels so arbitrary. IP
       | addresses are interchangeable, folks - they say nothing about the
       | nature of the request. Or rather, for a large majority they do,
       | but there's a minority of us who don't obey those rules and resent
       | getting caught up in it.
        
       | jasonlotito wrote:
       | Yeah, this just continues to reinforce my opinion of Cloudflare.
       | It's not something I would ever recommend, and there are numerous
       | other superior options out there. I see Cloudflare failing
       | frequently enough that if it were something I was responsible
       | for, I'd be embarrassed at the very least.
        
         | tire-fire wrote:
         | What superior options would you recommend that are privacy
         | focused and free?
        
         | dedward wrote:
         | I'm curious if you've had experience with their enterprise
         | package?
         | 
         | I can understand people's gripes about things on the free/cheap
         | packages, where Cloudflare makes decisions for you, sometimes
         | ones you don't like.
         | 
         | But as an enterprise customer, I've never found it to be
         | anything short of fantastic - I can tailor it to behave exactly
         | how I want, and not interfere with my customers.
        
           | johnklos wrote:
           | Your response seems to ignore the very article being
           | discussed.
           | 
           | Or are you suggesting that if you're having trouble visiting
           | sites because of Cloudflare, you should become an enterprise
           | customer? (slightly sarcastic, but not completely)
        
             | dedward wrote:
             | My response is simply trying to understand where you are
             | coming from. You've mentioned there are numerous superior
             | options and you would never recommend it.
             | 
             | I'm wondering (genuinely!) if you are speaking as an
             | enterprise customer or a free plan, or what.... both for
             | the sake of meaningful discussion and potentially learning
             | about even better options for my own work.
             | 
             | As to the article - I fully believe the responsibility lies
             | with site owners to pick and choose how they want to serve
             | their sites. Nobody is forcing them to use Cloudflare on a
             | free plan, or to ignore any analytics it provides and make
             | sure it is serving their customers correctly. Cloudflare is
             | one piece of a delivery solution, and only works as well as
             | you configure it. If your decision for your app is "I'll
             | just use the free plan, and let Cloudflare decide
             | everything for me" then you get what you pay for.
             | 
             | If Cloudflare is getting in their way, they can go
             | somewhere else.
        
       | DethNinja wrote:
       | There is a chance you might've been hacked.
       | 
       | You would be surprised to see how easy it is to hack domestic
       | routers.
       | 
       | 1. Find and disinfect the devices, including the router. If you
       | don't have enough technical knowledge, then buy a new router.
       | 
       | 2. Use a 30-character random password on the router.
       | 
       | 3. Disable UPnP.
       | 
       | 4. Anything with Wi-Fi and a weak password can be hacked within
       | minutes, so check your other devices as well, especially IoT
       | ones.
        
         | mh- wrote:
         | My assumption is also that something on his network is
         | compromised, and getting his IP into reputation issues.
         | 
         | Tarpitting (serving content slowly from the edge, in order to
         | slow down bots) is necessarily one of the most expensive tools
         | in a WAF/CDN's toolbox.
         | 
         | It's _much_ more likely that something on his network is
         | sending sketchy traffic to CF-fronted/Google sites, and the
         | slow loading he's experiencing elsewhere is because his
         | upstream is being saturated by whatever is happening on his
         | network.
        
         | d2wa wrote:
         | (Author here.) My router isn't a domestic router. It's a
         | MikroTik running RouterOS, completely unsupported by the ISP.
         | Outgoing connections and DNS are logged. UPnP is only allowed
         | for the Xbox, PS4, and off-most-of-the-time gaming PC. Nothing
         | out of the ordinary in the logs.
        
           | alexforster wrote:
           | > It's a MikroTik running RouterOS
           | 
           | https://google.com/search?q=mikrotik+botnet
           | 
           | These things are the absolute scourge of the internet.
        
           | aaronmdjones wrote:
           | > It's a MikroTik running RouterOS
           | 
           | It's almost certainly compromised.
        
         | malfist wrote:
         | Why would you disable UPnP? You're gonna break most
         | collaboration tools/video games/etc.
        
           | kunwon1 wrote:
           | Disabling UPnP doesn't break much. I've used enterprise
           | firewalls at home for years, none of them have UPnP, I've
           | never noticed a problem arising from that lack. I don't have
           | a problem with video games or collaboration tools
           | 
           | UPnP allows devices inside your network to open ports to the
           | outside world without your knowledge. I think everyone should
           | avoid it if they can get by without it
        
             | d2wa wrote:
             | It's absolutely required for most multiplayer games. Many
             | need random ports and some even refuse to work if UPnP is
             | blocked even if you manually open a port for them.
        
               | aaronmdjones wrote:
               | I've never had UPnP enabled and I don't have any problems
               | doing online gaming / flight sim / video chatting / etc.
        
             | malfist wrote:
             | What's your solution for the grandmother who just wants to
             | make a Zoom call to her grandson? Have her log into her
             | router portal and set up a static IP for her laptop and then
             | port forwarding routes for Zoom?
        
               | Karrot_Kream wrote:
               | STUN servers? Also, while I (not GP) do think UPnP is
               | dangerous, I also think it's only something you disable
               | if you _know_ you can live without it.
        
               | thayne wrote:
               | I don't think zoom uses UPnP. If it did, that would cause
               | problems on corporate networks that typically have UPnP
               | disabled.
        
           | [deleted]
        
           | zinekeller wrote:
           | To be frank, that's exactly the problem with NAT-PMP et al.,
           | assuming that there are no router bugs: the ability to forward
           | ports has been abused to set up bot relays on hacked IoT
           | devices. This is why I predict that even in the IPv6 era we
           | would still have to rely on a TURN-equivalent.
        
             | malfist wrote:
             | That's exactly the problem with NAT-PMP?
             | 
             | So what's your alternative for peer to peer connections?
             | Static routing that the common end user can't figure out?
             | Re-centralize connections?
             | 
             | UPnP is necessary.
        
               | zinekeller wrote:
               | I'm simply pointing out the problem, a real-world and
               | realistic problem, and you're acting like it's a
               | non-issue. Point me to a CGNATted network which has enabled
               | port forwarding. Does it break a lot of things? Oh,
               | absolutely. Have the carriers still not activated it? Yes.
               | Automatic port forwarding is only beautiful when you know
               | how your device would react. It's ugly when you're a
               | network administrator who doesn't control all devices.
               | 
               | There is no "perfect" solution here because the real
               | world is a messy place with devices that you cannot
               | personally vouch for.
        
       | nuc1e0n wrote:
       | This story shows Cloudflare is now harming legitimate users, is
       | an effective monopoly and as such should be broken up.
        
       | jabroni_salad wrote:
       | Do you have an ISP-provided email account that you never check?
       | You might want to check it to see if you have any botnet
       | notifications.
        
       | simple-thoughts wrote:
       | There's a real lack of education I've seen in developers for
       | small projects who go directly to cloudflare for anything and
       | everything. They don't understand that they are immediately
       | losing a large chunk of their user base who is either from the
       | third world or is privacy literate. Devs working on projects that
       | are targeting those groups need to understand the tradeoffs from
       | using cloudflare.
        
       | shiomiru wrote:
       | If you'd like to experience this treatment first-hand, try
       | surfing the web using the Tor Browser.
       | 
       | Spoiler alert: many websites simply refuse to load at all (e.g.
       | any google service, and lots of websites "protected" by CF).
       | Captchas are everywhere: in many cases, you can't even complete
       | simple GETs of blogs without donating free labor to CF.
       | 
       | And the most infuriating part, you get CF marketing messages
       | right in your face while your browser is calculating hashcash (I
       | guess?)... At this point I can recognize every single one of
       | them: something about bots making up 40% of all internet traffic,
       | something about their web scraper protection racket, something
       | about small businesses (???), etc etc...
       | 
       | To be fair, Tor exit nodes have an awful reputation for sure.
       | Nevertheless, I have a hard time forgiving how CF makes browsing
       | the Internet hell for those who actually need Tor.
        
         | yjftsjthsd-h wrote:
         | > And the most infuriating part, you get CF marketing messages
         | right in your face while your browser is calculating hashcash
         | (I guess?)... At this point I can recognize every single one of
         | them: something about bots making up 40% of all internet
         | traffic,
         | 
         | Yeah, there's something amazingly aggravating about CF telling
         | you how much traffic is bots _while showing that they can't
         | distinguish you from a bot_.
        
           | robocat wrote:
           | CloudFlare are creating a new division for advertising to
           | bots. They have projected that in the near future, bots will
           | be 90% of spending, so the bot demographic is the most
           | important to target, marketingwise.
           | 
           | The fact that humans are seeing the traffic meant for bots is
           | an unfortunate side-effect.
           | 
           | I personally welcome our future bot overlords (not only
           | because being unwelcome might be unhealthy for me -- why
           | would I publicly disagree with an overlord or not want to be
           | their friend?).
        
         | jasonfarnon wrote:
         | I routinely use Youtube with Tor. I will occasionally get
         | kicked off with a "suspicious traffic" message, but it isn't my
         | experience that it "refuses to load at all".
        
         | synthetigram wrote:
         | Cloudflare has mixed up the definitions of "bot" and "abuse".
         | Tor users may or may not be bots, but as long as they don't
         | abuse (spamming or DoS), they ought to be treated the same.
        
       | sampa wrote:
       | If an ordinary user had to deal with google/CF bs every day
       | as I do, they'd burn their computer.
       | 
       | PS Proud user of Firefox + resistFingerprinting=true
       | 
       | PPS Ain't nothing better than CF guard page constantly-reloading
       | on 20% of sites if you open some url :( No, fella, you first have
       | to open the root '/' page so that guard page finally can either
       | pass me through or show the cloudflare captcha. Ugh. Progress,
       | they say.
        
       | johnklos wrote:
       | Imagine all the people in countries deemed less desirable by
       | Cloudflare that go through this all the time. Cloudflare, whether
       | it's their stated goal or not, is re-stratifying and re-
       | centralizing the Internet because of their desire to be a
       | monopoly, and we'll all suffer as a result.
        
         | andrewnyr wrote:
         | there are multiple other large CDNs out there... it's a lot more
         | like 5 market leaders tbh
        
           | johnklos wrote:
           | But how many of them:
           | 
           | 1) refuse to take responsibility for content they host by
           | claiming they don't host
           | 
           | 2) discriminate against huge parts of the Internet with no
           | publicly known rules, nor methods to change that
           | discrimination
           | 
           | 3) make the abuse reporting process intentionally difficult
           | and time-consuming
           | 
           | 4) want to aggregate all the DNS data they can by making a
           | deal with Firefox to turn on DNS-over-https by default
           | without asking or even informing end users
           | 
           | 5) want to re-centralize the Internet, in part so they can
           | mix bad actors with good, in ways that make blocking next to
           | impossible
           | 
           | How many of them do the discrimination we're all writing
           | about here?
        
             | easrng wrote:
             | tbh I think one of the very few positives of having so many
             | sites going through a few CDNs is that you can make it
             | impossible to block a protocol or site without significant
             | collateral damage, which can be a good thing, things like
             | Tor's meek bridge rely on that.
        
             | andrewnyr wrote:
             | > 1) refuse to take responsibility for content they host by
             | claiming they don't host
             | 
             | CDNs don't host content, they proxy it.
             | 
             | > 2) discriminate against huge parts of the Internet with no
             | publicly known rules, nor methods to change that
             | discrimination
             | 
             | Not large parts of the internet, scammy and attacky parts of
             | the internet. If the rules were public they wouldn't be
             | effective.
             | 
             | > 3) make the abuse reporting process intentionally difficult
             | and time-consuming
             | 
             | Simply untrue, every abuse report I have filed has had an
             | answer back within 24hrs.
             | 
             | > 4) want to aggregate all the DNS data they can by making a
             | deal with Firefox to turn on DNS-over-https by default
             | without asking or even informing end users
             | 
             | This is a good thing, as they are audited as not keeping logs
             | of DNS queries.
             | 
             | > 5) want to re-centralize the Internet, in part so they can
             | mix bad actors with good, in ways that make blocking next
             | to impossible
             | 
             | Again, every CDN centralizes the internet, and many sites
             | need this protection.
        
       | PaulHoule wrote:
       | The rise of Cloudflare is the first real threat I've seen to
       | ordinary people running webcrawlers.
        
         | mschuster91 wrote:
         | Tragedy of the commons, unfortunately. There were a bunch of
         | cases where web crawlers and scrapers built competitive
         | services on the back of the services they scraped, some of
         | these ending up in courts [1].
         | 
         | [1] https://www.derstandard.at/story/1389860104020/eu-
         | gerichtsho...
        
         | lorey wrote:
         | Which is in turn a threat to the open web in general. Could not
         | agree more.
        
       | adamsb6 wrote:
       | Does the author have a fixed IP?
       | 
       | If not, figure out how to get a new one and see if the blocking
       | recurs. If it does, the bad activity is probably coming from
       | inside the house -- or CloudFlare has a way to identify you
       | across an IP change.
        
         | d2wa wrote:
         | The author, me, does have a dynamic IP, but it only changes
         | once every two years or so.
        
       | digitailor wrote:
       | Not saying that this is the case here, but this may be possible
       | due to having a bad tab open. Especially over cellular. Haven't
       | looked into it with any depth, but I've had correlations on a
       | much shorter timeframe. Suddenly, CloudFlare and/or Google start
       | questioning my humanity, so I close all tabs. Then okay. Sloppy
       | hypothesis with no evidence: JS gone haywire
        
       | rubyist5eva wrote:
       | Cloudflare is on _my_ naughty list. I actively advocate against
       | people using them.
        
       | robjan wrote:
       | I have two dedicated home internet IPs (one iCable fibre and a
       | China Mobile 5G fallback/quarantine WiFi) and get these "checking
       | if your internet connection is secure" interstitials all the time
       | now. Also see them on my HKBN work connection.
       | 
       | I'm from Hong Kong and suspect the whole territory is on the
       | naughty list.
        
       | JCWasmx86 wrote:
       | Couldn't you use e.g. the DSGVO/GDPR in the EU to get all the
       | information about your IP, everything cloudflare has stored about
       | it until you find the root cause?
        
       | unity1001 wrote:
       | It's amazing how Cloudflare became another tech monopoly that can
       | decide the lives of ordinary people in a totally unregulated,
       | private fashion.
        
       | therealmarv wrote:
       | If you surf desktop sites from the Philippines on a mobile phone
       | plan (which is often the best Internet connection in that
       | country) you also get Cloudflare's captchas everywhere.
       | 
       | I've said it before and I'll say it again: Cloudflare is dividing
       | the World between first and second/third World countries with
       | their captchas. I call it discrimination against second/third
       | World countries! If you are from the US or Europe you will never
       | notice it, but if you travel a little bit more you see these
       | blocking captchas everywhere.
        
         | chrismorgan wrote:
         | I've had a similar experience in India with wired internet from
         | a local ISP: CGNAT is used so there are who knows how many
         | customers on the same IPv4 address,
         | https://iknowwhatyoudownload.com/ shows at least forty hours of
         | movies being downloaded every day, the IP address is on half
         | the blacklists out there because _someone_ is part of an email-
         | sending botnet, and yeah, Cloudflare hates you.
        
         | Jamie9912 wrote:
         | Maybe your mobile ISPs don't do enough to stop malicious/spam
         | traffic. That's not Cloudflare's fault
        
           | therealmarv wrote:
           | It only affects Cloudflare hosted sites though.
        
         | thewebcount wrote:
         | I get it browsing from a major ISP in the US. I have the gall
         | to browse in private mode and to block trackers and ads because
         | of all the malware they contain. (And I don't use a browser
         | that requires me to login just to browse the web - gasp!) And
         | apparently, that means I'm worthy of this sort of punishment as
         | well.
        
         | aendruk wrote:
         | The other side of this story is that PLDT stands out from other
         | residential networks as a persistent source of web form spam.
         | I'd love to learn what's going on differently there.
        
         | Dma54rhs wrote:
         | I get these a lot and I'm from the EU. But it's "seasonal".
        
         | ReptileMan wrote:
         | I am from Europe and I notice it if I use a non-residential IP.
         | The captchas are extremely annoying, especially when trying to
         | access a site I have already logged into with 2FA. Who is
         | protected in this case?
        
       | cft wrote:
       | I actually think that Cloudflare is setting up the foundation of
       | Chinese style (but privately outsourced in the US case)
       | censorship machinery in the US. Between their AI erroneously
       | flexing its power, the Kiwifarms scandal and similar, they are
       | emerging as a rival to Google in its censorship effort. One of
       | the most dangerous companies on the internet.
        
       | smsm42 wrote:
       | So this gets me thinking. We know Cloudflare will boot a site if
       | they really don't like them. Now, what happens if Cloudflare
       | doesn't like _you_? I mean, really really doesn't like. Maybe,
       | you said something wrong online or participated in a wrong group
       | activity, or something like that. Is it the case that they have
       | the power to essentially deny you (provided you have a static IP
       | and don't use VPN, say) access to a major part of the Internet?
       | And you can do absolutely nothing about it?
       | 
       | I know they haven't done anything like that yet. But the
       | technical capability is there, and we all know how short is the
       | distance between technical capability and doing it, when the
       | appropriate pressure is applied. So I wonder, how long before
       | activists start demanding for CF to boot people from the
       | internet, and how long before CF caves in to that...
        
         | [deleted]
        
       | neurostimulant wrote:
       | If your ISP is using CGNAT, sooner or later you're going to
       | experience this problem. When this happened, I had to use a VPN (I
       | use mullvad) to reduce the amount of cloudflare challenges I got.
       | Pretty funny because usually I get more challenges when using a
       | VPN instead of the other way around. The Privacy Pass extension
       | also seems to help a bit.
        
       | jgrahamc wrote:
       | _Well into the second day of Cloudflare's blockade of my home
       | internet connection, Google Search also began blocking requests.
       | It required me to resolve a CAPTCHA challenge for every other
       | search. This luckily only lasted a day._
       | 
       |  _Cloudflare shares IP reputation data with partners like Google,
       | coordinated through a program called the Bandwidth Alliance. So,
       | my original offense might not even have been against Cloudflare.
       | It might have received the reputation data from a partner, and it
       | just propagated through the Bandwidth Alliance network._
       | 
       | That's not what Bandwidth Alliance is at all. It's about reducing
       | or eliminating egress fees between a cloud provider and
       | Cloudflare. Not sure where the idea that it's about sharing IP
       | reputation data comes from.
       | 
       | https://www.cloudflare.com/bandwidth-alliance/
       | 
       | So, if Google Search started showing a CAPTCHA that's not
       | Cloudflare.
        
         | xani_ wrote:
         | > Not sure where the idea that it's about sharing IP reputation
         | data comes from.
         | 
         | Probably from the scam called mail blacklists
        
         | [deleted]
        
         | plumeria wrote:
         | It is interesting that the Bandwidth Alliance partners list
         | shows pretty much every big cloud provider except AWS and
         | Akamai [0]
         | 
         | [0] https://www.cloudflare.com/bandwidth-alliance/
        
         | throwawayays wrote:
         | The tone of this reply is a bit shit from a PR perspective.
         | 
         | How about _also_ pointing to a knowledge base article for how
         | an end user could go about working out what network activity
         | from their IP might be flagging Cloudflare's systems?
        
         | phantom_of_cato wrote:
         | But that's beside the main point. You guys are essentially the
         | "single point of failure" for half the internet. [1] Being
         | competent and smart doesn't really help too much, as
         | demonstrated by how you guys had to give in to the pressure to
         | censor recently.
         | 
         | [1]: https://easydns.com/blog/2020/07/20/turns-out-half-the-
         | inter...
        
         | TakeBlaster16 wrote:
         | Can you acknowledge the main point of the article? What should
         | someone do if they find themselves misclassified by
         | Cloudflare's systems?
        
           | mh- wrote:
           | _(not the parent commenter)_
           | 
           | That person should start with the assumption they _haven't
           | been misclassified_ and eliminate the possibility that a
           | device on their network is compromised.
        
             | JohnFen wrote:
             | A task that would be made much easier and less likely to
             | miss something if the affected person had some indication
             | as to what the problem was.
        
               | buildbot wrote:
               | Devil's advocate - would it not then be pretty easy to
               | engineer malicious bots to avoid detection?
        
             | d2wa wrote:
             | (Author here.) That's missing from the article. But I have
             | logs of the network. There's nothing out of the ordinary.
             | "I don't know what I did wrong," as I started the article,
             | means "I've checked logs and such and there's no indication
             | of anything wrong on my end."
        
         | tomxor wrote:
         | FYI, this guy is far from alone, your "protection" has given me
         | a lot of grief over the past few years, particularly on highly
         | NATed mobile networks.
         | 
         | I've been gradually removing cloudflare based CDNs from
         | services I develop and control because I don't want my users
         | being arbitrarily discriminated against.
         | 
         | There was a good article posted on HN recently titled "The
         | ideal level of fraud is non-zero" which I think is highly
         | relevant here... In essence any mechanism employed to prevent
         | illegitimate use comes with a negative cost to legitimate
         | users, if that cost is too high it defeats the purpose. i.e.
         | what's the point in a website that is completely immune to a
         | botnet and also cannot be accessed by anyone else? Unplugging
         | the ethernet cable also effectively protects against botnets.
         | More subtly the cost of outright rejecting some legitimate
         | users is usually not worth the savings of rejecting 100% of
         | illegitimate ones. I think Cloudflare's service has it the
         | wrong way around: it currently accepts blocking legitimate users
         | far too easily, that is not an acceptable cost; whereas you
         | should be letting a higher level of bots through to avoid
         | pissing off legitimate users - if it's not obviously a DDoS,
         | it's probably worth the bandwidth cost.
         | 
         | Consider the bigger picture, if you save a sliver of a penny
         | by blocking a bot, but also end up blocking or seriously
         | inconveniencing 10 real users... is it worth it?
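The trade-off in the comment above can be put into rough arithmetic. This is a minimal sketch with purely hypothetical numbers: the function name `net_benefit` and every figure passed to it are assumptions for illustration, not measurements from Cloudflare or any real site.

```python
def net_benefit(bots_blocked, saving_per_bot, users_blocked, loss_per_user):
    """Money saved by blocking bots minus money lost by blocking real users."""
    return bots_blocked * saving_per_bot - users_blocked * loss_per_user


# Hypothetical figures: serving one bot request is assumed to cost a quarter
# of a cent, and each wrongly blocked visitor is assumed to be worth 50 cents.
result = net_benefit(bots_blocked=1, saving_per_bot=0.0025,
                     users_blocked=10, loss_per_user=0.50)
print(f"net benefit in dollars: {result:+.4f}")
```

With any inputs where a real visitor is worth more than a bot request costs, a filter that blocks more users than bots comes out negative, which is the commenter's point.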
        
           | dmix wrote:
           | Cloudflare just isn't worth the tradeoffs: the risks
           | associated with their centralization, how they made Tor
           | basically unusable on non-onion sites, the lack of
           | transparency when content-moderating the internet, etc.
           | 
           | The space is in need of solid competitors to break the
           | stranglehold they have on the internet. Whether it's the
           | right combination of services, documentation, etc.
        
             | thaumaturgy wrote:
             | Tor made Tor unusable on non-onion sites. I feed a
             | netfilter table with the list of exit node IPs that Tor
             | publishes (https://check.torproject.org/torbulkexitlist) as
             | a standard part of server deployment, and it's the single
             | most effective way to reduce form and login abuse on hosted
             | sites. I like the idea of Tor, but there's no denying that
             | it's a huge source of nuisances.
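A minimal sketch of the kind of deployment step described above, assuming the published list is plain text with one IPv4 address per line. The table name `filter` and set name `tor_exits` are hypothetical, and the script only builds an `nft add element` command string; it does not fetch the list or touch the firewall.

```python
import ipaddress


def parse_exit_list(text):
    """Return the sorted, de-duplicated IPv4 addresses found in the list text."""
    addrs = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        try:
            addrs.add(ipaddress.IPv4Address(line))
        except ipaddress.AddressValueError:
            continue  # skip anything that is not a plain IPv4 address
    return [str(a) for a in sorted(addrs)]


def nft_add_elements(addrs, family="inet", table="filter", set_name="tor_exits"):
    """Build (but do not run) an nft command adding addrs to a named set."""
    return f"nft add element {family} {table} {set_name} {{ {', '.join(addrs)} }}"


# Stand-in for the downloaded list; these are documentation addresses.
sample = "192.0.2.10\n# comment\n192.0.2.2\nnot-an-ip\n192.0.2.10\n"
print(nft_add_elements(parse_exit_list(sample)))
```

Validating each line before it reaches a firewall command avoids passing arbitrary downloaded text to a privileged tool.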
        
               | shaky-carrousel wrote:
               | I live in a country with censored internet. What you are
               | doing is harmful. I can only hope whatever you provide is
               | irrelevant enough.
        
               | thaumaturgy wrote:
               | I'm sorry. I have a colleague based out of Venezuela.
               | We've had to work together to get tunnels and vpns
               | configured so that he can get uncensored and secure
               | internet access.
               | 
               | But Tor is an enormous source of abusive traffic and if I
               | don't filter it, then that's harmful to site owners. I'm
               | being forced to choose between the needs of people that I
               | know, work with, and depend on financially, and the needs
               | of people in countries with issues that are far outside
               | my ability to resolve. It's not a hard decision.
        
               | justsomehnguy wrote:
               | > It's not a hard decision.
               | 
               | Depends on what you mean by 'hard'.
               | 
               | As an IaaS provider I endured all the hurdles around that,
               | and ten years later - I don't care, at least not until my
               | outbound bill is bigger than usual.
               | 
               | Like, some of the clients are on CentOS 6, on public-facing
               | machines.
        
               | parroteal wrote:
               | I'm a noob, can you give me a pointer?
               | 
               | What kind of abusive traffic is coming through Tor and
               | why do they do it?
        
               | remus wrote:
               | Say you're running an account take over script that spams
               | login forms with a list of known username and password
               | combos. If a website owner sees thousands of login
               | attempts coming from a single IP address they're likely
               | to block you to prevent abuse on their website. This is
               | annoying for you as you then need to rotate your IP
               | address.
               | 
               | Using tor hides your IP address from the website and
               | makes switching exit nodes very straightforward, so you
               | can run your account take over script in peace.
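The per-IP blocking described above can be sketched as a sliding-window counter. This is an illustration only: the class name `LoginThrottle` and the limits are assumptions, not any particular site's implementation.

```python
from collections import defaultdict, deque


class LoginThrottle:
    """Allow at most max_attempts login attempts per IP within window_seconds."""

    def __init__(self, max_attempts=5, window_seconds=60.0):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self._attempts = defaultdict(deque)  # ip -> timestamps of recent attempts

    def allow(self, ip, now):
        recent = self._attempts[ip]
        # Forget attempts that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_attempts:
            return False  # over budget: reject without recording the attempt
        recent.append(now)
        return True


throttle = LoginThrottle(max_attempts=3, window_seconds=10.0)
single_ip = [throttle.allow("198.51.100.7", t) for t in (0, 1, 2, 3)]
rotated = throttle.allow("198.51.100.8", 3)  # a fresh address gets a fresh budget
print(single_ip, rotated)
```

The last call shows why rotating exit nodes defeats this kind of check: every new address starts with an empty history.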
        
               | thaumaturgy wrote:
               | Mainly forms -- login forms, comment forms, signup forms.
               | Bots use Tor pretty heavily because it's anonymous and
               | hard to block them without blocking the entire network.
               | Login form abuse is mildly irritating but not a huge deal
               | if you have other measures in place. Comment spam is
               | annoying but there are some options that deal with it
               | pretty well.
               | 
               | But the signup spam was a headache. I didn't want to just
               | blackhole Tor traffic, and tried to reduce the abuse with
               | other tools, including some custom stuff. The final straw
               | was a customer's small business site that had a MailChimp
               | or Constant Contact signup form. Those vendors want you
               | to embed their code by default to render the form, so you
               | have less control over the form itself. There were
               | workarounds, but they all sucked.
               | 
               | Tor bots would sign up email addresses through this
               | newsletter form, and then I'd have to go through and
               | manually scrub them before newsletters went out, or the
               | service would penalize my client for too many
               | bounces/unsubscribes/complaints. Very nearly 100% of the
               | abuse on that particular form came from Tor IPs.
               | 
               | I do not want to spend my limited time on this Earth
               | manually sorting out bots from humans because of one
               | particular network. Blackholing Tor made that problem
               | disappear immediately.
               | 
               | VPNs are dime-a-dozen now, cheap VPSs are available from
               | lots of vendors, there's Wireguard, there's ssh, a clever
               | person could even set up Apache or nginx as a forward
               | proxy with ssl from LetsEncrypt. Tor is well over 90%
               | abusive traffic (https://blog.cloudflare.com/the-trouble-
               | with-tor/). This is a Tor problem, not a me problem.
               | There are better alternatives available.
        
               | judge2020 wrote:
               | https://blog.cloudflare.com/the-trouble-with-tor/
               | 
               | > Based on data across the CloudFlare network, 94% of
               | requests that we see across the Tor network are per se
               | malicious. That doesn't mean they are visiting
               | controversial content, but instead that they are
               | automated requests designed to harm our customers. A
               | large percentage of the comment spam, vulnerability
               | scanning, ad click fraud, content scraping, and login
               | scanning comes via the Tor network. To give you some
               | sense, based on data from Project Honey Pot, 18% of
               | global email spam, or approximately 6.5 trillion unwanted
               | messages per year, begin with an automated bot harvesting
               | email addresses via the Tor network.
        
               | Zak wrote:
               | There are probably more sophisticated options that would
               | solve your problems than simply blocking it.
        
               | plumeria wrote:
               | Is using CAPTCHAs one of those?
        
               | judge2020 wrote:
               | Such as?
        
               | cowtools wrote:
               | The answer depends on the type of service you host. I
               | don't know what you need to do, but I do know that
               | filtering IP space is merely security-by-obscurity, it is
               | a cheap and broken solution to the hard problems of sybil
               | resistance. If you need IP filtering to operate on a day-
               | to-day basis, then the security of your service is
               | fundamentally broken.
               | 
               | Tor users do not have any special properties over clear-
               | net users besides low accountability for their IP space.
               | There are other ways to acquire this type of setup that
               | don't involve broadcasting a public list of known exit
               | nodes as an act of good faith. Any sophisticated attacker
               | will be able to easily get ahold of the IP space and
               | bandwidth they need to do their work, whether it's
               | through a botnet or simply because they operate out of
               | some less-accountable country like China or Russia.
               | 
               | IP filtering: now you have two problems!
        
               | thaumaturgy wrote:
               | This is why I'm strongly against spam filtering for
               | email. Spam filters are fundamentally security-through-
               | obscurity. I mean, they don't protect your email from
               | targeted bombing attacks or phishing. If you need spam
               | filters to operate your email on a day-to-day basis, then
               | the security of your email is fundamentally broken.
               | 
               | /s, obviously, I hope.
               | 
               | Blocking Tor isn't a security measure, it's a nuisance
               | reduction measure.
        
               | plumeria wrote:
               | How often is the list of exit nodes updated?
        
               | thaumaturgy wrote:
               | Daily, I believe. I don't have the file git-controlled.
               | That would be a good idea, though.
        
             | andrewnyr wrote:
             | there are many solid competitors: Amazon, Fastly, Akamai,
             | Imperva to name a few
        
               | wahnfrieden wrote:
               | Bunny
        
           | thaumaturgy wrote:
           | Just 10 minutes ago, I got the following email from a
           | housemate (I'm not home at the moment):
           | 
           | > _The past few weeks I've been getting tons of redirects to
           | verify my humanity before being allowed to view a webpage.
           | Usually I just have to click the box that says human, not
           | find all the ladders in a photo. SoFi is doing it every
           | single time I log in. Petco, too, along with others who are
           | more sporadic. This is happening with and without uBlock on.
           | Same browser I've always used. ..._
           | 
           | SoFi and Petco both use Cloudflare. I do exactly zero web
           | crawling / scraping / abusive anything from my home
           | connection.
           | 
           | I'm noticing a recent increase in volume of complaints about
           | Cloudflare's human verification filter. I'm starting to
           | wonder if they touched a dial.
           | 
           | I had already started pulling some infra back from Cloudflare
           | after their last appearance in the tech news cycle. Now I've
           | got an additional reason to continue doing that.
        
             | patrec wrote:
             | > I had already started pulling some infra back from
             | Cloudflare after their last appearance in the tech news
             | cycle.
             | 
             | What triggered your reaction? That they terminated a
             | customer with zero notice?
        
           | tarakat wrote:
           | You're looking at it all wrong. From Cloudflare's point of
           | view, this kind of blocking is a _feature_. Anyone doing
           | legitimate web crawling, or offering alternative web services
           | such as Starlink, now needs Cloudflare's permission.
           | 
           | Essentially, for a broad class of web-based businesses, they
           | have made themselves gatekeepers. I'm sure they'll find a
           | profitable use for this position. Charging outright would
           | look bad, but investing in businesses that just happen to not
           | run into Cloudflare-based trouble, but whose competitors
           | do...
        
             | tomxor wrote:
             | I'm familiar with that perspective, and biased towards
             | it... Cloudflare is certainly in such a position, but they
             | are a relatively young company (for their size and reach)
             | and I've seen good things come from them.
             | 
             | I'd guess the intent is unlikely to be anti-competitive or
             | monopolistic, just over-aggressive. However regardless of
             | intent their position does cause an absence of market
             | forces to put pressure on fixing such issues - Similar to
             | how it's become acceptable to have downtime when it's on
             | AWS, because "everyone is affected".
        
         | O__________O wrote:
         | They do have a threat score
         | 
         | https://developers.cloudflare.com/firewall/recipes/block-ip-...
         | 
         | I was surprised to learn Cloudflare was born out of Project
         | Honeypot, so I am guessing Cloudflare does share data with
         | them:
         | 
         | https://www.projecthoneypot.org/cloudflare_beta.html
        
           | [deleted]
        
           | elcomet wrote:
           | FYI you're responding to the cloudflare CTO
        
             | trasz wrote:
             | It's naive to assume Cloudflare CTO would not be lying if
             | beneficial to him or Cloudflare.
        
               | elcomet wrote:
               | I don't assume anything. The previous comment was just
               | trying to teach something about cloudflare to its CTO
        
               | nemothekid wrote:
               | I wonder if HN posters have ever held a job before. Can
               | you explain why it's beneficial for Cloudflare to block
               | legitimate users? Why is the simplest explanation
               | "Cloudflare just hates this one user in particular?"
        
               | lmm wrote:
               | Well, apparently they scared this user into installing
               | their browser extension, so it sounds like this incident
               | was a win for them.
        
               | Veen wrote:
               | It's even more naive to assume Cloudflare's CTO would
               | tell lies that can be trivially shown to be untrue.
        
               | trasz wrote:
               | How would you show they are untrue? Ask? :-D
        
               | pessimizer wrote:
               | Don't use an assumption of someone's superiority solely
               | based on their job title as a justification for the
               | silencing of the disagreement of others.
        
         | d2wa wrote:
         | > That's not what Bandwidth Alliance is at all. It's about
         | reducing or eliminating egress fees between a cloud provider
         | and Cloudflare. Not sure where the idea that it's about sharing
         | IP reputation data comes from.
         | 
         | It comes from the Cloudflare blog.
         | https://blog.cloudflare.com/cleaning-up-bad-bots/
         | 
         | There's a support page about it too.
         | https://developers.cloudflare.com/bots/get-started/free/
        
           | jgrahamc wrote:
           | I need to look into that. Thanks for pointing it out. I had
           | totally forgotten about that post.
           | 
           | Edit: team tells me this idea never got off the ground. Did
           | talk with some potential partners (which did NOT include
           | Google) but didn't happen. So if Google was throwing CAPTCHAs
           | it wasn't because of our IP reputation.
        
             | d2wa wrote:
             | Dear John. What am I -- as a normal human being/end-user --
             | supposed to do in this situation? People can't do anything
             | without any information about why they're blocked. Who do
             | you contact? Where do you go? What to do? The challenge
             | page doesn't help the end user understand why this is
             | happening to them. It's okay if you only see it for two
             | seconds. But the page stays on screen for over a minute.
             | When this happens for every website -- what do you do?
             | You'd be furious if this had happened to you. I'm just trying
             | to read my online comics and look up some stuff about some
             | interests and hobbies. It reduced my quality of life/sanity
             | for a week. The last two days, I started worrying that this
             | was going to be the new normal. I even looked into swapping
             | ISP to get a new IP address.
             | 
             | PS: I love all the innovation and engineering stuff you
             | guys regularly share on the Cloudflare blog. It's [almost]
             | always an interesting read. Even though I'm no fan of the
             | massive centralization your company has caused.
        
               | JohnFen wrote:
               | > People can't do anything without any information about
               | why they're blocked. Who do you contact? Where do you go?
               | What to do?
               | 
               | This is the most serious problem with all of the major
               | companies these days. Cloudflare, Google, Apple, etc.
               | When you get on their "bad side", you're just screwed.
               | You'll never even know what got them mad at you, and
               | there's nothing you can do to recover.
               | 
               | The only reasonable way to deal with this is to avoid
               | them all to the greatest extent possible. You have no
               | control over whether or not you deal with Cloudflare,
               | unfortunately, which makes them the worst of the lot.
        
               | adammartinetti wrote:
               | > It's okay if you only see it for two seconds. But the
               | page stays on screen for over a minute.
               | 
               | That doesn't sound right. You shouldn't see a loading
               | page for over a minute. If you're open to providing more
               | details privately I'd love to help troubleshoot. You can
               | drop me an email at amartinetti @ cloudflare.
        
               | jgrahamc wrote:
               | Once upon a time Matthew made us set the IP reputation of
               | every Cloudflare office to bad so that we experienced the
               | worst case scenario. Helped a lot.
               | 
               | I don't understand why you saw one minute block screens.
               | That's not right. Should be seconds.
               | 
               | I'm talking with the team about your other points.
        
               | tinus_hn wrote:
               | The main problem of course, and it isn't limited to
               | Cloudflare and I won't pretend to have the solution, is
               | that if you are caught in this kind of web, you have no
               | recourse but to go public and hope the spotlight lands on
               | you. For every problem we see in an upvoted post there's
               | tons that nobody sees.
        
               | northwest65 wrote:
               | What about answering his actual question?
        
               | easrng wrote:
               | I haven't been getting challenges that last that long,
               | but I have noticed that the redesigned "security check"
               | challenge pages with the spinner do seem much slower than
               | the old design with the loader that was made of 3 orange
               | dots.
        
             | d2wa wrote:
             | I edited and added a second link to a support page that
             | mentions it too.
        
               | jgrahamc wrote:
               | Thanks. I'm talking with the team.
               | 
               | Edit: see comment above.
        
         | cvwright wrote:
         | You block this guy from the internet for a week --- for no
         | apparent reason --- and then you come in here with a nitpick
         | about how another related system works?
         | 
         | Really?
        
           | judge2020 wrote:
           | The point is that Cloudflare does not beam IP reputation data
           | to Google. If Google and CF are blocking this IP separately,
           | what's the chance there's some malicious device or hacked IoT
           | device on the network, participating in DDOS attacks or
           | unauthorized vulnerability scanning of random websites?
        
             | zinekeller wrote:
             | Yeah, if for example Spamhaus (which both Cloudflare and
             | Google consult) has detected that a subnet is bad then that
             | could be the cause.
             | 
             | Still, it doesn't excuse Cloudflare that there's no redress
             | if you are caught in a block, or even a clue about what you
             | can do to reduce it (especially given that Spamhaus does have
             | redress procedures).
        
             | cvwright wrote:
             | Fair point
        
             | pessimizer wrote:
             | According to another comment, it's a wrong point:
             | https://blog.cloudflare.com/cleaning-up-bad-bots/
             | 
             | > Once enabled, when we detect a bad bot, we will do three
             | things: (1) we're going to disincentivize the bot maker
             | economically by tarpitting them, including requiring them
             | to solve a computationally intensive challenge that will
             | require more of their bot's CPU; (2) for Bandwidth Alliance
             | partners, we're going to hand the IP of the bot to the
             | partner and get the bot kicked offline; and (3) we're going
             | to plant trees to make up for the bot's carbon cost.
        
               | judge2020 wrote:
               | I'm pretty sure this was for a situation like
               | Digitalocean themselves hosting a bot, but such IP
               | sharing very well might be currently (ab)used by
               | partners, if it's happening here.
        
               | jgrahamc wrote:
               | Yeah. I'm looking into that.
        
           | stefan_ wrote:
           | A wrong nitpick, even! Way to look like the asshole.
        
           | noasaservice wrote:
           | Given their business model is "Protect DDoS'ers (booters) so
           | they can DDoS sites so Cloudflare can sell DDoS-prevention
           | services", I wouldn't trust them one whit in doing the right
           | thing.
           | 
           | And frankly, if you want to dig deeper, just look at who they
           | have no problem keeping as free clientele.
        
             | tshtf wrote:
             | Not sure why this is getting downvoted, it's completely
             | factual.
        
               | cma wrote:
               | It's like finding the worst videos on YouTube and saying
               | that's their business model.
        
               | acdha wrote:
               | It makes a very broad claim which makes it sound like an
               | extortion racket but doesn't have anything to back it up.
               | I would bet that if it included some evidence it would
               | fare much better. For example, they have a ton of large
               | organizations which are customers. The very first
               | question the average reader is going to have is whether
               | it's really the case that these sites are predominantly
               | attacked by booter services which use Cloudflare for
               | hosting? That seems unlikely and as a general rule here the
               | broader the claim the more people are going to expect you
               | to show that you did your homework first.
        
               | [deleted]
        
               | gusgus01 wrote:
               | The claim was discussed in this post:
               | https://news.ycombinator.com/item?id=32709329
               | 
               | Basically DDOS booters use Cloudflare to protect their
               | websites from competitors, since Cloudflare is one of the
               | best. The same people Cloudflare is protecting (and
               | claims to do so on an ethical neutrality basis) are
               | furthering the need for Cloudflare to exist.
        
               | acdha wrote:
               | Note that I'm not saying whether or not this is true,
               | only that a comment which links to something like that
               | will generally fare better than one which begs the
               | question.
        
       | kevingadd wrote:
       | I'm used to getting assaulted by Cloudflare's browser check
       | interstitials along with random Cloudflare and Google CAPTCHAs
       | because (presumably) I run Firefox and an ad-blocker instead of
       | vanilla Google Chrome. It's already tremendously inconvenient to
       | wait multiple seconds on many page loads and click 20 bicycles, I
       | can only imagine how infuriating it would be if every page load
       | started taking 60 seconds because your IP ended up on some random
       | algorithmic blacklist....
        
         | 20after4 wrote:
         | I use Firefox and an ad blocker and I don't see these CAPTCHAs
         | (except for a few rare instances that I can recall). Something
         | else must be going on to get you flagged.
        
       | leonfs wrote:
       | If you haven't done anything, someone else might have. Check your
       | router logs for strange devices and activity in your network,
       | also check your machine/s for malware.
        
         | d2wa wrote:
         | (Author here.) Plenty of logging of outgoing connections and
         | DNS. Nothing out of the ordinary.
        
           | Jamie9912 wrote:
           | Is your IP address listed on https://www.abuseipdb.com/ or
           | any other spam blocklists?
        
       | NelsonMinar wrote:
       | Cloudflare is a regular problem for Starlink users. We're on
       | CGNAT so users share IPv4 addresses. I see CAPTCHAs when using
       | Starlink ten times as often as on my other ISP. I don't think it
       | actually breaks things the way this article describes, it seems
       | like a gentler behavior, but it's annoying.
       | 
       | A few months ago I got on Akamai's naughty list (with my other
       | ISP) for some very light automated website downloading. That was
       | a straight block with HTTP errors and I had to use a proxy to
       | access the Web. It cleared up after a few days.
       | 
       | The lack of any user feedback or support for this situation is
       | really annoying. Reminds you how much power the CDNs have. It'd
       | be really bad if loading websites got as difficult as sending
       | email through all the layers of spam filtering.
        
         | Syonyk wrote:
         | > _Cloudflare is a regular problem for Starlink users. We're
         | on CGNAT so users share IPv4 addresses. I see CAPTCHAs when
         | using Starlink ten times as often as on my other ISP. I don't
         | think it actually breaks things the way this article describes,
         | it seems like a gentler behavior, but it's annoying._
         | 
         | I've been noticing this too, and it's why Starlink remains my
         | secondary ISP/bulk transfer connection. If I had to drop one
         | connection, I'd drop Starlink for this reason alone.
         | 
         | There are some sites that I simply can't browse, and it's not
         | Cloudflare errors, either. Lowes, in particular, simply returns
         | error pages for anything but the main landing page on a regular
         | enough basis. Of course, my observed public IP changes so it's
         | not consistent, but it's genuinely annoying.
        
           | somedude895 wrote:
           | > If I had to drop one connection, I'd drop Starlink for this
           | reason alone.
           | 
           | Why are you using Starlink at all if you have other options?
        
             | Syonyk wrote:
             | Because my other connection is a 25/3 WISP link that mostly
             | doesn't deliver that. I generally see about 5/1 in the
             | evenings, if that.
             | 
             | I've had several area WISP connections, as there's no wired
             | infrastructure to my area, and they vary in quality. I work
             | full time remote, so I need two connections as a general
             | habit - I can work with one, but when that one is down for
             | a week straight, I have problems. I like being able to fail
             | over.
             | 
             | I typically keep one connection for "interactive" traffic,
             | and one for "bulk transfer/failover" - things like my local
             | Ubuntu repo mirror, offsite backup traffic, etc. And I can
             | fail to it if needed, which I do often enough.
             | 
             | On a good day, Starlink is far better than my WISP
             | connection, and I have some machines routed out it
             | persistently. On a bad day, I can't hit much from it,
             | because that particular public IP has been blocked from
             | large parts of the internet. It's very hit and miss, and
             | overall bandwidth has definitely dropped from the early
             | days, though reliability of getting packets where they need
             | to go is drastically improved.
        
           | cma wrote:
           | > I've been noticing this too, and it's why Starlink remains
           | my secondary ISP/bulk transfer connection. If I had to drop
           | one connection, I'd drop Starlink for this reason alone.
           | 
           | Could Cloudflare legally charge them a bribe to captcha their
           | users less? It isn't good to have a company in this position
           | of power if so.
        
         | diebeforei485 wrote:
         | Cloudflare said they're working on this-
         | https://blog.cloudflare.com/eliminating-captchas-on-iphones-...
        
         | ThatPlayer wrote:
         | I feel like Starlink could at least partially mitigate this by
         | supporting IPv6. T-Mobile US supports IPv6, and I hardly notice
         | this as an issue on my phone, nor did I the time my work ran
         | the business over a 4G mobile connection while waiting for the
         | ISP install.
        
         | causi wrote:
         | What archival tool were you using? I've been looking for a
         | replacement for HTTRACK forever.
        
           | NelsonMinar wrote:
           | A combination of shotscraper and metascraper; really more web
           | previews than archives. And in a single thread, to different
           | hostnames, maybe one every 10 seconds? Honestly surprised
           | Akamai or anything even noticed. I fake my user agent now,
           | lesson learned.
        
         | justoreply wrote:
         | But any automated tool won't work. I have a similar problem
         | with my self-hosted feed reader: my VPS hosting IP doesn't have
         | 100% reputation with Cloudflare and I can't download some feeds.
         | 
         | Edit: spelling
        
       | btdmaster wrote:
       | "The data subject shall have the right not to be subject to a
       | decision based solely on automated processing, including
       | profiling, which produces legal effects concerning him or her or
       | similarly significantly affects him or her."
       | 
       | However, under Article 22(2) of the GDPR, this does not apply if
       | the decision:
       | 
       | "is necessary for entering into, or performance of, a contract
       | between the data subject and a data controller;"
       | 
       | Cloudflare would therefore perhaps claim that this is
       | "necessary".
        
       | grishka wrote:
       | Here's a handy list of correct uses for IP addresses:
       | 
       | 1. Packet routing
       | 
       | In other words, I wish services like Cloudflare were made
       | illegal.
        
       | scarface74 wrote:
       | Notice that he suspects that some of the problems with podcast
       | rss feeds and assets that can't be captcha confirmed may be
       | caused by websites who are on the free tier and that don't have
       | the ability to specify that some subdomains shouldn't be blocked
       | by captchas.
       | 
       | I have absolutely no sympathy for website owners who are
       | depending on a free service.
        
       | ritcgab wrote:
       | What is Cloudflare? The answer is simple - the biggest MITM on
       | your Internet traffic.
        
       | joshfraser wrote:
       | If this happened to me, the first thing I would do is switch to
       | using a VPN. In my experience, Google is far more likely to throw
       | up CAPTCHA challenges to VPN users. I wonder if this is what
       | happened to the OP.
        
       | superkuh wrote:
       | Daniel Aleksandersen of ctrl.blog has absolutely no leg to stand
       | on here. He is a proponent of this kind of algorithmic blocking
       | for weird browsers and even implemented it on his own site and
       | argued _for_ it. https://www.ctrl.blog/entry/detect-non-browser-
       | form-submissi...
       | 
       | It's only after it happened to _him_ that now he's suddenly
       | against it. Until he removes the same type of blocks from his own
       | website I have absolutely no sympathy for him.
        
         | bergwerf wrote:
         | From the link you mentioned:
         | 
         | > Bots often mimic the User-Agent of a common browser, but the
         | version numbers used in the bots rarely change. Over time they
         | drift farther and farther behind until a point (maybe two-year-
         | old versions) where you can safely block them without
         | inconveniencing legitimate users.
         | 
         | This supports the idea that browsers are subject to constant
         | change and everyone should be forced to come along (rather than
         | respecting and supporting standards). I have a Chromebook that
         | stopped receiving updates some years ago (thank you for your
         | very safe and sustainable product Google!), his heuristic would
         | literally block me.
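         | 
         | To make the objection concrete, here's a minimal Python sketch
         | of that kind of version-age heuristic. CURRENT_CHROME_MAJOR and
         | MAX_MAJORS_BEHIND are made-up numbers for illustration, not
         | what the article or any real site uses. The point is only that
         | an end-of-life Chromebook sends the same kind of stale version
         | string as a bot that never updates its User-Agent.

```python
import re
from typing import Optional

CURRENT_CHROME_MAJOR = 105  # assumed "latest" major version, illustrative only
MAX_MAJORS_BEHIND = 20      # assumed cutoff, illustrative only

_CHROME_RE = re.compile(r"Chrome/(\d+)\.")


def chrome_major(user_agent: str) -> Optional[int]:
    """Extract the Chrome major version from a User-Agent string, if any."""
    match = _CHROME_RE.search(user_agent)
    return int(match.group(1)) if match else None


def would_block(user_agent: str) -> bool:
    """Block only UAs that claim to be Chrome and are far behind the cutoff."""
    major = chrome_major(user_agent)
    if major is None:
        return False  # unknown or uncommon browsers are left alone
    return CURRENT_CHROME_MAJOR - major > MAX_MAJORS_BEHIND
```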
        
           | ReptileMan wrote:
           | Doesn't your Chrome app update? Never used a Chromebook. Just
           | asking.
        
         | phreack wrote:
         | Even if that were the case (which we can debate), him being
         | wrong before does not prevent him from being right now. Being
         | de facto banned from the common internet due to centralization
         | is absolutely scary.
        
           | ranger_danger wrote:
        
           | superkuh wrote:
           | I completely agree. I am against Cloudflare and the
           | centralization it implies 100%. I never use it for sites I
           | develop.
           | 
           | I just have no sympathy for Daniel since up until just now he
           | was trying to get everyone to do this.
        
             | scarface74 wrote:
             | Cloudflare allows website hosts to have much finer-grained
             | control that would have solved many of these problems - _if
             | they pay for it_. I see no problem with this.
        
               | dmix wrote:
               | The hosts aren't blocking him though, it's Cloudflare.
               | 
               | > Just about every website I visited from my home
               | internet connection would result in a challenge page.
        
               | scarface74 wrote:
               | Cloudflare is blocking him because the hosts didn't
               | configure Cloudflare to not use captchas for subdomains
               | that host non-browser traffic like podcast RSS feeds.
               | That was his theory.
               | 
               | That capability is only available for paid Cloudflare
               | plans.
        
           | bashinator wrote:
           | It's almost as though sufficiently large communications
           | providers should be regulated as utilities.
        
         | daenney wrote:
         | Burn the witch!
         | 
         | Let's read through that page for a second though:
         | 
         |     Drop support for obsolete HTTP versions
         | 
         | Doesn't seem like that's going to cause much issue for any
         | legitimate client from the past 10-20 years. He only recommends
         | blocking HTTP 0.9/1.0, which fair enough
         | 
         |     Append a #hash to the form's action URL
         | 
         | Hah. Clever man. I don't see how this is going to stop any
         | legitimate user from loading your website or submitting the
         | form, but I can see how it might frustrate bots.
         | 
         |     Include a hidden prefilled form field
         | 
         | This is just standard practice to mitigate CSRF.
         | 
         |     Verify the Host and Origin request headers
         | 
         | Yes. You should be doing that.
         | 
         |     Set a test cookie and verify it gets included in the
         |     submission
         | 
         | Another CSRF trick.
         | 
         |     Swap the name attributes in the name and email fields
         | 
         | This one's a little user hostile to folks who use assistive
         | devices like screen readers. But still won't prevent you from
         | accessing the site in the first place.
         | 
         |     Verify the POST/Redirect/GET (PRG) chain
         | 
         | As noted by the author, might cause some issues but again,
         | won't stop anyone from loading your website.
         | 
         |     Block ancient versions of common browsers
         | 
         | Alright please just don't do this. UA blocking is gross and
         | might prevent access through specialist software. But he also
         | calls this out himself.
         | 
         |     I strongly discourage you from blocking or discriminating
         |     against unknown or uncommon browser User-Agent request
         |     headers
         | 
         | All in all, with the exception of UA blocking I don't see how
         | any of these mitigations would result in users not being able
         | to access said website, or having their loading times
         | drastically increased.
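         | 
         | As a rough illustration of the three cheapest checks in that
         | list (Host/Origin, hidden prefilled field, test cookie), here's
         | a minimal framework-agnostic Python sketch. Every name in it
         | (EXPECTED_HOST, the form_token field, the seen_form cookie) is
         | a placeholder for illustration, not something taken from the
         | article. A missing Origin header is tolerated, on the
         | assumption that some older browsers don't send it.

```python
from typing import Mapping
from urllib.parse import urlsplit

EXPECTED_HOST = "example.com"       # placeholder site hostname
EXPECTED_TOKEN = "prefilled-value"  # placeholder hidden-field value


def looks_like_browser_submission(
    headers: Mapping[str, str],
    form: Mapping[str, str],
    cookies: Mapping[str, str],
) -> bool:
    """Apply three cheap checks to a form POST; header keys are lowercase."""
    # 1. Host must be ours.
    if headers.get("host", "").lower() != EXPECTED_HOST:
        return False
    # Origin, when the client sends one, must point at the same host.
    origin = headers.get("origin")
    if origin is not None and urlsplit(origin).hostname != EXPECTED_HOST:
        return False
    # 2. The hidden prefilled field must come back unchanged.
    if form.get("form_token") != EXPECTED_TOKEN:
        return False
    # 3. The test cookie set when the form was served must be returned.
    return cookies.get("seen_form") == "1"
```

         | None of these checks looks at the IP address or the User-Agent,
         | which is why a failure only affects that one form submission.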
        
           | d2wa wrote:
           | >> Verify the Host and Origin request headers
           | 
           | > Yes. You should be doing that.
           | 
           | (Author here.) If I remember correctly, his browser of choice
           | predates the Origin header.
        
             | daenney wrote:
             | Alright well fair enough. Looks like that's only been
             | supported since Fx 70 released somewhere in 2019. So maybe
             | don't do that depending on what you intend to block. But
             | then again it's been 3 years also.
             | 
             | In general though the whole tone of parent of "I am owed
             | access to someone else's computer system on my terms and my
             | terms alone" just doesn't jibe with me. It's also not
             | remotely comparable to Cloudflare's approach of sitting in
             | the middle and then appropriating end-user compute resources
             | without their consent to fuel their business.
        
           | ceejayoz wrote:
           | > This one's a little user hostile to folks who use assistive
           | devices like screen readers.
           | 
           | As long as you're using a <label> or aria-label attribute,
           | that shouldn't be an issue.
        
             | d2wa wrote:
             | (Author here.) I am. There's plenty of accessibility labels
             | in place. It's literally just the name attributes. No user
             | ever sees this, whether they're using assistive
             | technologies or not. It only confuses bots that assume
             | that the field named email is for the email address.
        
           | nijave wrote:
           | All that stuff is easily defeated by automated browsers
           | anyway (e.g. Selenium)
        
             | mh- wrote:
             | Yes, but those automated browsers are much more expensive
             | to operate than simple HTTP clients _pretending_ to be
             | browsers.
             | 
             | It's an arms race/defense-in-depth situation. If someone
             | truly wants to automate your site in a _targeted_ fashion,
             | and it's profitable for them to do so, you'll have to
             | invest a lot more in stopping it (and decide how much of it
             | is _worth_ stopping).
        
               | Aperocky wrote:
               | Even YouTube fails against yt-dlp, which goes as far as
               | shipping an internal Python file that parses JavaScript
               | and executes it.
        
           | LinuxBender wrote:
           | He's quite tame compared to me I suppose. I block anything
           | that is not HTTP/2.0 which _currently_ knocks out all the
           | bots and all crawlers except Bing. But I just have hobby
           | sites these days. Nobody would notice or care if my sites
           | went offline.
           | 
           | Using NGinx as an example:
           | 
           |     if ($server_protocol != HTTP/2.0) { return 403 'Nope'; }
           | 
           | Another thing I have found useful to drop some bots is to
           | become invisible to them. Many of the poorly written scanning
           | tools do not properly set MSS for reasons I still don't
           | understand. I use this to my advantage.
           | 
           | Using IPTables as an example:
           | 
           |     /sbin/iptables -t raw -I PREROUTING -i eth0 -p tcp -m tcp \
           |       --tcp-flags FIN,SYN,RST,ACK SYN \
           |       -m tcpmss ! --mss 420:16384 -j DROP
           | 
           | Any TCP packets setting a very low or high MSS or missing MSS
           | will be silently dropped. I drop about 35K packets per host
           | per day on average. This also drops hping3 floods.
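           | 
           | For anyone who wants the rule's logic spelled out, here is a
           | minimal Python sketch (not the kernel's implementation): it
           | walks the raw TCP option bytes of a SYN, pulls out the MSS
           | (option kind 2), and applies the same 420:16384 window,
           | treating a missing MSS as a drop. The function names are made
           | up for illustration.

```python
import struct
from typing import Optional

MSS_MIN, MSS_MAX = 420, 16384  # same window as the iptables example


def parse_mss(options: bytes) -> Optional[int]:
    """Return the MSS value from raw TCP option bytes, or None if absent."""
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:  # End of Option List
            break
        if kind == 1:  # NOP is a single byte with no length field
            i += 1
            continue
        if i + 1 >= len(options):
            break  # truncated option
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            break  # malformed option
        if kind == 2 and length == 4:  # Maximum Segment Size
            return struct.unpack("!H", options[i + 2:i + 4])[0]
        i += length
    return None


def should_drop_syn(options: bytes) -> bool:
    """Drop when the MSS is missing or falls outside the accepted window."""
    mss = parse_mss(options)
    return mss is None or not (MSS_MIN <= mss <= MSS_MAX)
```

           | A SYN carrying MSS 1460 passes; a SYN with no options at all
           | is dropped, as is one advertising an MSS of 100 or 65535.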
        
             | toast0 wrote:
             | > Many of the poorly written scanning tools do not properly
             | set MSS for reasons I still don't understand.
             | 
             | MSS issues attract me like a moth to flame [1], so let me
             | ask some questions.
             | 
             | It looks like this is dropping syns with MSS over 16384???
             | That is indeed a pretty crazy high number. 9000ish seems
             | reasonable for someone on a jumbo network without a mss
             | clamping router, but above that is someone weird for sure.
             | 
             | Under 420 seems unlikely too, but technically acceptable,
             | but sure, I'd drop it. In theory, a proper OS will send
             | several SYNs with MSS, then assume your server doesn't
             | support TCP options and send you a SYN with no options.
             | Going to take a while, but if someone legitimately has a
             | mss less than 536, their internet is probably pretty junky
             | anyway, so ok, seems fine.
             | 
             | [1] I just built a browser based pmtud test site,
             | http://pmtud.enslaves.us/
        
               | LinuxBender wrote:
               | You are right. I just happen to use a very safe range. If
               | I didn't care about anyone using jumbo frames I could set
               | the range to 1220:1536 and nearly all legit traffic would
               | pass just fine. 1220 (to 13xx) for the people using VPNs
               | and ip6-ip4 gateways. I just try to give really
               | conservative examples so that it is less likely I break
               | someone's unusual setup. Anything just over 9k is fine for
               | most jumbo-frame setups.
               | 
               | All of this said, I could set the range to 1:65535 and it
               | would still drop most bots as they don't even bother to
               | set MSS at all in their scans. I'm not sure which tool
               | they are using.
        
             | judge2020 wrote:
             | IMO blocking bots isn't too big of a concern, the problem
             | is when a dedicated attacker realizes you serve valuable
             | data (in your HTML). Next thing you know, they're running
             | puppeteer or a similar remote controlled browser to scrape
             | your site, which is both undesirable in itself and the
             | scraper might overload your site/database by scraping with
             | no internal parallel request limit. If you're not a startup
             | with an unlimited early cloud budget, it can be costly if
             | you want to handle both bot usage (including official API-
             | based or scraping bots) and regular users.
        
         | ceejayoz wrote:
         | The techniques described in that article are pretty reasonable
         | and shouldn't significantly impact users - swapping name/email
         | fields' names won't do a thing to you. There's also a
         | difference between "this one website doesn't work for me" and
         | "I've been blocked from half the Internet".
        
         | IshKebab wrote:
         | To be fair there's a difference between doing it for one site,
         | and doing it for a significant portion of the internet.
        
         | phantom_of_cato wrote:
         | Throwing an ad hominem is not cool.
        
         | [deleted]
        
         | jamespo wrote:
         | None of those techniques affect normal browsing
        
         | ufmace wrote:
         | I just read it, and I don't see any contradiction here. IMO,
         | he's recommending simple and direct anti-bot methods to web
         | admins specifically because it's better than relying solely on
         | Cloudflare etc for all bot blocking. He never recommends making
         | un-appealable access control decisions based on third-party
         | lists, and specifically recommends caution on methods that
         | might potentially impact innocent users. Seems perfectly
         | consistent to me.
        
         | ldoughty wrote:
         | I don't know the author or his reputation, but his suggestions
         | that you linked are (in my opinion) standard actions for any
         | dev/server admin getting spammed by their forms... And the
         | suggestions really only impact malicious actors accessing your
         | website from a script... Virtually none of those would be an
         | issue for any browser made in the last 15-20 years, or headless
         | browsers, but would break rudimentary scripts like entry level
         | hackers/spammers might use.
         | 
         | He also specifically called out CAPTCHA as user-hostile.
        
           | superkuh wrote:
           | I guess like ctrl.blog you can't grasp the significance of
           | the issue until it happens to you. My Firefox fork is
           | definitely blocked by his algorithmic "bot" detector. Just
           | because your browser isn't doesn't mean it only blocks bots.
           | 
           | False positives happen. They happen a lot more than you
           | think. And they are a serious problem. Even more serious when
           | it's Cloudflare, but arguing for everyone to implement these
           | algorithmic blocks "that won't inconvenience users"
           | individually, taken to its logical end, does the same.
        
             | [deleted]
        
             | ldoughty wrote:
             | I don't see the reason for the personal attack.
             | 
             | The blog post also calls out that you should not block
             | based on user agent.
             | 
             | If a form post didn't respect the action property having a
             | #, didn't cope with the name/email HTML names being reversed
             | (while the type is correct, and the user-displayed values
             | are correct), or didn't include hidden HTML form fields that
             | have been standard since ~97, back when I made my first few
             | websites? I certainly would agree that it is likely a bot.
             | 
             | Again, apparently this person has some hateful following,
             | but I don't appreciate you lumping me into this hatred for
             | agreeing with his statements on this one particular issue.
        
               | superkuh wrote:
               | You said, "And the suggestions really only impact
               | malicious actors accessing your website from a script."
               | and that was false. Since you didn't have experience
               | being blocked you couldn't know. Not till it happens to
               | you. I don't think pointing this out is a personal
               | attack. It's just the way people work. People don't
               | believe things are a problem until they become a problem
               | for them.
               | 
               | You and others can keep quoting the legit and clever ways
               | to mitigate bot spam but if you ignore the false
               | positives the other checks create it kind of defeats the
               | point.
        
         | scarface74 wrote:
         | > I strongly discourage you from blocking or discriminating
         | against unknown or uncommon browser User-Agent request headers.
         | The web is weird and we as developers shouldn't discourage it.
        
         | [deleted]
        
       | Melatonic wrote:
       | As much as I like Cloudflare now this is why long term monopolies
       | (not saying they are now) are bad
        
       | bastardoperator wrote:
       | Has he tried unplugging the router for 15 minutes and plugging it
       | back in? I jest but I know Comcast and Spectrum will both issue a
       | new IP address in that timeframe.
        
         | d2wa wrote:
         | (Author.) My ISP only rotates IPs when they reboot their
         | central equipment. Not enough to do it on my end.
        
           | bornfreddy wrote:
           | With some ISPs, they will issue a new IP if you change the
           | router's (WAN) MAC address. Might be worth a try next time
           | (crossing fingers you don't need it).
        
             | bastardoperator wrote:
             | This is what I've always seen too. I've never seen a
             | residential ISP that allocates static DHCP addresses, they
             | typically allocate in days which is why many people can
             | maintain a leased address for months on end. Once you go
             | offline though, all bets are off. Every ISP can determine
             | if the subscriber is disconnected and if they are, they're
             | going to reallocate your address. To your point, once the
             | MAC address is changed, they have to issue a new IP address
             | because using the logic posted above, the other address is
             | allocated to a different MAC.
        
         | dmix wrote:
         | IP bans by modern services like CF can't be solved that easily
         | in my experience.
        
           | bastardoperator wrote:
           | Clearly CF has a crystal ball /s.
           | 
           | Once the IP address I don't own is released and assigned to
           | some other router how do you think CF determines the new IP
           | address for the individual/home? Unless this person is
           | running the CF Dynamic DNS service which gives CF the IP
           | address, I'm not sure CF would have any reasonable validation
           | techniques to determine who is what given the size of
           | residential networks.
        
             | aendruk wrote:
             | Cookie on their validation page? Browser fingerprint
             | hopping IPs in the same block?
        
               | dmix wrote:
               | Bingo
        
               | bastardoperator wrote:
               | So I've turned cookies off and switched to my iPad to
               | browse the internet for the evening; they have no
               | fingerprint and no cookie... now what?
        
               | dmix wrote:
               | Are you on a different IP block? ISPs sometimes just
               | switch the last number.
               | 
               | I had to use a VPN (a whole new IP) and a clean Chrome
               | install to bypass one of those "IP blocks", which was
               | combined with fingerprinting.
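
[Editor's sketch] The "same IP block" point above can be illustrated
with a small check: if an ISP only changes the last number of an
address, the old and new addresses still share a prefix. This is a
minimal sketch, not a description of how Cloudflare groups addresses.
The /24 prefix length, the function name same_block, and the sample
addresses (taken from the reserved documentation ranges) are all
assumptions made for illustration.

```python
import ipaddress


def same_block(addr_a: str, addr_b: str, prefix_len: int = 24) -> bool:
    """Return True if both IPv4 addresses fall inside the same prefix.

    prefix_len=24 is an illustrative assumption; a real reputation
    system may group addresses differently, or not by prefix at all.
    """
    # strict=False accepts a host address and yields its enclosing network.
    block = ipaddress.ip_network(f"{addr_a}/{prefix_len}", strict=False)
    return ipaddress.ip_address(addr_b) in block


old_ip = "203.0.113.17"  # hypothetical address before the ISP rotated it

# Only the last number changed: still in the same /24.
print(same_block(old_ip, "203.0.113.200"))  # True

# A VPN exit in an unrelated range: a different /24.
print(same_block(old_ip, "198.51.100.42"))  # False
```

Under that assumption, a reputation keyed to the prefix would survive a
last-number rotation but not a move to an unrelated range, which is
consistent with the VPN workaround described above.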
        
       | bbu wrote:
       | I think Cloudflare updated their bot detection algorithms because
       | we had multiple customers complain that they were getting
       | challenged. I verified that they got a bot score of 1. As usual,
       | CF support is not that helpful...
        
       | synthetigram wrote:
       | Reputation systems should be based on /abuse/, not on automation.
       | I also ended up on the naughty list for running an archival
       | scraping program. Trying to preserve part of the Internet is
       | apparently against the rules. It's really a shame because my code
       | honors rate limits, doesn't spam, and is completely docile.
        
       | socialismisok wrote:
       | Is it plausible some ISP shared some IP address that was on
       | Cloudflare's list of suspicious IPs, or that some IoT device on
       | this person's network created a burst of suspicious traffic?
       | 
       | I get that this sucks for the end user, but I wonder how much we
       | should blame Cloudflare vs the wider systemic challenges of
       | managing DDoS protection on the web.
        
         | laxis96 wrote:
         | I believe that might happen, but then I also believe it's the
         | ISP's responsibility to ensure that its IP addresses are kept
         | clean.
        
           | socialismisok wrote:
           | For sure, the point I'm making is that there's a multi-party
           | transaction here, with systemic complexity. Makes it hard to
           | pin responsibility on just Cloudflare (or just the user or
           | just the ISP, etc).
        
             | yjftsjthsd-h wrote:
             | Cloudflare is the one blocking a user based on things that
             | aren't their fault; I'm happy to blame them.
        
               | socialismisok wrote:
               | That's fine, but you are ignoring the broader picture if
               | you do. You've correctly identified a detail, but haven't
               | placed that detail in context.
        
               | yjftsjthsd-h wrote:
               | I'm not ignoring the context, I'm saying that it's
               | irrelevant. Cloudflare made the choice to block real
               | people based on factors outside of their control, and
               | then to market that product as a panacea; they don't get
               | to pass the buck, doubly so when they don't expose enough
               | information to let other people fix the things they
               | broke.
        
       | kazinator wrote:
       | > _For whatever reason, I must have done something that angered
       | Cloudflare_
       | 
       | I'm guessing: having an IP address close to (or outright reused
       | from and thus identical to) someone malicious, whom you know
       | nothing about.
        
       ___________________________________________________________________
       (page generated 2022-09-20 23:01 UTC)