[HN Gopher] Moving from reCAPTCHA to hCaptcha ___________________________________________________________________ Moving from reCAPTCHA to hCaptcha Author : migueldemoura Score : 283 points Date : 2020-04-08 13:01 UTC (9 hours ago) (HTM) web link (blog.cloudflare.com) (TXT) w3m dump (blog.cloudflare.com) | yjftsjthsd-h wrote: | _Well._ That's probably fantastic news; using ReCAPTCHA (and | thereby making users subject to Google's tender mercies) was | honestly my main reason to dislike Cloudflare from a user's | perspective. ReCAPTCHA is utterly foul; it follows you | _everywhere_ it can, exists to undermine privacy, punishes non- | Chrome users, and throws you in an infinite loop when it decides | that you're not a human. | curiousgal wrote: | I don't blame reCAPTCHA for existing, I blame Cloudflare for | using it. It made using Tor literally impossible. Hopefully this | will be better. | lol768 wrote: | Didn't Privacy Pass help here? | noncoml wrote: | IMHO CAPTCHA is a lazy way to protect your service as you shift | the burden to your users. | | Maybe if you are big and essential for some users, you can afford | that. But if not, be aware that users will turn their back on you | if you add obstacles between them and your service. | | Edit: meant to say "be aware that _some_ users will turn their | back on you" | gurrone wrote: | Yes, it's a trade-off as usual. The main benefit I see is on | networks where you have a mix of good and bad traffic and you | would still like to offer the service to the few good users. I | see this a lot on networks hosting a lot of free VPN providers. | The other option we chose before was outright blocking. That | is even more harmful for the few good users. 
| onion2k wrote: | _But if not, be aware that users will turn their back on you if | you add obstacles between them and your service._ | | You have to balance that against how many users you'd lose if | the site were taken down, vandalized, or compromised by an attacker | because the captcha wasn't there to keep them out. | | It's often worthwhile moving the captcha away from the initial | login or signup form and only putting it on the second or third | attempt to log in, or on features that put significant load on | the server. | hombre_fatal wrote: | > It's often worthwhile moving the captcha away from the | initial login or signup form and only putting it on the | second or third attempt to log in | | Though if your service is a lucrative target for {uname,pass} | combolist spam, you'll see that each attempt comes from its | own IP address and only makes that one request. It's pretty | sobering. | eythian wrote: | I run a small forum, and it was getting flooded with fake and | spam accounts; the moderators were struggling to keep up and | the users were finding it annoying. So I put a captcha on the | registration page. The problem went to zero, new users still | showed up, and people were happier than before. | cdubzzz wrote: | > IMHO CAPTCHA is a lazy way to protect your service as you | shift the burden to your users. | | What is the non-lazy solution to having a basic website contact | form that _doesn't_ receive hundreds of spam submissions per | day? | folmar wrote: | Filter the submitted content, not the sender. What Akismet | does seems to work really well and not push back on the users | too much. | thewebcount wrote: | Honest question - if you set it up so the user gets an email | with a link they have to click before their message is | actually sent to your queue, would that help? | | I'm thinking it would probably reduce the number of users who | successfully contacted you legitimately, but CAPTCHAs also do | that. 
Do spammers actually have the email accounts they claim | to and respond to confirmation emails? | cdubzzz wrote: | It would definitely help, I highly doubt spammers would use | that sort of mechanism. | | The solution gets around potential vendor lock-in and | privacy issues with a service like Google's, but it still | fundamentally shifts the problem from the service to the | user (the original commentor's gripe). | noncoml wrote: | "receive hundreds of spam submission per day" | | But this is exactly the point I am trying to make. That's the | service provider's problem and not the user's. CAPTCHA shifts | the problem to the user. | | CAPTCHA is a 00's idea, when we had the multiple page | registrations(with errors showing only after you submit the | page), the insane password requirements, etc.. It doesn't | belong to modern stack in my opinion. | | "What is the non-lazy solution?" That's how disruption is | born. | cdubzzz wrote: | > "What is the non-lazy solution?" That's how disruption is | born. | | So there is no non-lazy solution. | | I get your point about shifting the problem, but that's | kind of the only option for the vast majority of website | operators (particularly small ones). | | I have zero love for CAPTCHA myself, I have put time and | effort in to other, server-side solutions but none perform | even remotely as well. | kennydude wrote: | hCAPTCHA looks interesting, although it seems they use Blockchain | for no real reason compared to just storing the payments as rows | (i.e what they gain from being chained on top of another) | colejohnson66 wrote: | The point of a blockchain is that to edit an earlier record, | you would need to edit every record that comes after (due to | storing a hash of the previous block in the current block). | _However,_ it doesn't make sense when one entity controls the | entire system because if a hacker (or even an insider) can | change _one_ record, they could change _all_ of them. 
This is why | a good blockchain is _distributed_. Then, if one node | edits the history, the other nodes will see the anomaly and | ignore that node. | | This is also why Git's history is easy to edit when it's only | on your machine. But once you push to GitHub and others clone | your repo, it becomes a lot harder to edit history. Yes, Git | isn't a blockchain, but it does use the idea of hashing the | previous "block" (commit) and storing that hash in the current | "block." | speedgoose wrote: | Yes, if you do not want to distribute your data to random | people over the internet, you need a Merkle tree. Not a | stupid blockchain with all the downsides a blockchain has. | [deleted] | wongarsu wrote: | If you strip out the proof-of-work algorithm you're | basically left with a chain of Merkle trees, and the | payloads hashed by the Merkle trees. Calling it a | blockchain is just a way to make it sound more familiar to | potential investors. | wongarsu wrote: | However, you can do a local blockchain (or hash chain, or | whatever you want to call it) and distribute just the hashes. | If you have a local git repo and regularly tell me your | commit IDs, I can testify that the code existed at that point | in time, and can later verify it wasn't changed if you choose | to expose the full commit to me. And because it's a chain, | you only need to communicate one commit ID for every external | timestamp you care about, not for every commit you care | about. | kennydude wrote: | Yup, my point is that they control the entire thing. | Although it could be like the joke where "AI" ends up being just | a bunch of "if statements". | devy wrote: | The enterprise-grade hCaptcha[1] is not free either. Does anyone | have pricing information? | | [1]: https://www.hcaptcha.com/#plans | wongarsu wrote: | According to the article Cloudfront is paying, but is paying "a | fraction of what reCAPTCHA would have [cost]". 
Recaptcha is | $1/1000 challenges, so apparently hcaptcha is some small | fraction of that. | | Cloudfront might get a discount for running some of the | infrastructure on their own servers, on the other hand that | might also be an integration hassle that actually costs them | money. | meowface wrote: | > Recaptcha is $1/1000 challenges | | This seems unwise, because many captcha farms charge less | than this. A quick Google search shows one service offering | $0.50/1000 challenges. If it's 2x cheaper for an attacker to | solve a captcha than it is for a provider to display it, it | sounds like the attackers win. | devy wrote: | Good point! The economy of scale is one of the ways to | fight against spam bots | aaron695 wrote: | This only works if you are Soviet Russia vs the USA and | your plan is to ruin the other by draining their money and | you have equal pools of cash. | | Spammers don't want to hurt the company they attack if they | can help it, they need them! | | I don't understand why ReCAPTCHA cost so much though. A | human solving them is cheaper than a computer/human hybrid | creating them? | meowface wrote: | True, the attacker is much less likely to have anywhere | near the funds of the target, and they don't want to hurt | them. | | Regardless of the actual price multiple, it costing | anywhere near the price to serve as the price to solve | just seems to defeat the point. Really, it costing any | money per captcha served just punishes sites that happen | to face a higher volume of bots, even if they're a small | site. It's just going to push the company to switch to a | different captcha service, which may be even cheaper for | attackers to solve. | JakeTheAndroid wrote: | *Cloudflare | StavrosK wrote: | It says it's free for non-enterprises . | yjftsjthsd-h wrote: | Okay, but Cloudflare is very much an enterprise, and lots of | people here are working in such places, so it's a decent | point. 
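As a rough check on the pricing argument in this subthread, the sketch below compares what a site would pay to serve challenges with what an attacker would pay a solving farm, using only the two prices quoted above ($1 per 1000 challenges served, $0.50 per 1000 solved). The function names and the 100,000-challenge volume are illustrative assumptions, not figures from the article or from any vendor.

```python
# Prices per 1000 challenges, as quoted by commenters in this thread.
SERVE_PRICE_PER_1000 = 1.00   # what the site is said to pay to display challenges
SOLVE_PRICE_PER_1000 = 0.50   # what one captcha farm is said to charge to solve them


def total_cost(challenges: int, price_per_1000: float) -> float:
    """Cost in dollars of a given number of challenges at a per-1000 price."""
    return challenges * price_per_1000 / 1000


def serve_to_solve_ratio(serve_per_1000: float, solve_per_1000: float) -> float:
    """How many times more the site pays per challenge than the attacker does."""
    return serve_per_1000 / solve_per_1000


if __name__ == "__main__":
    volume = 100_000  # hypothetical number of challenges, chosen only for illustration
    print("site pays:    ", total_cost(volume, SERVE_PRICE_PER_1000))
    print("attacker pays:", total_cost(volume, SOLVE_PRICE_PER_1000))
    print("ratio:        ", serve_to_solve_ratio(SERVE_PRICE_PER_1000, SOLVE_PRICE_PER_1000))
```

Under those quoted prices the site pays twice what the attacker does per challenge, which is the asymmetry meowface points at; as aaron695 notes, that only matters if draining the site's budget is actually the attacker's goal.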
| StavrosK wrote: | The parent edited their comment, it didn't say "enterprise" | when I commented. | Macha wrote: | It sounds like Cloudflare is paying at least partially in | free/discounted Cloudflare services. | worble wrote: | A little off-topic, but the article mentions they support Privacy | Pass. I remember seeing the announcement a little ways back when | they first released it but just kind of forgot about it. Is | anyone using the browser extensions? Has it reduced the amount of | captchas you end up seeing, or made your browsing experience | better in any way? | chrismorgan wrote: | A few days ago I encountered this when Cloudflare decided my IP | address (which is behind an ISP-level NAT) was suspicious all of | a sudden (which it hadn't been doing, a pleasant change from when | I was at this location three years ago when half the internet | sprouted Cloudflare CAPTCHAs at me). It was _awful_ to solve, | worse than the substantial majority of reCAPTCHA checks I've | encountered. Certainly _nothing_ like the illustrations in the | article. | IAmEveryone wrote: | I had the same experience. But this may just be an artefact of | humanity now having been trained exceptionally well to identify | traffic lights and busses, but being relative novices at | identifying elephants. | | And now I'm wondering if this may not be a spectacularly useful | tool to raise standards of education world-wide. Imagine, say, | the French government buying them and asking every person on | the internet twice a day to match some vocabulary to images: | Identify "le baguette"! _Lingua Franca_ , le sequel. | | Or a maps puzzle: "Please identify Equatorial Guinea, Papua New | Guinea, and Guinea-Bissau". | hbvvvvgff wrote: | I tried a hcaptcha and it was way harder to solve than the | usual recaptcha. However, It was significantly easier than the | recaptchas you get when using tor. | datafix wrote: | Hey, I interviewed with them a year ago. 
Their captchas are | actually harder than reCaptcha's. | kevindong wrote: | There are plenty of services that will happily accept a | screenshot from a developer, send it out to live humans who solve | it in real time, and then return the answers to the developer. | | I'm not going to link to them, but you can find them yourself by | googling "buy recaptcha solver". The prices for the top two | results are $0.50 and $1.39 per 1000 solves (respectively, | $0.0005 and $0.00139 per solve). | | At that price point, it's feasible for the truly determined to | just use those solvers to bypass ReCAPTCHA (or similar services). | outloudvi wrote: | 1. I think challenges from hCAPTCHA are harder than reCAPTCHA's. | They are even further from human-friendly than | reCAPTCHA's. | | 2. hCAPTCHA seems to be using a revenue model similar to early- | stage reCAPTCHA's, and it even pays its users. I doubt that its model | is sustainable. | | 3. A huge company like Google may not be able to handle user data | well, so will a small company be able to? | aaron695 wrote: | So no one can turn free human labour into enough money to pay | hosting fees? | | And given spammers a lot of the time are messing with Google, | it's also in Google's interest to do this for free! | | What are they thinking? Is this one department making $100 | internally while killing $1000 in another internal department? | jccalhoun wrote: | I've run into hCaptcha a couple of times recently and found it vague, | and I had to try to guess what they meant. Both times it asked me | to identify the truck. Well, what do you mean by "truck"? Are you | counting a semi as a truck? I ended up having to do it twice | because I don't consider a semi a "truck" but they did. | tcd wrote: | It's funny that we need to ensure humans are the ones performing | certain actions like making a purchase or accessing a service, | but we let machines make decisions over very important matters in | our lives (credit/financial decisions). 
| | It's intriguing they said Google will charge for reCaptcha. Any | information on that? I can't imagine all the small business | owners will have to start paying, but perhaps if they did they'd | just remove it altogether (a net win!). | rstupek wrote: | Did anyone notice that hcaptcha runs on top of Ethereum? | TechBro8615 wrote: | This is fantastic news for privacy on the web. Thank you, | Cloudflare! | | I've been seeing hcaptcha in more and more places recently. It's | a bit rough around the edges still, but it works well and feels | far less hostile than recaptcha. | aeonflux wrote: | This is what I recently got on CF's HCAPTCHA (look closely): | https://imgur.com/a/QZNHmUC | alberth wrote: | I see 2 clear images of dogs. 2 possible dog images. And zebras | humping. | | Nice. | garaetjjte wrote: | Can we get back text-based captchas instead of annoying whack-a- | mole photo picking? | theandrewbailey wrote: | No. Photos of street things are much easier to pick out than | warped or miscolored text. | hombre_fatal wrote: | Especially the amount of warping you apparently need to do to | text to make it hard for a neural network these days. | [deleted] | jasonhansel wrote: | Has anyone else seen reCAPTCHA getting way more difficult of | late? It often takes me a full minute to find all of the tiny | traffic lights hidden away in a set of low-quality images. | ship_it wrote: | Just use Buster[1] | | [1] https://chrome.google.com/webstore/detail/buster-captcha- | sol... | jsjddbbwj wrote: | What I don't like is that Buster doesn't work with hcaptcha... | | But I don't live in a shitty country so very very rarely do I get | captchas from Cloudflare. | [deleted] | elric wrote: | It's a start. reCAPTCHA is a notorious pain in the arse for | anyone whose browser isn't Chrome and for anyone who doesn't keep | cookies. I'm not sure if hCaptcha will be better, but it's hard | to imagine it being any worse. 
| rfoo wrote: | On the other hand, their new HCAPTCHA is a notorious PITA for | anyone, including those whose browser is Chrome and keep | cookies. | | Browsing an "I'm under attack"-mode website behind Cloudflare | has been super annoying for me since last week. To the point | that I usually close the page when I see a HCAPTCHA. Their | visual challenge is harder to navigate than reCAPTCHA, and | because this is their business model I suspect they have | incentive to make it easier. | heinrich5991 wrote: | Why do you assume that a good CAPTCHA should positively | discriminate Chrome users? | kevindong wrote: | I know this is a common complaint, but I personally have no | issues on both macOS and iOS Safari. | zelphirkalt wrote: | Perhaps the privacy problem for you is then one of the | following: | | - Ad blocking extension not installed or rules too lax - | Script blocking not enabled - no VPN used - stores tracking | Cookies | | If all of those do not apply to you, I would feel | discriminated against by Google, even more so, than usual. | kevindong wrote: | To address each of your points: | | 1. I do have an ad blocker installed, but it's not very | aggressive. | | 2. All scripts are enabled. I already have trouble with | some sites due to my fairly lax ad blocker. | | 3. I do not use a VPN (since it just transfers who is able | to see my traffic from one party to another). Additionally, | virtually every service provider penalizes VPN IPs to the | point where it's probably not worth the hassle. | | 4. Not sure what you mean by "stores tracking Cookies". | | --- | | > If all of those do not apply to you, I would feel | discriminated against by Google | | I do not agree with that (mostly because of point 3). The | reality is that VPN traffic is significantly more | "spammy"/bot-filled than non-VPN traffic. It's a perfectly | rational and justifiable way to protect sites (albeit | ReCAPTCHA is of dubious effectiveness). 
| zelphirkalt wrote: | I am not arguing against protecting one's website from | bots, nor am I saying that VPN traffic is not spammy in | practice. Up until that point I am with you. However, | making use of ReCaptcha is certainly not an ethical and | therefore not a justifiable way of doing it. | | Doing all of the stated things these days has become a | minimum for protecting your privacy online. The current | situation is quite bad for privacy-conscious people. | Even if we only trust first-party scripts and do not | allow them to be loaded from a subdomain, which actually | has all the third-party scripts again, we still face | issues, for example fingerprinting. | | I can only laud websites that can be used completely | without third-party scripts or perhaps even without | scripts at all, making sure it all works with REST and | offering alternatives when scripts are blocked. | | It's good to see some "competition" in this area, even | if I do not trust Cloudflare either. More competition | means less Google monopoly. Hopefully in the long run it | will lead to better solutions for casual users. | rectang wrote: | You probably browse while logged in to your Google account. | Right? | kevindong wrote: | Yes. | Kalium wrote: | The hCaptcha website states that whoever publishes it will get | paid for the data labeling work users do. | | Not to wax cynical, but this seems like it might not encourage | better behavior in every possible scenario. | sgc wrote: | I doubt it's enough money to tank people's traffic over it. | Kalium wrote: | I'm quite sure you're correct. When stacked against however | much Google was going to charge (I assume more than zero), | Cloudflare's incentives seem pretty clear to me. | [deleted] | 0xff00ffee wrote: | How can someone demonstrate this claim? | | reCaptcha is wildly sophisticated under the hood[1]. 
I use it | on all three major browsers and find the number of challenges | varies from 0 to 4: sometimes it says I'm verified without | doing anything, other times I need to go through 4 screens. | | I would love to see someone put some numbers behind this claim, | because I think it is false. | | [1] | https://www.blackhat.com/docs/asia-16/materials/asia-16-Siva... | | EDIT: Are you downvoting because you don't like reCaptcha, or | because you can't (or won't) set up an experiment to | demonstrate this claim and prefer to just jump on the | bandwagon? | zaarn wrote: | I've experienced reCaptcha simply looping forever. After | solving 5 or so screens, I give up and hope that reloading | the page works. If not I usually switch to Chromium, which | doesn't even get a single puzzle, just verified. | | That is my repeatable experience as the end user. | greglindahl wrote: | My heavily adblocked FF has a lot of trouble with recaptcha, | while the Chrome instance that I only use for logged-in | Google and LinkedIn doesn't. It seems like there are enough | moving parts that it would be hard to figure out why our | anecdotes are so different. | tgv wrote: | By now, I almost immediately close a page with a reCAPTCHA, | because the stream of buses, traffic lights, and cycles never | seems to end when you're using Firefox. And then it says "too | many requests from this computer" and refuses to continue. | dicytea wrote: | Does it help if you change your user agent? | tcd wrote: | I'm amazed Mozilla hasn't sued Google for discriminating | against their browser - I also use Firefox and suffer | endlessly using privacy tools. I can prove there are no more | busses and I'm 100% right, but I can predict 100% of the time | it'll say "please try again". | | The pattern seems to be 2/3 'right' guesses. on sites like | eBay, the captcha is broke on firefox. I complete it, and it | says "you need to resubmit this form again", and reloads the | entire page. 
| | That's the cost of privacy: broken pages and refused access | because Google says "NO!". | | And businesses are okay with Google denying them money. I | wonder whether, if they did a cost/benefit analysis, they would find it | worthwhile. | | Thanks to Google, I've actually saved quite a bit of money; | they lost out on hundreds recently when their automated systems | decided to refuse my transaction. Their loss and my gain. | rurp wrote: | >I can prove there are no more busses and I'm 100% right, | but I can predict 100% of the time it'll say "please try | again". | | I frequently run into the same issue of having correct | answers rejected, and have read posts from many others who | experience the same. At some point I started intentionally | picking random squares for the first couple of image sets. | Interestingly, it doesn't seem to end up taking any more | submissions overall than when I try to pick the right | answers from the start. | | Plus, polluting Google's free work data set ever so | slightly gives me a small amount of pleasure. | _jal wrote: | > And businesses are okay with Google denying them money | | Make sure they know. I write to sites and tell them they | just lost a customer because Google doesn't give a shit. | I've gotten replies from smaller outfits that had no idea | what was going on. | jakear wrote: | I think their cost analysis would mark you as a bot that | got stumped by the captcha, and thus as a net benefit. (Sales | to bots are worse than not selling, else they wouldn't | implement this at all) | zelphirkalt wrote: | Well said, could not have expressed it any better. I do the | same and refuse to use a website or service that tries to | put me through this garbage. | zzo38computer wrote: | I have managed to successfully solve the audio CAPTCHA | before (even though the pictures are impossible to solve), | although now they must have disabled it because it doesn't | work. 
| fludlight wrote: | Google pays Mozilla to be the default search engine in | firefox. This is Mozilla's main source of revenue, so I | doubt they will sue. | jakear wrote: | I wonder why they don't negotiate with Msft to use Bing | or even DDG instead. Seems... incredibly odd... to put | oneself in a position where a third party is directly | antagonizing your users, reducing your user satisfaction | and likely dramatically increasing churn, but you can't | do anything about it because that same party is your main | source of funding. | | (Disclaimer, I work at msft. Nowhere near this though). | propinquity wrote: | Yahoo was the default from around 2014-2017 in the United | States. | jdashg wrote: | I'm not sure why you think they don't negotiate with | other search providers. | jakear wrote: | The fact that they're still on google even though google | is screwing over their userbase? I don't use Firefox | because of how difficult it makes captcha. There are | others like me. | | If they are negotiating with other providers, they | certainly aren't doing a very good job of it. | spsrich2 wrote: | I hate Hcaptcha. It keeps presenting the same challenge over and | over again. Everytime I need to access a site it protects it | wastes so much time. | [deleted] | _nickwhite wrote: | From the article: | | "We evaluated a number of CAPTCHA vendors as well as building a | system ourselves." | | and | | "We worked with hCAPTCHA in two ways. First, we are in the | process of leveraging our Workers platform to bear much of the | technical load of the CAPTCHAs and, in doing so, reduce their | costs. And, second, we proposed that rather than them paying us | we pay them. This ensured they had the resources to scale their | service to meet our needs. While that has imposed some additional | costs, those costs were a fraction of what reCAPTCHA would have. | And, in exchange, we have a much more flexible CAPTCHA platform | and a much more responsive team." 
| | So Cloudflare are basically cloud hosting hCAPTCHA's services. I | wonder why Cloudflare didn't just buy them, as it seems like it | would be a win-win: they would get an excellent CAPTCHA service and | not have to build it themselves. | beojan wrote: | I suspect that might happen eventually. | IAmEveryone wrote: | CF likes the CAPTCHA part of CAPTCHAs, but any vendor is | probably far more invested in the "generating ML training data" | scheme. | | CF probably has zero interest in that part of the product: It | doesn't fit with their existing products or customers, and | it's just too small relative to their other business to devote | much attention to it. | | At the same time, the business opportunity is probably too | large for hCAPTCHA's founders to just forget about it, or for | CF to compensate them on the hot-new-technology assumption when | they're only looking for peace-of-mind-utility tech. | dathinab wrote: | At the end, they mention that their long-term goal is to | eliminate captchas fully if possible. | cinbun8 wrote: | > Earlier this year, Google informed us that they were going to | begin charging for reCAPTCHA | | So it came down to cost. | | > Over the years, the privacy and blocking concerns were enough | to cause us to think about switching from reCAPTCHA. But, like | most technology companies, it was difficult to prioritize | removing something that was largely working instead of brand new | features and functionality for our customers. | | I like that they're upfront about this. In most companies / teams | of this size, these issues are always swept under the carpet | until something ugly forces you to clean up at a later point in | time. It's just unavoidable. | alexnewman wrote: | Hey everyone. HCaptcha founder here. We are so happy to be on | Hacker News. I'm curious if anyone is having any problems? We are | trying hard to respond carefully to customer requests but as you | can guess we are very busy. 
Also we are hiring :) | [deleted] | cm2187 wrote: | > _But, sometimes, when we're not 100% sure if something is | malicious or good we issue it a "challenge"._ | | I think they meant "bot or human", not "malicious or good". Bot | != malicious. And these challenges will do no good to non- | malicious bots. | [deleted] | lucideer wrote: | I think you're confusing intent with implementation. | | You're right that the implementation excludes non-malicious | bots and fails to solve for malicious humans, but that just | makes it an imperfect implementation of the intent, which is to | differentiate malicious & good. | Legogris wrote: | Apart from the surveillance aspect, one thing that bothered the | hell out of me with Cloudflare using ReCAPTCHA was that it | left a much larger part of the web than necessary effectively | blocked in China, since the CAPTCHAs would get triggered, and not | load, from Chinese IPs. | | I had a customer where we had to migrate away from Cloudflare for | this reason - this was about 5 years ago and the issue has been | there to this day. Glad to hear they've finally done something | about it. Even if it took Google starting to charge money for | ReCAPTCHA to trigger it. | shp0ngle wrote: | > Earlier this year, Google informed us that they were going to | begin charging for reCAPTCHA. | | Wait. Is this news? I don't see any other articles about this. What is | the pricing? | zachware wrote: | One of the more insidious elements of ReCAPTCHA is its propensity | to challenge users who have robust cookie blocking in place. So | as we encourage people to be more privacy-aware, the web gets | harder and harder to use. | | We've seen ReCAPTCHA pop up all over ecommerce, all over benign | websites with little to no need to challenge users, almost | entirely because of the increase in privacy-aware users. 
| | ReCAPTCHA essentially flies in the face of the recent blocking | features rolling into Safari and Firefox and of the number of privacy-aware | users, which is growing by the day. | | In many ways it's a genius structure from Google. 1. Convince | people to use your privacy challenge. 2. Serve it when you don't | see Google tracking cookies. 3. Offer a way around that with the | least privacy-aware browser available (Chrome use is growing | steadily month over month). | | So good on Cloudflare. | noad wrote: | You're forgetting the main benefit for Google, which is getting | humans to train all their vision models for free. At one point | they were just forcing X% of clicks to fill out a captcha | regardless of origin or identity just to get more data. | | I for one am getting quite tired of trillion-dollar | corporations getting things for free out of me. Hard pass. | weinzierl wrote: | > You're forgetting the main benefit for Google, which is | getting humans to train all their vision models for free. | | Is this still true? I have been seeing the same types of images for | years and there might be 7 or 8 different categories but | that's it. To me reCaptcha looks like a service well into its | maintenance phase. If it was actually in use for training | purposes you might expect images to match a wider range of | tasks. | krut-patel wrote: | I could swear I've seen challenges with night scenes (low | light conditions) in the wild. Those were definitely not | present earlier. | airstrike wrote: | I've lost track of how many times I've had to read house | numbers from Google Street View... | sli wrote: | I haven't gotten one of those in years. These days it's | just picking out buses, cars, traffic signals, and | sometimes motorcycles. Maybe once in a while it'll ask | for storefronts. | eythian wrote: | Most of mine lately have been traffic features also. This | is a little tricky in some cases, e.g. 
with crossings, as | it sometimes gives me things that I don't think are | crossings but that it insists I select; perhaps they are in | the US, or the perspective is weird, or someone else has | told it that a series of white squares is a crossing and | it requires me to agree. | grishka wrote: | Except in this wonderful new world, you don't get the choice | to "hard pass". As someone whose ISP has too few public IP | addresses, I see Cloudflare's "one more step" pages at least | several times a month. It's terrifying to realize just how | much of the internet is behind that thing right now. | derefr wrote: | If that was still the main benefit for them, they wouldn't be | planning to start charging for it, because that would--and, | as this article shows, has--cut off much of that data flow, | as reCAPTCHA clients abandon the service for another one that | _isn't_ charging them. | mikkelam wrote: | I really don't think the challenges we're given are still | hard for computers. A lot of these are super simple; Google | would've cracked many of the driving ones years ago. | 0xff00ffee wrote: | Did you even RTFA and look at hCAPTCHA? hCAPTCHA couldn't be | more grossly focused on neural-net training. Hell, one | challenge asks you to draw a bounding box and another is | classification tagging. | Nicksil wrote: | There was no argument being made for HCAPTCHA in the post | to which you replied. So, yeah, everything you mentioned is | indeed gross, including Google's behavior. | 0xff00ffee wrote: | The parent post was edited. | Analemma_ wrote: | This really shows how popular perceptions of Google have | changed for the worse over the years. I remember when | RECAPTCHA was first launched, everyone knew right away that | it was just helping Google train their vision models, but at | the time we all thought it was cool, like "Wow, I'm helping | the cause of AI research at the same time as stopping spam". | But now it just pisses everyone off. 
| | Hell, for a little while Google had a game (can't remember | the name of it) in which you labeled images with another person | to get points, and people loved it. | machello13 wrote: | At least the original reCAPTCHA was used for OCR'ing public | domain books. Even if it had the effect of training | Google's OCR tech, it was at least making literature | searchable and indexable for the public good. Modern | reCAPTCHA is nothing more than training for Google Maps | and, seemingly, self-driving cars, both of which are | commercialized. | hombre_fatal wrote: | > But now it just pisses everyone off. | | Though we're still just talking about a few HNers here who | complain about doing "free work for Google", not the broad | population. | GuB-42 wrote: | > One of the more insidious elements of ReCAPTCHA is its | propensity to challenge users who have robust cookie blocking | in place. | | It is understandable and I expect HCAPTCHA to do the same | thing. The goal of a CAPTCHA is to identify you as a human. I | don't know how ReCAPTCHA works, but I expect it to be like spam | filters: they have a sample of bots, a sample of humans and | assign weights to every aspect. In the end, the algorithm spits | out a probability of you being human, and it will challenge you | until it reaches a set value. | | The thing is: if you hide everything for privacy reasons, you | are making yourself indistinguishable from anything else using | HTTP, including bots. That's the point, but it also means the | only way to prove you are human is through a challenge. | | Think of it like a private club. If you are a regular, the | bouncer is likely to recognize you and let you in without | asking anything. But if you don't want to show your face, you | will need to show your membership card every single time. | That's the price of anonymity.
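A minimal sketch of the spam-filter-style scoring GuB-42 speculates about above. Everything in it is an assumption made for illustration: the signal names, the weights, the bias, and the 0.9 threshold are invented, and nothing here reflects how reCAPTCHA or hCaptcha actually work. It only shows the shape of the idea: weighted signals go through a logistic function to give a "probability of being human", and each solved challenge adds evidence until a set value is reached.

```python
import math

# Hypothetical signals and weights, invented purely for illustration.
WEIGHTS = {
    "has_known_cookie": 2.0,
    "executes_js": 1.5,
    "residential_ip": 1.0,
    "solved_challenge": 2.5,  # applied once per solved challenge
}
BIAS = -2.0       # assumed prior: an unknown client starts out looking bot-like
THRESHOLD = 0.9   # assumed "set value" at which the client is let through


def human_probability(signals):
    """Logistic score over whatever signals the client exposes."""
    score = BIAS + sum(WEIGHTS[name] * value for name, value in signals.items())
    return 1.0 / (1.0 + math.exp(-score))


def challenges_needed(signals, max_challenges=5):
    """Number of solved challenges needed to reach THRESHOLD, or None if never."""
    signals = dict(signals)  # do not mutate the caller's dict
    for solved in range(max_challenges + 1):
        signals["solved_challenge"] = solved
        if human_probability(signals) >= THRESHOLD:
            return solved
    return None


regular = {"has_known_cookie": 1, "executes_js": 1, "residential_ip": 1}
private = {"has_known_cookie": 0, "executes_js": 1, "residential_ip": 1}
hidden = {}  # exposes nothing at all

print(challenges_needed(regular), challenges_needed(private), challenges_needed(hidden))
```

With these made-up numbers the "regular" passes without a challenge, while the clients that expose fewer signals have to solve progressively more of them, which is the bouncer analogy in code form.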
| Kalium wrote: | One of the non-obvious consequences is that any system designed | to use technical measures to distinguish between humans and | computers will wind up very sensitive. There's an arms race, | and us real users are caught in the middle. | | There's a _vast_ army of computers doing their best to pretend | to be human. The whole point of any kind of CAPTCHA is to try | to catch them out - and every measure gets worse over time. So | companies like Google look at everything they can see that | helps them distinguish typical humans from robots. | | This has a nasty side-effect. A lot of measures intended to | preserve privacy have the incidental effect of making the | privacy-sensitive user look more like a computer and less like | a human. Not saving cookies and not executing JS are classic | bot moves. This plays directly into the sensitivity that has | been engineered over time in order to catch more computers | posing as humans. | | I don't know any easy resolution to this tension. Maybe you do? | I really hope so. The internet is overrun with abusive behavior | and the amount of work that goes into keeping it at bay is | staggering. | jjoonathan wrote: | That, and ReCAPTCHA had hellbans. | | If you blocked cookies or were otherwise problematic, it would | sometimes lock you out of all ReCAPTCHA-gated resources not by | giving you a message describing what was happening, why, and | how to fix it, but rather by simply pretending that your every | attempt to solve the captcha failed. Obviously this is | extremely frustrating, by design, but it gets even more so with | compounding factors like "the library is closed at this hour, | so I can't get a fresh connection." | | The worst I've seen has been when it happens to people who | aren't well equipped to guess what's happening. 
When my | friend's younger brother got hellbanned from his PlayStation | account, he spent 30 minutes trying to identify traffic lights | (or whatever) and then retreated crying to his room, because he | wasn't able to deduce that Google was gaslighting him. He | trusted Google. They had him convinced that he was such a | failure he couldn't even identify traffic lights correctly, and | he was -- quite reasonably -- inconsolable for a while. | | Thanks a lot, Google. | _jal wrote: | Captchas are fundamentally anti-human. I'm not saying there | isn't a problem to be solved, I'm saying Captchas are a | behavior enforcement mechanism overseen by robots and are | anti-human. | | I write the site owner a short note when they go bad, explaining | why they just lost a customer, and go somewhere else. Life is | too short to put up with shitty tech. | Kalium wrote: | What, in your opinion, is the pro-human way to address the | problem to be solved? | | I'm always curious to hear what other approaches might be | worth considering. CAPTCHAs tend to tick the boxes of | performing well enough for website-controllers and being | low-effort for them to deploy. | jfengel wrote: | Blockchain, perhaps? | | A lot of CAPTCHAs protect things that are very cheap, but | where they don't want it to be free. One solution would | be to charge money, but people concerned about privacy | won't want to give away conventional payment information. | | So, perhaps a nominal payment in some reasonably | anonymous cryptocurrency? Or even just participating in | some proof-of-work problem that would cost a few cents | worth of electricity? | | That wouldn't stop really serious botnets or people with | stolen credit cards, but those are also both illegal and | should be shut down for other reasons. | jjoonathan wrote: | Less gaslighting. | | There's a lot of ground between "error messages precise | enough to effectively give botters a to-do list" and | "faking failures 100 times in a row."
What was the | marginal utility of the 99th fakeout? Are there really | enough otherwise effective bots that get persistently | tripped up by this particular fakeout to justify sending | the poor kid crying to his room? | | Almost certainly not. What really happened is that | someone removed (or never added) user communication in | order to maximize their score against botters and gave | little thought to mitigating their false positives. | Minimizing them, yes; mitigating them, no. "Humans are | smart, they'll figure it out," they rationalized to | themselves, and called it a day. They never bothered to | calculate (or even guess) when the marginal utility of | the fakeout dropped far enough to allow them to have | mercy on the poor humans still caught in their web. | _jal wrote: | I have no suggestions for the general case, and suspect | it is one of those problems that doesn't have a general- | purpose solution. That doesn't mean captchas don't suck. | | As for specific things one can do, like anything, more | effort means better results. I'm not going to talk about | this much, but we do look at a lot of different | behavioral and other signals for fraud detection, as | that's an important aspect of our business. | | If others are fine with annoying their customers to | offload risk, they can make that call. I don't have much | sympathy about lost sales, though - it is literally | choosing to waste customers' time and increase | frustration for one's own benefit. | quotemstr wrote: | You've made an assertion, not an argument. What does "anti- | human" even mean? You're angry, sure, but you haven't | expressed what exactly it is that you're angry about. Nor | have you proposed a realistic alternative way to | distinguish bots from humans. This kind of histrionic, | sweeping hot take is not productive.
| karatestomp wrote: | Considering captchas operate by pushing the work of | avoiding bots on your site (your problem) onto _all the | human users of your site_, I think on the basis of that | alone "anti-human" is warranted. Or "anti-social", if you | prefer, which might better capture the fundamental | problem with that aspect of it. That they proceed to | perform textbook gaslighting on some of those people | makes it even worse ("no, you _didn't_ select all the | buses in those images" but, of course, you did). Whether | these things are necessary for it to operate is beside | the point. | RadiantUnicorn wrote: | I don't think I've ever been "hellbanned", but I've certainly | spent more than 5 minutes trying to get a captcha to work. | | After a while I usually need to ask friends in the US to help | me, because it asks me a non-localized question. | | My favourite question was: Select all fire hydrants. | | I selected only the classic red ones you see in movies. | Fail. | | I selected the ones that were yellow too. Fail. | | I sent a picture of the grid to a friend. He spotted that | some of the pipes on a wall were fire hydrants, which I | didn't know. Pass. | | In my country we don't have hydrants. We have holes in the | ground that are covered by a lid. After removing it you can | attach the water hose there. | dheera wrote: | It is pretty straightforward to train a neural network to | solve these -- e.g. fire hydrants, traffic lights, cars. | | I would have thought ReCAPTCHA would take into account | human factors (e.g. speed of clicking) as a higher priority | than the accuracy of the selection. | wyre wrote: | AFAIK it takes into account mouse movement and the speed | of clicks. | therein wrote: | In my experience, relatively easily defeated by `await | Promise.delay(randomDelay())` | dheera wrote: | Sounds like a cat and mouse game. | | Mouse: They could then try to analyze human delay | randomness -- it's probably not uniform.
| | Cat: And then someone will come up with a replacement to | randomDelay that mimics the above pattern. | | Mouse: And then they will look for changes in the | distribution itself from person to person | | etc. | MertsA wrote: | I know back in the day for RuneScape bots using SCAR | there were macros to move the mouse from one position to | another on the screen with randomized acceleration, | randomized curvature, overshoot, clicking in some | bounding box, etc. all using a normal distribution in an | effort to thwart detection. Imagine being the poor | developer tasked with trying to recover some signal out | of that. | floatingatoll wrote: | Regularity of clicking is considered a sign of robot | behavior, which is especially frustrating if you learned | to perform repetitive image identification mouse tasks in | a computer with rhythmic regularity (think Turk, for | example). | jjoonathan wrote: | Yeah, I should break down my methodology for arriving at | the "hellban" conclusion. | | If I get a bunch of failures in a row, I'll first try the | refresh button built into the captcha, and then re-solve a | number of times. Then I'll try re-loading the page and re- | solving, then I'll try in a different browser with cleared | state and re-solving, then I'll try a different device and | re-solving, and finally I'll try a different connection, | device, and cleared browser state and re-solving. | | I'll consider something a hellban if I get persistent | failures across several different challenge types but | switching to a clean connection+device+state results in | immediate success with the captcha. | | Look, I get it, they can't be too explicit with the errors | or they tip their hand to the botters and effectively give | them a "to-do" list. Still, the gaslighting is persistent | enough that there's just no way it's marginally beneficial | all the way through. At some point, everyone figures it | out: bots, techies, and normies. 
My guess is that they | figure it out in this order, from quickest to slowest: | smart bots, techies, normies, dumb bots. I'm not calling | normies dumb here, they just don't have much background | knowledge about the inner workings of captchas, so it takes | longer. By that point, they're so far past the typical | number of captcha attempts that only the very dumbest of | bots, those without heuristics to detect this sort of | thing, are going to be fooled along with them. Surely | having the captcha tip its hand at this point -- which only | gives an advantage to the dumbest of bots, because the | smart bots figured it out long ago -- is the right thing to | do. | | Re:CAPTCHA has no mercy on the normies, and I really think | they could do a lot better. | 101404 wrote: | I see a million dollar lawsuit for discrimination >:-} | therein wrote: | Now imagine if that ReCAPTCHA was served on an equal | opportunity lender's website or on a job application | form. | Kalium wrote: | How, in your opinion, should Google have handled the matter | in a way that does not give spammers or other abusive users | ways to get around the measure? Bear in mind that any such | approach has to be scalable to many zeros daily, the vast | majority of which will not be _empathically awful_ cases like | your brother 's very real pain and distress - most will be | genuinely abusive behavior. | | I want to be clear that I am not attempting to minimize your | brother's pain or emotional suffering. I'm hoping that there | might be an approach that's kinder and more compassionate to | him while still accomplishing the same goals. | jjoonathan wrote: | > the vast majority of which will not be empathically awful | | Yeah, most of the time it's "just" really, really | obnoxious, not to mention coercive in a way that aligns | with Google's interests. | | Thanks, Google. 
| | > How, in your opinion, should Google have handled the | matter in a way that does not give spammers or other | abusive users ways to get around the measure? | | "Our anti-spam systems believe that you might be a robot. | Your profile has been locked for (x) minutes. Sorry for the | inconvenience. Go _here_ to learn tips & tricks for | avoiding lockouts in the future." X gets exponentially | ramped. | | Note how vague the message is. It sacrifices the | opportunity to tarpit a really dumb robot in exchange for | not being awful to humans. | | Based on ReCAPTCHA's design decisions, it's abundantly | clear that eking out every sliver of a percent of marginal | efficacy takes priority over treating users humanely. | That's why I have a problem with ReCAPTCHA. | Kalium wrote: | In my opinion and experience, ReCAPTCHA isn't really, | really obnoxious most of the time. I suspect that most of | the time it trips up bots who have no emotional | experiences whatsoever. Most of my personal encounters | with it involve solving no puzzles whatsoever. With that | in mind, I expect humans and their completely real | reactions might not be the default case. Of course, this | is speculative, as I do not have any kind of special data | on the subject. | | Thank you for sharing! Have you considered the | possibility that presenting any message at all - | especially one with a clear block time - is sending a | very clear message to bot controllers? I'm sure you've | considered this, and I am just failing to understand. | Wouldn't that remove any real gains from being vague with | tips & tricks? | | Wouldn't there also be the real chance that vague tips & | tricks would leave an actual human being in tears, | convinced that they're just too dumb to understand them | properly? | close04 wrote: | > In my opinion and experience, ReCAPTCHA isn't really, | really obnoxious most of the time. | | The percentage of that time goes up as you move away from | Chrome and Google cookies.
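For reference, the "X gets exponentially ramped" lockout that jjoonathan proposes earlier in this subthread could look roughly like the sketch below. The base of one minute, the doubling factor, and the one-day cap are assumptions picked for illustration; the comment itself gives no numbers, and this is not how any real CAPTCHA service is known to behave.

```python
def lockout_minutes(strikes, base=1, factor=2, cap=1440):
    """Lockout length after the Nth consecutive lockout-worthy event.

    base, factor and cap (one day, in minutes) are illustrative assumptions.
    """
    if strikes < 1:
        return 0
    return min(cap, base * factor ** (strikes - 1))


def lockout_message(strikes):
    """Deliberately vague notice: says that a lock exists, not why."""
    minutes = lockout_minutes(strikes)
    return (
        "Our anti-spam systems believe that you might be a robot. "
        f"Your profile has been locked for {minutes} minutes. "
        "Sorry for the inconvenience."
    )


for strikes in (1, 2, 3, 4):
    print(strikes, lockout_minutes(strikes))
```

The message reveals a single bit (locked or not) plus a duration, which is the tradeoff argued over in the replies that follow.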
| Kalium wrote: | I don't think Chrome has ever been my daily driver. | | That said, I also expect to be treated with more | suspicion when I behave more like a bot. So I'm neither | surprised nor bothered when Firefox Private gets me an | uptick in ReCAPTCHAs. I understand that this is a highly | unusual expectation. | jjoonathan wrote: | > I suspect that most of the time it trips up bots who | have no emotional experiences whatsoever. | | I'll bite: maybe it's good at identifying obedient drones | and letting them through :) | | It trips up the normies in my life often enough that I | suspect being technically inclined is actually a net | advantage because it makes you quick to detect the | problem and quick to apply workarounds. Those advantages | are significant enough to outweigh even the cost of the | semi-regular dance where I try to protect myself and | Google jerks my chain. | | > Have you considered | | The fact that I phrased my proposal as a tradeoff should | have strongly hinted that I did, in fact, consider. | | > Wouldn't that remove any real gains from being vague | with tips & tricks? | | One bit of information -- locked vs not -- is hardly the | same as disclosing the inner workings, or even the | information inputs, of the classifier, and smart botters | have access to that bit of information anyway because | they've built a gaslight detector by leveraging their | legions of diverse bots and endless supply of dirt cheap | human labor. | | Gaslighting humans is really bad. A minimal courtesy | would only cost a sliver of efficacy, and ReCAPTCHA still | rejects it. That decision earns it the bad will directed | its way. | thewebcount wrote: | > In my opinion and experience, ReCAPTCHA isn't really, | really obnoxious most of the time. | | Do you use any sort of privacy protection while browsing? | I do a few simple things like browse in private mode by | default, and ReCAPTCHA just cannot deal with it. It | instantly brands my connections as a bot. 
It is | obnoxious. Using private mode shouldn't ban you from the | web. There's no reason that most web sites need to save | data on my computer to identify me later. | Kalium wrote: | That's an excellent question! I can, and do, routinely | use privacy protections when browsing. | | I have not found them to ban me from the web. I'm sorry | that has happened to you. | tcd wrote: | I would _love_ to see the raw data on how many transactions | have been abandoned because of ReCaptcha; if I had to solve a | test to purchase my shopping, I 'd go elsewhere (and there are | places that are not as hostile out there). | | I cannot understand the stupidity of putting your entire | business in the hands of an advertisement company who gives no | shits about you as a business or a person, apart from your | data. | | I can say for certain ReCaptcha has made me reconsider a | purchase and is a major factor in my purchasing decision. If I | can't use all my privacy tools (including noscript, and I only | whitelist a few times to get the right scripts), then I don't | care about what you're selling. | | Hopefully in the near future ReCaptcha breaks altogether due to | enhanced privacy protection. | blakesterz wrote: | > "Earlier this year, Google informed us that they were going to | begin charging for reCAPTCHA. That is entirely within their | right. Cloudflare, given our volume, no doubt imposed significant | costs on the reCAPTCHA service, even for Google." | | Even in the article they say... "Google provided reCAPTCHA for | free in exchange for data from the service being used to train | its visual identification systems." ... I thought this was one of | those win/win things... Google gets something, websites get | something... what's changed? Is Google not getting much out of | reCAPTCHA now? | peeters wrote: | In the article they also say: | | > Again, this is entirely rational for Google. 
If the value of | the image classification training did not exceed those costs, | it makes perfect sense for Google to ask for payment for the | service they provide. | | This might be exacerbated in the case of Cloudflare. Imagine a | system where 99% of the visitors being challenged are human. | The data gathered from such visitors is quiet, quality data. | That fits the use case of validating an anonymous poster on some | random blog. Now consider the Cloudflare use case. Visitors will | only be challenged when Cloudflare already suspects you're a | bot. Most of the challenges are served to bots. The data is | much lower quality, but their cost per challenge has remained | the same. | | It could just be that as this type of use case became dominant, | the balance of value tipped. | gurrone wrote: | I guess this is very true. Our quite elaborate Cloudflare | Firewall setup combining bot management scores with GeoIP and | network information to decide on the action has solve rates | below 0.5% on most rules. | | The only case where we see up to 3% solved is on rules | targeting networks which contain mostly free (as in beer) VPN | providers (the new pest of the internet). Those networks send | a lot of malicious and automated traffic, with 3% of real | users mixed in. | | To put this into numbers of the past 24h: ~76 million | requests served; ~1 million of those were captchas; ~0.5 | million were outright blocked; captchas solved: 1233 | Hello71 wrote: | my bet is that the bean counters have caught up with this | product, and it'll be run into the ground with excessive | pricing, because Google products have to make millions or | otherwise they'll be killed. most notably, Reader. | IAmEveryone wrote: | These complaints about Google "moving too fast" used to | really confuse me. I couldn't really spot a meaningful | difference in mean survival b/w Google products, start-ups | similar to individual Google products, and other businesses' | behaviour.
| | But I've now attained zen-like clarity on the issue: the | complaints are coming only, and always were coming mostly, | from people whose idea of appropriate change over time is to | still complain about Google Reader almost a decade after it | happened. | raxxorrax wrote: | I think using captchas for image recognition was one of the | most ingenious strategies of the modern web. Don't think Google | is making the correct move here. | | Overall I would like to see these checks removed, and Cloudflare | is using them quite excessively. | thewebcount wrote: | > Google provided reCAPTCHA for free in exchange for data from | the service being used to train its visual identification | systems. | | Has this been true lately? Every time I see it, it gives me the | same images from a set of 3. 90% of the time it's classifying | street lights, and it's the same street lights every time. | About 7% of the time, it's pictures with cars in them, and | again, it's the same pictures most times (but in a different | order, I think). The remaining times it's fire hydrants or | store fronts, often in a language I can't read, so I don't know | if it's a store or not. (And again - mostly the same images | each time.) | oefrha wrote: | Seeing that reCAPTCHA v3 doesn't use endless streams of images | any more, I would guess that Google no longer benefits much | from having users tag storefronts, traffic lights, buses or | fire hydrants. Maybe their image recognition algorithm is past | that stage. | robin_reala wrote: | It does as a fallback. But you're missing the main point of | v3, which is that it shifts the legal onus of blocking from | Google to the integrating site. No longer can Google be sued | for accessibility violations, if it's the site that's | stopping the user from entering purely on a suggestion from | Google. | dathinab wrote: | Just because you do some technical workarounds doesn't | mean you get a legal free pass.
| | I don't think this aspect did matter much because it was | always the sites decision to use reCAPTCHA and that didn't | change. | | I also don't think Google gets much profit out of the image | tagging part anymore, they already have a huge database of | tagged images. | crazygringo wrote: | Pure speculation, but at some point your dataset is large | enough. | | The original reCAPTCHA corrected errors in scanned books | published decades/centuries ago. At some point, they're all | fixed. | | Similarly, more recent images have all been of traffic images. | And they probably have way more than enough now -- at least of | the type that can be done by reCAPTCHA. | | So unless Google comes up with a new mass-categorization | problem easy enough for literally everyone to do and simple and | small enough to fit in a reCAPTCHA... then they charge. | dx034 wrote: | It's probably a question of size. Same as with Google | Analytics. Google can afford to offer it free of charge for | smaller websites but charges for larger ones. Cloudflare was | probably one of the heaviest users with a very high percentage | of bots (as they're good in pre-filtering). | maallooc wrote: | I hope this captcha is tor friendly. ___________________________________________________________________ (page generated 2020-04-08 23:00 UTC)