[HN Gopher] NopeCHA: Captcha Solver
       ___________________________________________________________________
        
       NopeCHA: Captcha Solver
        
       Author : zuhayeer
       Score  : 78 points
       Date   : 2022-11-27 22:10 UTC (6 hours ago)
        
 (HTM) web link (chrome.google.com)
 (TXT) w3m dump (chrome.google.com)
        
       | layer8 wrote:
       | Great, now captchas are going to get even more annoying.
        
       | version_five wrote:
       | Re-captcha is about google exercising monopoly power to try and
       | force you to use their browser and let them track you. It has
       | little to do with finding stopsigns or whatever. It would be cool
       | to see that problem addressed, i.e. allow me to use the internet
       | normally without a browser and privacy settings that google
       | endorses.
       | 
       | Incidentally, in almost all cases, if I'm faced with a recaptcha,
       | I just don't do the thing. I have forgone purchases and charity
       | donations, and not used products, because organizations care so
       | little about their customers that they think making us solve a
       | puzzle before we give them money is acceptable.
        
         | lagt_t wrote:
         | Isn't recaptcha browser agnostic?
        
         | crakenzak wrote:
         | Seems like forgoing charity donations over something like a
         | recaptcha is an incredibly bizarre hill to die on.
         | 
         | The main purpose of recaptcha is to prevent bots from abusing
         | services, and has much less to do with exercising "monopoly
         | power to ... track you".
        
           | listenallyall wrote:
           | > The main purpose of recaptcha is to prevent bots from
           | abusing services
           | 
           | It's funny when people think they are adding to the
           | conversation by contradicting a thoughtful and interesting
           | comment (even if it may be a bit conspiracy-theory-ish), by
           | simply re-reciting the corporate line.
        
             | version_five wrote:
             | I call that sort of reply "coin operated" - there is some
             | crude pattern recognition that goes on (dropping any
             | subtlety or new information a parent comment may be
             | providing) and a sort of pre-recorded viewpoint gets spit
             | out. There are certain topics where it's very common.
        
               | listenallyall wrote:
               | I think it also happens sometimes when the parent comment
               | hits close to home... perhaps someone actually worked on
               | implementing a feature for a few years and even from the
               | inside never figured out the _actual_ purpose of what was
               | being built. That cognitive dissonance hits hard when an
               | outsider points it out in black and white.  "Wait, _I_
               | built non-consensual tracking software? But nobody told
               | me they would use it for that!!!"
        
             | jtbayly wrote:
             | I downvoted because I run several non-profit websites, and
             | start without any captchas by default. But forms always
             | end up getting spammed, and then I have to add some
             | protection. This is literally why captchas exist in the
             | first place, as well as why so many sites have them. Google
             | came along and offered a convenient, free version, so many
             | people started to use it, although I don't. Why Google
             | decided to offer it is possibly what you claim, but that
             | changes nothing about why the nonprofit makes use of it.
        
               | version_five wrote:
               | > free
               | 
               | It's not free. It increases friction, and at least in my
               | case, results in abandoned transactions. I'm not well
               | versed in the different options for spam protection (or
               | the attacks) but I do know that most merchants don't make
               | their users solve a puzzle, especially at a critical
               | point along the purchase workflow where it is most likely
               | to get derailed.
               | 
               | The fact that google is (probably unintentionally)
               | particularly appealing to small providers or charities,
               | pretending they offer a "free" product, makes it even
               | worse.
               | 
               | Edit: not an endorsement, but elsewhere in the discussion
               | someone posted a link to cloudflare's captcha solution,
               | which they say specifically addresses the privacy and
               | annoyingness concerns of Google's captcha. So there are
               | options: https://www.cloudflare.com/en-ca/products/turnstile/
               | (I'm not actually familiar with this, it may have a downside
               | I don't know about)
               | 
               | (Also, disagreeing with something is generally a poor
               | reason to downvote. It's much better to have a
               | discussion, and I appreciate your comment)
        
               | jtbayly wrote:
               | > It's not free.
               | 
               | True enough. That's why I don't use it. Google's solution
               | is absolutely awful, and I'm positive you aren't the only
               | one abandoning important flows on non-profit websites
               | because of it.
        
               | ajsnigrutin wrote:
               | Anything that wastes users' time is not really free.
        
       | TylerE wrote:
       | Sigh. And so the next battle in the war.
       | 
       | There is no way this doesn't get abused, including, probably by
       | the Company making it.
       | 
       | So, I'm dreading recaptcha v4
       | 
       | We may already be passing the captcha event horizon where machines
       | actually outperform humans on the damn things.
        
         | meta2023 wrote:
         | abused? it's a captcha solver. I'd argue abuse (from the
         | perspective of the target website/app) is the primary business
         | case.
        
         | wolpoli wrote:
         | Computers are getting better than humans at solving these
         | challenges. So recaptcha v4 might end up being a micropayment
         | system since humans still have more money than bots.
        
           | eli wrote:
           | Several decades of experience suggests micropayments ain't
           | it.
        
           | [deleted]
        
           | charcircuit wrote:
           | >humans still have more money than bots
           | 
           | Simultaneously, humans can be less likely to want to pay than
           | bots, which can skew the bot-to-human ratio.
        
         | bogomipz wrote:
         | >"There is no way this doesn't get abused, including, probably
         | by the Company making it."
         | 
         | How would the company making it abuse this? I feel like maybe
         | I'm missing something obvious.
        
           | TylerE wrote:
           | Sell bulk captcha solving to bad actors.
           | 
           | The extension probably has a hidden limit of 50 solves a day
           | or something
        
         | vbezhenar wrote:
         | In my opinion next gen captcha should be asking user to prove
         | that he's human.
         | 
         | For example ask him to upload his video with his ID. This video
         | will be verified by another human operator.
         | 
         | In the end, user will be given some kind of identifier. He
         | should present that identifier to anyone asking if he's a
         | robot.
         | 
         | Of course that kind of verification will be paid. So you're
         | paying $100 to get a verified identifier and then you keep that
         | identifier (probably in the form of private key with signed
         | public key).
         | 
         | There will be multiple certificate authorities who will issue
         | those certificates to people. Rest of software companies will
         | trust those authorities.
         | 
         | You need to renew that certificate every year.
         | 
         | If someone spots your certificate being used in nefarious
         | schemes, your certificate will be revoked and you'll need to
         | pay a $5000 fine the next time you ask for a new certificate.
         | 
         | If you don't possess certificate, you're not qualified to be a
         | human.
        
           | CasualSuperman wrote:
           | And thus poor people became unable to access the internet
        
           | EMIRELADERO wrote:
           | But then you're imposing an expensive yearly tax on people to
           | use basic services. Very poor people use the internet too!
        
         | bawolff wrote:
         | For-pay captcha solvers have existed forever (e.g.
         | 2captcha.com) and the world hasn't ended yet. I doubt this will
         | change that much.
        
         | fratlas wrote:
         | Maybe the solution is to look for problems where humans'
         | imperfections are identifiable
        
           | TylerE wrote:
           | If a machine can identify it, a machine can fake it.
           | 
           | Maybe given a large enough input, but do you want to spend 10
           | minutes solving a captcha?
        
       | svnpenn wrote:
       | one thing that annoys me is they don't tell you HOW MANY boxes to
       | check. So you don't know if you need to be "conservative" or
       | "aggressive". So I started just clicking a single box, then if it
       | prompts me I will keep adding one until I meet the requirement. I
       | think sometimes it's just one or two.
       | 
       | However some shitheads like Discord also won't tell you how many,
       | and will also outright fail you if you click too few, forcing you
       | to restart the whole multi-test process. So fuck all of it. I
       | fully support this extension, they deserve what they get. They
       | need to figure out how to make it hard to fake, without making it
       | a nightmare for legitimate users.
        
         | vbezhenar wrote:
         | I think that you need to behave similarly to other humans. So
         | I try to think about which squares an ordinary human who wants
         | to get it done as soon as possible would select. Being very
         | careful might actually backfire.
        
         | dankwizard wrote:
         | But it kind of does - Click ALL of the Lions.
         | 
         | If there is a lion in the box.... Click it.
        
           | Blue111 wrote:
           | but what if a car takes 6 squares and two of the squares have
           | a minute amount of car in them... do you need to click it?
        
             | johntash wrote:
             | Usually yes, at least that's how I interpret it. So far, I
             | have not been identified as a bot.
        
           | svnpenn wrote:
           | this is obviously wrong, for the reason I already gave. many
           | times you can click one or two boxes, even though more
           | "correct" boxes might exist. I don't want to click more than
           | needed, that's wasted time. Although to be fair my method is
           | probably slower overall.
        
       | freitasm wrote:
       | From the submitted link we can find the homepage for this
       | extension. You will then find that you can use the service over
       | an API and a pricing page ($4.99/2K daily recognitions,
       | $19.99/20K daily recognitions).
       | 
       | I would say this is useful for spammers and sniper bots.
        
       | odo1242 wrote:
       | I would argue that ReCAPTCHAs still work, at least to some
       | extent. Spamming a form is much easier to do when you don't have
       | to spin up an entire virtual browser to fill out those forms while
       | also paying for the GPU compute necessary to run this ML model.
       | Plus, "click farms" for solving captchas have always existed, at
       | cents per solve.
       | 
       | Plus, ReCAPTCHAv3 makes this entire attack irrelevant by making
       | image classification not a part of the CAPTCHA.
        
         | fastball wrote:
         | This extension actually claims to side-step v3 as well.
        
           | TheCycoONE wrote:
           | I have seen recaptcha v3 bypassed with seemingly little
           | effort by financially motivated spammers. I have also seen
           | them spin up large numbers of Gmail accounts for email
           | verification. I'm curious what people have tried that
           | actually worked.
        
             | fastball wrote:
             | Can't speak for anyone else but we recently implemented V3
             | with V2 as a fallback entirely to help mitigate DDoS
             | attacks. Haven't been hit with another one yet but I have a
             | feeling it will be sufficient.
        
       | MontyCarloHall wrote:
       | The original reCAPTCHA served the dual purpose of fighting spam
       | and training optical character recognition algorithms. It
       | displayed a pair of words, one of which was unambiguously
       | resolved by an OCR and the other of which OCRs couldn't easily
       | read. The first word was used to disambiguate humans from bots,
       | and the second word was used to train the OCR.
       | 
       | Today, CAPTCHAs serve a similar purpose, except they're used to
       | train self-driving cars' image recognition AIs. I always try to
       | be a little subversive and correctly identify the images that are
       | clearly unambiguously classified by the AI, and then purposefully
       | screw up identifying the image that the AI struggles with. It
       | lets me through the majority of the time, which indicates that my
       | bad input made it into their training data.
       | 
       | Unlike the CAPTCHAs of yore, when machine vision simply was not
       | advanced enough to solve them, anyone has access to pre-trained
       | vision models easily capable of identifying the unambiguously
       | resolved buses or crosswalks in the CAPTCHA image. The deterrent
       | to spammers is no longer that actual humans need to solve the
       | CAPTCHA, but rather that it's too computationally expensive to
       | solve them at scale. Today's CAPTCHAs are basically Hashcash
       | proof-of-work [0], but with the added benefit to Google et al.
       | (and annoyance to users) that they help train computer vision
       | models.
       | 
       | [0] https://en.m.wikipedia.org/wiki/Hashcash
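The two-word scheme described in the first paragraph of the comment above can be sketched roughly as follows. This is only an illustration of the idea as the commenter states it, not reCAPTCHA's actual implementation: the class name `TwoWordChallenge`, the `min_votes` threshold, and the case-insensitive matching are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TwoWordChallenge:
    """Hypothetical challenge: one control word with a known answer, one unknown word."""
    control_answer: str  # the word the OCR already read confidently
    unknown_votes: Counter = field(default_factory=Counter)  # human guesses so far

    def submit(self, control_guess: str, unknown_guess: str) -> bool:
        """Pass or fail the user on the control word only; record the other guess."""
        passed = control_guess.strip().lower() == self.control_answer.lower()
        if passed:
            # Only transcriptions from users who got the control word right are kept.
            self.unknown_votes[unknown_guess.strip().lower()] += 1
        return passed

    def consensus(self, min_votes: int = 3) -> Optional[str]:
        """Return the most common transcription once enough users agree on it."""
        if not self.unknown_votes:
            return None
        word, votes = self.unknown_votes.most_common(1)[0]
        return word if votes >= min_votes else None
```

In this sketch the answer to the unknown word never affects whether the user passes, which matches the commenter's observation that a deliberately wrong answer on the ambiguous item usually still gets through.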
        
         | greesil wrote:
         | Very insightful. You forgot to mention "and is solvable by a
         | human". Otherwise the captcha would just be proof of work of
         | some kind.
        
           | MontyCarloHall wrote:
           | Proof-of-work is exactly what Cloudflare's Turnstile CAPTCHA
           | alternative is: https://blog.cloudflare.com/turnstile-
           | private-captcha-altern...
           | 
           | The only reason we still solve those stupid image recognition
           | puzzles is because Google/Waymo and other self-driving car
           | companies have managed to trick us into helping them do their
           | training work for them.
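For readers unfamiliar with the proof-of-work idea discussed in this sub-thread, here is a minimal Hashcash-style sketch. It is not Cloudflare's or Google's implementation, and it uses SHA-256 rather than the SHA-1 of the original Hashcash; the function names, the `challenge:nonce` string format, and the difficulty values are assumptions chosen for illustration.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the number of leading zero bits in a hash digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force a nonce whose hash has `difficulty` leading zero bits."""
    for nonce in count():
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash is enough to check the client's work."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty
```

The asymmetry is the point: solving takes about 2**difficulty hashes on average while verifying takes one, so the cost falls on whoever submits the form, human or not.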
        
         | porphyra wrote:
         | Why do you want to screw up the training data though? You have
         | nothing to gain while making life a little harder for everyone.
        
           | MontyCarloHall wrote:
           | The cynic in me says because I resent being forced to help
           | multi-billion dollar companies crowdsource their AI training.
           | 
           | The techno-optimist in me says because I want to force them
           | to improve their underlying models. When their engineers
           | notice that their model struggles with weird edge cases that
           | I purposefully mislabel (e.g. when prompted to select images
           | containing motorcycles, I also pick a mountain bike with fat,
           | motorcycle-sized tires), perhaps they will contemplate how to
           | rigorously encode the concepts of "motorcycle" and "mountain
           | bike" into their model, rather than simply pushing an
           | abundance of training data through a black box classifier and
           | hoping that by adding more crowdsourced data, it will
           | eventually arrive at the right answer.
        
           | jrm4 wrote:
           | Not if you believe that the people working on this are going
           | too fast and/or have a misguided goal.
           | 
           | I think it's _reasonable_ to believe that real self-driving
           | cars are not inevitable, or even if they are, deliberate
           | disruption of this process is healthy; e.g. it shouldn't
           | rely on something this dumb.
        
             | RockRobotRock wrote:
             | Do you really think reCaptcha data only benefits Waymo?
             | What about Google Maps detecting stop lights? Or wheelchair
             | ramps?
        
       | Sprite_tm wrote:
       | Was wondering how these people make money... looks like you can
       | buy 'enterprise plans' where you can have them solve captchas
       | en masse... Not sure if I agree with whatever people want to
       | make use of that.
        
       | dendav_rai wrote:
       | I cannot for the life of me figure out how this magic works. They
       | claim Deep learning. If someone has some relevant material,
       | please suggest them. Thank you!
        
       | system2 wrote:
       | I am actually surprised to see this extension existing and listed
       | on the extensions page. I bet google will remove this very soon.
       | Unlike adblocks, this is threatening google's security claims.
        
         | judge2020 wrote:
         | > Featured
         | 
         | > Follows recommended practices for Chrome extensions. Learn
         | more
        
           | system2 wrote:
           | Featured until removed. Follows recommended practices at
           | first look until checking and finding out it is not Google
           | TOS compliant.
        
       | kevmo314 wrote:
       | I know a little bit about this "industry". I would be pretty
       | surprised if this is actually done by AI. At least if it is, it's
       | likely only AI-assisted. If it were truly AI, then they would
       | make more money offering their own CAPTCHA service instead of a
       | CAPTCHA-breaking service. You can see how many active workers (ie
       | humans) are online on their network stats screen:
       | https://nopecha.com/statistics_network
       | 
       | Typically, the API is a screen recorder and the CAPTCHA is sent
       | to thousands of workers who essentially mini-remote-desktop in
       | and solve them for about 80 cents/1k CAPTCHAs. Here are some
       | other, similar services: https://0captcha.com/,
       | http://bypasscaptcha.com/, https://deathbycaptcha.com/
       | 
       | I'm surprised these players are still around. They've been
       | operating for nearly 20 years back when I had discovered them.
       | 
       | The entire industry is actually not completely as black hat as
       | you might think. Yes, it's used for spam and botting, but at
       | least at the time a lot of people used it for bulk downloading,
       | which is how I discovered it. Additionally, it does provide work
       | for the poorer parts of the world.
        
         | kijin wrote:
         | If I were to enter the CAPTCHA-breaking business today, I'd
         | probably use one of these services at first to collect a
         | million correct solutions for $800, and then use that dataset
         | to train my AI.
         | 
         | Once the AI is good enough, I can buy a bunch of used GPUs from
         | former ethereum miners, throw them in a cheap DC somewhere, and
         | undercut everyone else! Sounds like a decent side project that
         | could yield a bit of passive income. Somebody else has probably
         | done it already. Maybe OP is that somebody.
        
           | nerdponx wrote:
           | This works fine until Google changes the image set. Of course
           | then you can pay another $800, but then your product doesn't
           | work until you update.
        
             | jdironman wrote:
             | I wonder if stable diffusion / dall-e type offerings could
             | procedurally generate images?
        
             | slothsarecool wrote:
             | This is what hCaptcha is currently doing, they are
             | switching the image category every 24-72 hours. How useful
             | is it? Not very. Modern ML models such as mobilenet, resnet
             | or yolo require only a few hundred images to be accurate
             | enough to solve those captchas.
             | 
             | You don't need a few million samples; with 500-700 images per
             | category you are more than ready to solve current captchas.
        
               | kijin wrote:
               | Yep, the cost of keeping the model up to date would be
               | negligible compared to the hosting bill.
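The arithmetic behind this sub-thread can be checked quickly. The sketch below uses only the figures quoted in the thread (about 80 cents per 1,000 human solves, a million-sample dataset, and 500-700 images per category); the function name `labelling_cost` is made up for the example, and real prices will differ.

```python
HUMAN_RATE_PER_1K = 0.80  # dollars per 1,000 human-solved CAPTCHAs, as quoted in the thread


def labelling_cost(num_images: int, rate_per_1k: float = HUMAN_RATE_PER_1K) -> float:
    """Dollar cost of having human solvers label `num_images` CAPTCHA images."""
    return num_images * rate_per_1k / 1000


# The one-off dataset mentioned upthread: a million labelled solutions.
full_dataset = labelling_cost(1_000_000)

# Refreshing a single image category with 500-700 labelled images.
per_category_low = labelling_cost(500)
per_category_high = labelling_cost(700)

print(f"full dataset: ${full_dataset:,.2f}")
print(f"per category refresh: ${per_category_low:.2f} to ${per_category_high:.2f}")
```

Under these assumptions a category refresh costs well under a dollar, which is consistent with the point that retraining data is cheap next to hosting.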
        
         | EMIRELADERO wrote:
         | This does seem to use AI or at least not use the "human
         | workers" method.
         | 
         | Going to the Google SSO page for their signin flow and clicking
         | on the blue domain name for their app, the Google auth page
         | shows the email of the GCP account that started the auth
         | project, which in this case is jaewany@gmail.com
         | 
         | Looking that up on Google shows that it corresponds to Jaewan
         | Yun.
         | 
         | Looking him up on GitHub gives you his profile which contains
         | some captcha solver extension code for this very website and
         | also many TensorFlow-related things.
         | 
         | His personal website[1] also lists the solver under "My
         | Products"
         | 
         | [1] https://jaewan-yun.com/
        
         | 005 wrote:
         | As someone with experience using services like these, at the
         | price point and solve speed they're offering it's quite clear
         | that it is a model. Legacy players using low-paid humans
         | usually had solve times >20 seconds, and model-based solvers
         | are now down to under a second.
        
       | slothsarecool wrote:
       | Ever since ML has reached the "general public", developing models
       | against hearing- or vision-based CAPTCHAs has become trivial.
       | 
       | Sure, you have to emulate or simulate the client JS challenges
       | but when bots are running browsers in the background you can only
       | do so much.
       | 
       | I wonder what the future of captchas, if any, will look like.
        
         | judge2020 wrote:
         | It's identity, which is why Google shows the "Your computer or
         | network may be sending automated queries" message on recaptcha
         | if you trigger too many heuristic and IP reputation signals to
         | be classified as a bot. That's why, for Google, you get to
         | carry around your reputation in the form of your Google
         | Account, and for Cloudflare, they have private access tokens[0]
         | (which might be the only reason you don't get blocked by every
         | CF site on iCloud Private Relay), and otherwise Cloudflare's
         | big ambition is "human attestation" via WebAuthn
         | credentials[1,2].
         | 
         | 0: https://blog.cloudflare.com/eliminating-captchas-on-
         | iphones-...
         | 
         | 1: https://cloudflarechallenge.com/
         | 
         | 2: https://blog.cloudflare.com/introducing-cryptographic-
         | attest...
        
           | ajsnigrutin wrote:
           | ...which really sucks when you try to use any of those sites
           | via tor (no cookies, "bad" IP) or at a place with a shared
           | external IP (public access points).
           | 
           | Open google.. captcha... every page has a 5 second cloudflare
           | page before opening the page itself.
           | 
           | Bots have the time, they can wait and do other stuff in the
           | meantime, but we, humans get bothered by that.
        
           | slothsarecool wrote:
           | However, that's not a solution but a patch.
           | 
           | Google accounts give you a good score and tend to deliver
           | easy captchas while dealing with Recaptcha; however, for this
           | reason, google accounts are being sold and bought constantly.
           | 
           | People have tried similar fight tactics in the past. SMS and
           | phone verification have failed because the return on
           | investment is far greater than the price barrier it adds to
           | get any of those "virtual identities".
           | 
           | iPhones might work but then, for how long? If you guarantee
           | that an iPhone won't get captchas, it's a good investment to
           | buy many old (or new) ones and sell token access to skip any
           | captcha.
           | 
           | Many farms already have thousands of phones scrolling through
           | youtube videos to get views, likes, and other stats for
           | videos/channels.
           | 
           | The same "logic" applies to yubikeys and similar auth
           | hardware; attackers can exploit it similarly.
           | 
           | Companies will tell you that they have abuse policies and
           | actively fight abuse/bot farms, but again, they are not
           | solving the problem but patching it with tape.
           | 
           | ReCAPTCHA was very useful for a while, it did genuinely stop
           | bots reasonably well, but none of the "newer" versions seem
           | as efficient as the older versions used to be. Progress
           | stopped after V2.
        
       | throwup wrote:
       | Very cool, thanks for submitting this. I use Buster[1] but I've
       | always been annoyed it doesn't support hCaptcha (used by
       | Cloudflare). I'm excited to try this out!
       | 
       | [1]: https://addons.mozilla.org/en-US/firefox/addon/buster-
       | captch...
        
         | Acen wrote:
         | Cloudflare have their own technology that they're using pretty
         | heavily now, Turnstile.
         | 
         | https://www.cloudflare.com/products/turnstile/
         | 
         | Don't get any goofy puzzles which is nice.
        
       ___________________________________________________________________
       (page generated 2022-11-28 05:00 UTC)