[HN Gopher] Fighting TLS Fingerprinting with Node.js
       ___________________________________________________________________
        
       Fighting TLS Fingerprinting with Node.js
        
       Author : pimterry
       Score  : 80 points
       Date   : 2021-12-07 14:13 UTC (8 hours ago)
        
 (HTM) web link (httptoolkit.tech)
 (TXT) w3m dump (httptoolkit.tech)
        
       | mrlucax wrote:
       | Some time ago I tried to scrape some posts from the Cloudflare
       | Blog (https://blog.cloudflare.com/) with Node.js and got a 503.
       | I was hoping I could use those tips from OP, but no luck. Maybe
       | they're using some other type of ID.
        
       | SavantIdiot wrote:
       | That permutation list is too comprehensive. In reality, the
       | ciphersuite that is chosen is far more limited based on current
       | best practices. For example, up until a few years ago, SHA1 was
       | offered as an HMAC primitive, but no one ever used it.
       | 
       | It is far more likely in practice that 3-5 RSA- or ECC-based PKI
       | suites would make up the majority of HTTPS sessions.
       | 
       | EDIT: I found a 2018 survey of 1,000,000 websites and these were
       | the top 10 ciphersuites, far from the quadrillion combos the OP
       | computed.
       | 
       |   ECDHE-RSA-AES256-GCM-SHA384     147985
       |   ECDHE-RSA-AES128-GCM-SHA256     127964
       |   ECDHE-ECDSA-AES128-GCM-SHA256    41043
       |   ECDHE-RSA-AES256-SHA384          15400
       |   DHE-RSA-AES256-GCM-SHA384         4326
       |   ECDHE-RSA-AES256-SHA              3231
       |   DHE-RSA-AES256-SHA                2484
       |   0000                              2194
       |   AES256-SHA                        2113
       |   AES128-SHA                        1855
       | 
       | https://scotthelme.co.uk/alexa-top-1-million-analysis-februa...
        
         | pimterry wrote:
         | The cipher list here is the full list _offered_ by the client;
         | this isn't about the final single cipher that's selected and
         | used.
         | 
         | Yes, most aren't used for many real sessions, but they're all
         | still supported and available for fingerprinting. The selected
         | cipher isn't relevant.
         | 
         | The quadrillion total, meanwhile, is the total number of
         | _permutations_, not the number of ciphers.
         | 
         | In local testing, every single client (e.g. curl, Firefox,
         | Chrome) I've seen sends between 15 and 35 options in every
         | client hello, which already gives you a huge number of possible
         | permutations, though you couldn't really reorder them
         | completely freely without real security implications.
         | 
         | As in the article, the extensions are more interesting for
         | increasing permutations though, since order there really is
         | totally arbitrary.
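         | 
         | To put rough numbers on the 15 to 35 range, here is a minimal
         | Python sketch (an illustration, not code from the article).
         | It counts the orderings of n distinct items, which is n!. The
         | helper name orderings is made up for this example, and the
         | count ignores the caveat that not every cipher ordering is
         | safe to use:

```python
import math


def orderings(n: int) -> int:
    """Number of distinct ways to order n distinct items (n factorial)."""
    return math.factorial(n)


# The range of cipher-list lengths mentioned above.
for n in (15, 35):
    print(f"{n} options -> {orderings(n):.3e} possible orderings")
```

         | Even the 15-option case gives roughly 1.3 trillion orderings,
         | so the space is large well before extensions are considered.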
        
       | 0xbkt wrote:
       | Wow, that's interesting. Does a network stack also have a
       | characteristic that would factor in the fingerprinting of a
       | client? TTL, maybe? And hardware (e.g. network card) too.
        
         | jeroenhd wrote:
         | There are several factors that can be fingerprinted passively.
         | Timings, optional flags across the protocol stack, you name it.
         | Detecting specific versions of an OS can be more difficult,
         | especially passively, but you should be able to do it if you
         | have enough data.
         | 
         | A particularly useful anti-bot feature is to fingerprint TLS
         | connections by things like cipher order and available
         | signatures (if you're willing to switch back to TLS 1.2). This
         | way, you can easily detect the difference between browsers and
         | bots, even if all the request headers match. If you're fine
         | with blocking people behind middleboxes or proxies then it can
         | be a simple yet useful fingerprinting technique.
        
         | 0x0 wrote:
         | Yeah nmap can often be used to fingerprint OS versions for
         | those reasons.
         | 
         | Way back around Y2K, many network stacks had poor or no
         | randomization of some parameters, which produced some
         | neat-looking plots: https://lcamtuf.coredump.cx/newtcp/
        
           | ape4 wrote:
           | Not an expert... are the ciphers in order of preference? So
           | they can't really be randomized, I suppose.
        
             | IggleSniggle wrote:
             | The ciphers are in order of preference. The blog post
             | hinted at this (I'm not sure why it didn't make it the
             | primary suggestion), but you may be able to get away with
             | simply duplicating a cipher.
             | 
             | After reading the post, that's what I'm going to play with
             | next time I run into this. I suspect duplicating will be
             | fine in almost all environments, and by picking one way, way
             | down in the preference list to duplicate, I suspect you can
             | increase the chances that there's nothing in the chain that
             | will even have an opportunity to error based on the
             | duplicated cipher preference. I tried this against the
             | website quoted in the article and it worked without
             | problem.
        
               | pimterry wrote:
               | I'm the author - I didn't pick duplicates as the main
               | suggestion partly because it's not clear if all TLS
               | servers will accept that, but also because it's trivial
               | for TLS fingerprinters to detect and defeat that
               | automatically if they choose to. They just need to always
               | strip subsequent duplicates from the cipher list before
               | hashing, and the whole technique becomes useless.
               | 
               | I doubt that's happening today, but I suspect it will
               | very quickly if duplicate ciphers become commonplace.
               | 
               | Actually reordering ciphers meanwhile definitively
               | changes the hash, so there's nothing they can do in the
               | general case at least, and there's a reasonably large
               | number of acceptable non-duplicate orderings available to
               | choose from.
               | 
               | Duplicating ciphers is a good technique that's definitely
               | worth investigating too, but it has its limitations, and
               | it's not a silver bullet.
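               | 
               | A minimal Python sketch of that argument, under stated
               | assumptions: the cipher list is hypothetical, the
               | function name cipher_fingerprint is made up, and the
               | hash covers only the cipher list. Real fingerprinters
               | such as JA3 hash more fields than this, so treat it as
               | an illustration of the idea rather than any tool's
               | actual algorithm:

```python
import hashlib


def cipher_fingerprint(ciphers, strip_duplicates=False):
    """Hash an offered cipher list, optionally dropping repeats first."""
    if strip_duplicates:
        # dict.fromkeys keeps only the first occurrence of each entry,
        # preserving the original order.
        ciphers = list(dict.fromkeys(ciphers))
    return hashlib.sha256(",".join(ciphers).encode("ascii")).hexdigest()


# Hypothetical offered list, in preference order.
original = [
    "ECDHE-RSA-AES128-GCM-SHA256",
    "ECDHE-RSA-AES256-GCM-SHA384",
    "ECDHE-RSA-AES256-SHA",
    "AES128-SHA",
]
duplicated = original + ["AES128-SHA"]  # repeat a low-preference cipher
reordered = [original[1], original[0]] + original[2:]  # swap the first two

# A naive fingerprinter sees the duplicated list as a different client...
print(cipher_fingerprint(duplicated) != cipher_fingerprint(original))
# ...but one that strips duplicates first gets the original hash back.
print(cipher_fingerprint(duplicated, True) == cipher_fingerprint(original, True))
# Reordering changes the hash whether or not duplicates are stripped.
print(cipher_fingerprint(reordered, True) != cipher_fingerprint(original, True))
```

               | All three checks print True: stripping duplicates undoes
               | the duplication trick, but not a reordering.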
        
         | benmmurphy wrote:
         | it is possible to fingerprint the TCP stack of the client from
         | their first SYN packet (and possibly other packets?)
         | 
         | the only online demos of this i know of are:
         | http://witch.valdikss.org.ru/ and https://bot.incolumitas.com
         | 
         | i believe the WITCH site is based on p0f
         | (https://lcamtuf.coredump.cx/p0f3/)
         | 
         | i know some people have custom patches to the linux kernel to
         | make it look like a different operating system in order to work
         | around bot detection.
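         | 
         | as a rough illustration of one signal such tools can use, here
         | is a minimal Python sketch that guesses a sender's initial TTL
         | from the TTL seen on arrival. the function names and the list
         | of common defaults (32, 64, 128, 255) are assumptions for this
         | example, not p0f's actual logic, and a real fingerprint
         | combines many more fields (window size, TCP options, etc.):

```python
# Commonly seen initial TTL defaults (an assumption for this sketch):
# 64 is typical of Linux/macOS/BSD, 128 of Windows, 255 of some
# network gear; 32 appears on some very old stacks.
COMMON_INITIAL_TTLS = (32, 64, 128, 255)


def guess_initial_ttl(observed_ttl: int) -> int:
    """Round an observed TTL up to the nearest common initial value."""
    if not 1 <= observed_ttl <= 255:
        raise ValueError("observed TTL must be between 1 and 255")
    for initial in COMMON_INITIAL_TTLS:
        if observed_ttl <= initial:
            return initial
    raise AssertionError("unreachable: 255 is the maximum TTL")


def hops_travelled(observed_ttl: int) -> int:
    """Estimate how many routers decremented the TTL on the way."""
    return guess_initial_ttl(observed_ttl) - observed_ttl


# A SYN arriving with TTL 52 most likely started at 64, 12 hops away.
print(guess_initial_ttl(52), hops_travelled(52))
```

         | this is only a heuristic: a packet that crossed a lot of hops
         | can be rounded to the wrong default, and the initial TTL is
         | easy to change on the client.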
        
       ___________________________________________________________________
       (page generated 2021-12-07 23:02 UTC)