[HN Gopher] Block web scanners with ipset and iptables
       ___________________________________________________________________
        
       Block web scanners with ipset and iptables
        
       Author : yabones
       Score  : 29 points
       Date   : 2022-11-10 20:40 UTC (2 hours ago)
        
 (HTM) web link (nbailey.ca)
 (TXT) w3m dump (nbailey.ca)
        
       | creeble wrote:
       | I understand the first part -- sending requests with no host
       | header to a spam log (or even better, don't log).
       | 
       | What I don't understand is the second part -- blocking those
       | hosts. Seems pointless now that you've de-noised your logs.
       | They're still sending packets. Saves thousands of bytes on
       | outbound?
       | 
       | What about all the scan-spam on sites WITH host headers? Whatevs.
        
         | yabones wrote:
         | Serves a few purposes, but as you said the main objective is
         | already done by de-noising. The other reason I do this is
         | because it's easy to detect that kind of scanning in HTTP logs,
         | but not as easy for other services (ssh, ftp, smtpd, etc)
         | without something like fail2ban, and the blanket ban applies to
         | all of them. So, if a bot scans your HTTP server enough times,
         | they can't go after "softer" targets later.
         | 
         | For scan-spam that does hit your "real" site, it's a bit more
         | tricky as there absolutely will be false positives. You can
         | grep for all 401/403's and add them to the list, but that will
         | sooner or later hit a real user. So it's much more specific to
         | the application you're hosting, where this works for just about
         | any site. The other nice thing is that even when they scan your
         | "real" site, they'll often hit the default host via IP scans at
         | the same time, so you can still manage to ban them.
         | 
         | It's not perfect, but it's good enough :)
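
A minimal sketch of the log-to-blocklist step discussed above, assuming the default vhost writes its own access log with the client address as the first field (as in the common combined log format). The function names, the `min_hits` threshold, and the set name `scanners` are hypothetical and do not come from the post.

```python
import ipaddress
from collections import Counter


def scanner_ips(log_lines, min_hits=3):
    """Return source IPs seen at least min_hits times in a default-vhost log.

    Assumes each line starts with the client address, as in the common
    NGINX/Apache combined log format.
    """
    hits = Counter()
    for line in log_lines:
        fields = line.split()
        if not fields:
            continue
        try:
            ip = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # skip lines that do not start with an IP address
        hits[str(ip)] += 1
    return sorted(ip for ip, count in hits.items() if count >= min_hits)


def ipset_restore_lines(ips, set_name="scanners"):
    """Format addresses as 'add' lines in the style read by `ipset restore`.

    The set name is a placeholder; the set itself must already exist.
    """
    return [f"add {set_name} {ip}" for ip in ips]
```

Requiring several hits before banning is one way to limit false positives; a threshold of 1 would ban on the first request to the default host.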
        
           | dpifke wrote:
           | Running your own mail server is a _great_ source of data to
           | identify botnet-compromised hosts.
           | 
           | When I started banning IPs that send "HELO <myhostname>" for
           | 24 hours, I cut the number of fake login/registration
           | attempts on a bunch of my web-based projects by ~50%.
           | 
           | It works the other way, too. Temporary bans on hosts that try
           | to access /wp-admin (I don't run Wordpress anywhere) cut my
           | email spam significantly.
           | 
           | (Some day, I'll get around to implementing a real reputation
           | tracking system, with exponential ban lengths.)
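
The "exponential ban lengths" idea could look roughly like the sketch below: the first offense earns the 24-hour ban mentioned above, and each repeat doubles it. The class name, the 30-day cap, and the in-memory dictionaries are assumptions for illustration, not a description of any existing system.

```python
from dataclasses import dataclass, field


@dataclass
class BanTracker:
    base_seconds: int = 24 * 60 * 60        # first ban: 24 hours
    max_seconds: int = 30 * 24 * 60 * 60    # hypothetical cap of 30 days
    offenses: dict = field(default_factory=dict)
    banned_until: dict = field(default_factory=dict)

    def record_offense(self, ip, now):
        """Register an offense at time `now` (in seconds); return the ban length."""
        count = self.offenses.get(ip, 0) + 1
        self.offenses[ip] = count
        # Double the ban for every repeat offense, up to the cap.
        duration = min(self.base_seconds * 2 ** (count - 1), self.max_seconds)
        self.banned_until[ip] = now + duration
        return duration

    def is_banned(self, ip, now):
        """True while the most recent ban for `ip` has not yet expired."""
        return now < self.banned_until.get(ip, 0)
```

Passing `now` in explicitly (for example from `time.time()`) keeps the tracker easy to test; a real deployment would also need to persist the counts and expire old ones.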
        
             | bombcar wrote:
             | This is an important aspect of it - you can use information
             | on one angle of attack to protect other devices.
             | 
             | Do note that doing this kind of thing can also block people
             | on Tor, because Tor is quite often used for attacks.
        
       | holoduke wrote:
       | Why not use the way-easier-to-configure i(f)tables? It's so
       | much more straightforward and flexible.
        
       | hoppla wrote:
       | Another neat trick is to add a link in robots.txt and instruct
       | bots to stay away from it. If they don't, you add them to your
       | blocklist.
        
         | justin_oaks wrote:
         | I was confused about what you were saying at first. For those
         | who may also be confused:
         | 
         | You can add something like this to your robots.txt:
         | User-agent: *
         | Disallow: /some/unguessible/url
         | 
         | And then you ban any IPs/bots that visit that URL.
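
One way to find the offenders for this trap is to scan the access log for requests to the disallowed path. The sketch below assumes combined-log-style lines, with the client address first and the request line inside the first pair of double quotes. The function name is hypothetical, and the default path is the placeholder from the robots.txt example above.

```python
def trap_hits(log_lines, trap_path="/some/unguessible/url"):
    """Return the set of client addresses that requested the trap path.

    Assumes combined-log-style lines: client address first, and the request
    line ("METHOD PATH PROTOCOL") inside the first pair of double quotes.
    """
    offenders = set()
    for line in log_lines:
        parts = line.split('"')
        if len(parts) < 3:
            continue  # no quoted request line on this line
        request = parts[1].split()
        client = parts[0].split()
        if len(request) < 2 or not client:
            continue
        path = request[1].split("?", 1)[0]  # ignore any query string
        if path == trap_path:
            offenders.add(client[0])
    return offenders
```

The returned addresses can then be fed to whatever blocklist mechanism is in use.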
        
       | wooptoo wrote:
       | Actually you don't need to respond to bogus http clients at all:
       | https://gist.github.com/radupotop/2aef0bdc0ccbd3a706044e3598...
        
       | justin_oaks wrote:
       | I wonder why the author uses a 404 error response. I usually
       | configure NGINX with "return 444;" which closes the connection
       | without response. Scanners don't deserve a response. I may have
       | wasted bytes receiving the request, but I won't waste any more
       | once I know the request is garbage.
        
         | yabones wrote:
         | That was mostly just for the blog post. In reality my default
         | vhost 301's back to the IP that sent the request. I doubt it
         | ever does anything, but I like to think it makes hackers attack
         | themselves in the confusion :p
         | 
         | I also have a fake /admin path that just contains a bunch of
         | offensive/illegal phrases in 10-ish languages, but it was out
         | of character for the post.
         | 
         | 444 is a good idea though, I didn't know about that response
         | code!
        
       | alyandon wrote:
       | On my internet facing hosts, I use the firehol level 2 and level
       | 3 block sets along with blocking all CN IP space that I can
       | accurately identify. My logs are eerily quiet.
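
As a rough illustration of what matching against a block set involves, the sketch below checks an address against a list of networks using Python's standard `ipaddress` module. It assumes the list has one CIDR range or single address per line, with `#` starting a comment; the function names and that file layout are assumptions, and on a real host this matching would normally be done in the kernel by ipset rather than in Python.

```python
import ipaddress


def load_networks(lines):
    """Parse one network or address per line, skipping blanks and '#' comments."""
    networks = []
    for line in lines:
        entry = line.split("#", 1)[0].strip()
        if entry:
            # strict=False tolerates entries with host bits set, e.g. 10.0.0.1/8
            networks.append(ipaddress.ip_network(entry, strict=False))
    return networks


def is_blocked(ip, networks):
    """True if `ip` falls inside any of the loaded networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)
```

A linear scan like this is fine for a quick offline check of log entries, but it gets slow with very large lists.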
        
       | thrwawy74 wrote:
       | https://wiki.nftables.org/wiki-nftables/index.php/Moving_fro...
        
       ___________________________________________________________________
       (page generated 2022-11-10 23:00 UTC)