* * * * *

Yet more observations about the MJ12Bot

I received a reply [1] about MJ12Bot [2]! Let's see …

> From: Majestic
> To: Sean Conner
> Subject: [Majestic] Re: Your robot is making bogus requests to my webserver
> Date: Thu, 11 Jul 2019 08:34:13 +0000
>
> ##- Please type your reply above this line -##
>

Oh … really? Sigh.

Anyway, the only questionable bit in the email was this line:

> The prefix // in a link of course refers to the same site as the current
> page, over the same protocol, so this is why these URLs (Uniform Resource
> Locators) are being requested back from your server.
>

which is … somewhat correct. It does mean “use the same protocol,” but the double slash denotes a “network path reference” (RFC (Request For Comments) 3986 [3], section 4.2) where, at a minimum, a hostname is required. In other words, whatever follows the double slash names a host; it does not mean “the same site as the current page.” If this is just a misunderstanding on the developers' part, it could explain the behavior I'm seeing.

And speaking of behavior, I decided to check the logs (again, using last month) one last time for two reports.
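
As a quick aside on the network path reference point, here is a minimal sketch of how a standards-following resolver treats the two kinds of links. It uses Python's `urllib.parse.urljoin`; the page URL, the hostnames, and the `resolve` helper are all made up for illustration and have nothing to do with how MJ12Bot itself is written.

```python
from urllib.parse import urljoin, urlsplit

# Hypothetical page URL, used only for illustration.
BASE = "https://www.example.com/blog/2019/07/post.html"

def resolve(ref, base=BASE):
    """Resolve a link reference against the page it appears on."""
    return urljoin(base, ref)

# Network path reference: the scheme is inherited from the page,
# but the text right after "//" is read as a hostname.
network = resolve("//cdn.example.net/img/logo.png")

# Absolute path reference (single slash): both the scheme and the
# host are inherited from the page.
absolute = resolve("/img/logo.png")

for ref, url in (("//cdn.example.net/img/logo.png", network),
                 ("/img/logo.png", absolute)):
    parts = urlsplit(url)
    print(f"{ref} -> scheme={parts.scheme} host={parts.netloc} path={parts.path}")
```

Only the single-slash form stays on the same site. A crawler that treats the double-slash form the same way would send requests for paths that never existed back to the originating server, which would be consistent with a pile of 404s, though that is a guess about the cause and not something the email confirms.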

Table: User Agents, sorted by most requests, for June 2019

404 (not found)  200 (okay)  Total requests  User agent
------------------------------------------------------------------------------
            170       42676           46334  The Knowledge AI
             21       36088           38097  Mozilla/5.0 (compatible; SemrushBot/3~bl; +http://www.semrush.com/bot.html)
             46       16633           17130  Mozilla/5.0 (compatible; BLEXBot/1.0; +http://webmeup-crawler.com/)
              5       15840           15928  Mozilla/5.0 (compatible; AhrefsBot/6.1; +http://ahrefs.com/robot/)
              3       12304           12353  Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
             36        8412            8929  Mozilla/5.0 (compatible; MegaIndex.ru/2.0; +http://megaindex.com/crawler)
              7        8428            8908  Gigabot
           5680        2015            7872  Mozilla/5.0 (compatible; MJ12bot/v1.4.8; http://mj12bot.com/)
             28        6604            6942  Barkrowler/0.9 (+http://www.exensa.com/crawl)
              0        4705            4737  istellabot/t.1.13

Table: User Agents, sorted by most bad requests (404), for June 2019

404 (not found)  200 (okay)  Total requests  User agent
------------------------------------------------------------------------------
           5680        2015            7872  Mozilla/5.0 (compatible; MJ12bot/v1.4.8; http://mj12bot.com/)
            656         109             768  Mozilla/5.0 (compatible; MJ12bot/v1.4.7; http://mj12bot.com/)
            177          45             553  Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.2)
            170       42676           46334  The Knowledge AI
            120           0             120  Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Win64; x64; Trident/5.0)

(Note: The number of 404s and 200s might not add up to the total; there might be other requests that returned a different status not reported here.)

MJ12Bot is the 8th most active client on my site, yet it has the top two spots for bad requests, beating out #3 by over an order of magnitude (about 35 times as many in fact, with both versions combined). But I don't have to worry about it since the email also stated they removed my site from their crawl list.

Okay … I guess?

[1] gopher://gopher.conman.org/0Phlog:2019/07/10.1
[2] https://mj12bot.com/
[3] https://www.ietf.org/rfc/rfc3986.txt

Email author at sean@conman.org.