[HN Gopher] Analyzing Analytics (Featuring: The FBI) ___________________________________________________________________ Analyzing Analytics (Featuring: The FBI) Author : benryon Score : 234 points Date : 2020-04-24 16:46 UTC (6 hours ago) (HTM) web link (exploits.run) (TXT) w3m dump (exploits.run) | LyndsySimon wrote: | The big takeaway from this article for me is that I should | probably look for or write a browser extension that tracks | changes to analytics tools and IDs on sites. If a site is | silently taken over, the state actor would either need to | separately gain access to the analytics tool accounts, or would | need to modify the IDs to connect to a new account. I'd love to | see how often tracking IDs change on high-profile sites. | jcrawfordor wrote: | Looking at the Siberian husky site... stdLauncher.js is part of | Verint ForeSee, one of those "would you like to take a survey | about our website" solutions. The AAM analytics code right above | the survey and urchin code lists as its domain an IP associated with | Sungard AS, an outfit that holds a number of federal contracts | for IT services. This IP, 209.235.0.153, hosted the FBI website | at some point in time. It's oddly easy to figure this out, even | without something like a DomainTools subscription, because there | are a lot of people scraping and archiving the FBI most wanted | pages due to their cultural significance. | | Some searching on code samples shows that the AAM section of | analytics code is an exact match for analytics code served up by | an older version of the FBI's most wanted website. It was likely | also used on older versions of other FBI websites. | | In the end I find it unlikely that this website has anything to | do with the FBI, and more likely that the website owner | copy-pasted a large section of source code and accidentally ended up | with this result.
| | One bit of commonality I've noticed is that a lot of websites | with the FBI tracking code were all built with FrontPage. I'm not | sure if this is causal or coincidental, but perhaps it | contributes to this that FrontPage allows you to open a webpage | that you saved from IE and edit it... which might lead to some | websites being complete duplicates of FBI websites, except for | visible content, simply because websites like the FBI most wanted | were relatively prominent parts of the early internet. | | Edit: I spent a little time riding the WayBackMachine to some of | the other webpages when they were apparently using FBI analytics | code. The results are odd but they're so inconsistent that it's | hard to think it was at all intentional. One interesting finding | is that both ohthx.com and ppc-guy.com, at the time they | supposedly had the FBI analytics ID, were apparently hosting an | analytics package called Prosper202 that redirected the | WayBackMachine crawler from the login page to fbi.gov. I have a | suspicion that this was a partially-joking way to deter crawling | of the admin interface of the software. The record that they used | the FBI analytics code is presumably just an artifact of the | crawler following the redirect. It seems that this exact | Prosper202 behavior results in the majority of the old hits. | A4ET8a8uTh0 wrote: | That is a fascinating read. It sounds like it is also prudent to | use separate analytics IDs on your websites if you choose to go | that route. | maerF0x0 wrote: | In his book "Permanent Record" Edward Snowden[1] describes fake | websites used by government agencies to disguise internet traffic | that is actually used for spycraft. | | e.g.: maybe a website about Siberian huskies actually has a hidden | login or hosts another service when contacted on port 80/443 in | just the right way?
| | Now, that would make more sense for the CIA than the FBI, but I | think it illustrates another avenue of interpretation. | | [1]: https://www.goodreads.com/book/show/46223297-permanent-recor... | poyu wrote: | That doesn't make sense, why would they even let people know | that there's a connection? The hidden login part may be true, | but just not on sites that are so obviously related. It could | be a smokescreen of some kind though. | maerF0x0 wrote: | I agree that having an FBI Google Analytics ID would be a gaffe | LyndsySimon wrote: | Interesting. I've got his book on my reading list, but haven't | gotten to it yet. | | I just made a tenuous mental connection between this concept | and a Reddit phenomenon called "Lake City quiet pills". I heard | about it on the podcast "Stuff They Don't Want You To Know" and | it held my interest for a few hours' worth of investigation. | | The short version is that a Redditor died. He was a | stereotypical grumpy old dude, and someone hopped on Reddit and | posted that he'd passed. Someone got interested and tied that | poster to some websites, one of which had a bunch of stuff | hidden in the public source. It definitely seems like a | clandestine group of some kind communicating, but who it was | and to what end isn't clear. The Reddit conspiracist belief | seems to be that it was a group of assassins-for-hire. | | Podcast: https://www.iheart.com/podcast/182-stuff-they-dont-want-you-... Subreddit: | https://www.reddit.com/r/LakeCityQuietPills/ | joering2 wrote: | "Pet scam" is a big business [1] | | In this example it's quite possible the FBI set traps to get a | better understanding of which third parties are involved, who is | visiting the site, and probably some admin management page | behind it. Sort of like getting the contacts of a criminal and going | from there.
| | [1] https://www.ipata.org/current-pet-scams | galacticaactual wrote: | Pointing back to a government domain is not how nation state | monitoring infrastructure is set up. | [deleted] | save_ferris wrote: | Sure, this isn't a comprehensive strategy, but you'd be amazed | at how far behind some of those agencies are in terms of | day-to-day operations for investigations. | | A relative of mine works at the FBI and several years back he told | me a story about how an investigation into an organized crime | syndicate was blown up because an agent on the case was dumb | enough to check out the target's LinkedIn profile while he was | logged into his own real account. So the target got a | notification that Joe Blow from the FBI had just viewed his | profile. Over a year of work down the drain with a single GET | request, crazy. | galacticaactual wrote: | My issue is the confidence with which the author presupposes | that the existence of this code on sites indicates seizure or | utilization in an investigation. It is a lazy position that | leaves others (i.e. HN readers in this thread) with a little | more intellectual horsepower to evaluate the other - and | frankly more realistic - alternatives. | not_a_moth wrote: | What are the more realistic alternatives? | galacticaactual wrote: | Please refer to the (current) top comment. | mimi89999 wrote: | Please see my comment: | https://news.ycombinator.com/item?id=22970996 | mimi89999 wrote: | Or maybe they just stole code from the FBI website to have a feature | and pulled way more code than required without even knowing what | it does. | mimi89999 wrote: | A coworker sysadmin once told me that when he was inspecting | the web server access logs (for an unrelated reason) he noticed | that many requests to a resource on our website had a strange | referer URL that was never present in requests to pages. He | inspected that site and found that they were using our | resource.
We didn't really care about it, but that was really | interesting. | | Maybe it's the same with these sites? | three_seagrass wrote: | This technique was recently done by some redditors to uncover | that the multi-state COVID reopen protest is being pushed by some | guy who uses an antique shop in FL as a front for his shell LLCs. | | They are the websites that are being used on the facebook pages | that are primarily pushing 'reopen' content, and the GA accounts | on those pages links them to a bunch of pro-firearm shell corps | as well. | | Here's the thread. It got deleted since it was deemed as doxxing | (a reddit no-no) even though Whois data is public: | | http://removeddit.com/r/maryland/comments/g3niq3/i_simply_ca... | maxchehab wrote: | Krebs also mentioned this in his recent post | https://krebsonsecurity.com/2020/04/whos-behind-the-reopen-d... | | A very interesting way to associate the same site owners! | LegitShady wrote: | Look at the updates on that post, nothing is so clear cut. | The problem with internet sleuthing is that everyone gets | very excited and innocent people can be injured in the | unnecessary witch-hunt. | | >Update, April 21, 6:40 a.m. ET: Mother Jones has published a | compelling interview with Mr. Murphy, who says he registered | thousands of dollars worth of "reopen" and "liberate" domains | to keep them out of the hands of people trying to organize | protests. KrebsOnSecurity has not be able to validate this | report, but it's a fascinating twist to this tale: How an | 'Old Hippie' Got Accused of Astroturfing the Right-Wing | Campaign to Reopen the Economy | | Update, April 22, 1:52 p.m. ET: Mr. Murphy told | Jacksonville.com he did not register reopenmn.com or | reopenpa.com, contrary to data in the spreadsheet linked | above. I looked up each of the records in that spreadsheet | manually, but did have some help from another source in | compiling and sorting the information. 
It is possible the | registration data for those domains got transposed with | reopenmd.com and reopenva.com, which included Mr. Murphy's | information prior to being redacted by the domain registrar. | panda-giddiness wrote: | Right, and this is exactly why reddit bans doxxing. The | original reddit poster was correct that there was a single | individual buying most of these domains; however, other | than purchasing the domains, there was no evidence that the | individual was using those domains to promote protests. | Let's not forget reddit's Boston bombing debacle[1]. | | [1] https://en.wikipedia.org/wiki/Sunil_Tripathi#Misidentificati... | whycombagator wrote: | Sadly I get "Could not connect to Reddit" when visiting that | link | gk1 wrote: | Worth noting that, since the Analytics ID is publicly | visible, anyone can load Google Analytics on their own site using | that ID. No FBI connection required. | | This is called Analytics hi-jacking and it was once (still is) a | common spam technique: Create site buy-my-stuff.net, load a bunch | of hijacked analytics scripts there, and then the owners of those | accounts will see "buy-my-stuff.net" in their analytics reports. | | Edit: As commenter lmkg reminded me, you don't even need to make | a site, just use the API to make pageview calls. | enlyth wrote: | Is it not possible to whitelist your own domains in Google | Analytics? Forgive my ignorance, I don't use it at all. | lmkg wrote: | You don't need to host a site. The data format to send data | into Google Analytics is an open API (called the Measurement | Protocol). You can just ping Google's servers directly with | the appropriate payload, which includes crafted URL | parameters. | allset_ wrote: | Any info about what domain is being visited would be client | side, which could be easily changed. | codezero wrote: | The actual google analytics account has a setting admins can | control to only allow data from specific domains though this | can be faked.
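A rough sketch of why the hostname filter mentioned in this subthread is a weak control: the filter can only look at the hostname field that the client itself put in the hit. The parameter names (`v`, `tid`, `cid`, `t`, `dh`, `dp`) follow the Universal Analytics Measurement Protocol as I understand it; the allowlist, the `hostname_allowed` helper, and the `UA-XXXXX-Y` placeholder are assumptions made up for illustration, and nothing here is sent to any server.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical allowlist an account admin might configure.
ALLOWED_HOSTS = {"www.example.com"}


def hostname_allowed(payload):
    """Return True if the hit's self-reported hostname ("dh") is allowlisted."""
    fields = parse_qs(payload)
    host = fields.get("dh", [""])[0]
    return host in ALLOWED_HOSTS


def build_hit(hostname, path):
    """Build a pageview payload string; the hostname is whatever the caller says."""
    return urlencode({
        "v": "1",             # protocol version
        "tid": "UA-XXXXX-Y",  # placeholder tracking ID, not a real property
        "cid": "555",         # arbitrary client ID
        "t": "pageview",
        "dh": hostname,       # document hostname, supplied by the client
        "dp": path,           # document path
    })


if __name__ == "__main__":
    print(hostname_allowed(build_hit("www.example.com", "/home")))  # allowlisted host
    print(hostname_allowed(build_hit("buy-my-stuff.net", "/")))     # filtered out
```

The filter drops hits that report an unexpected hostname, but a sender who copies the expected hostname into `dh` is indistinguishable from a real visitor, which matches the point that the domain information is client side.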
| | Also, usually these IDs are copied when someone clones a | website they want to steal the design of but they don't | bother updating the style or JS. | [deleted] | vmception wrote: | Now this is the Hacker News I want to see. Just a mere | observation using known meta-analytics with entertaining | implications. | [deleted] | londons_explore wrote: | Google analytics ID's are tied to the account that created them. | | Presumably the FBI doesn't all share just one massive | "fbi@gmail.com" email address. | | Even if a bunch of FBI employees decided foolishly to use google | analytics on their honeypot sites, one would expect them to all | separately sign up using different google accounts - either using | their real email addresses, or hopefully throwaway ones. | lmkg wrote: | I think you're confusing _Google_ accounts (email addresses) | with _Google Analytics_ accounts (tracking ID prefixes). A | single user can create dozens of GA accounts. | Ozzie_osman wrote: | Google analytics used to be called Urchin (they bought Urchin and | made it Analytics). So all the urchin.js code is probably just | really old Google analytics tracking code. | elbac wrote: | The original Urchin was used for log analysis | https://en.wikipedia.org/wiki/Urchin_(software). Which might | explain a 'self-hosted' version of the software as well. | vmception wrote: | The article says all three fbi.js files were on waybackmachine. I | was only able to download urchin and the other ones are not | there. Anyone have a mirror? Besides the author? pastebin or mega | jcrawfordor wrote: | All three are from commodity commercial software, finding other | websites of the same period that used Urchin/GA and ForeSee | should get you more or less the same files. | danso wrote: | On a related note, I wonder if there are/were common patterns in | the sting sites set up by Dept. of Homeland Security, such as U | of Northern New Jersey [0] and U of Farmington [1]. 
Both of those | were initiated during the Obama administration and featured | fairly nice modern designs, similar in aesthetic to much of the | Obama-era digital overhauls (though a quick skim shows that they | don't share similar CSS naming semantics). | | [0] https://www.nytimes.com/2016/05/06/nyregion/students-at- | fake... | | https://web.archive.org/web/20160327093120/http://unnj.edu/ | | [1] | https://www.freep.com/story/news/local/michigan/2019/11/27/i... | | https://web.archive.org/web/20180414235355/http://university... | | http://archive.is/qLrUi | jcrawfordor wrote: | On brief review, one (UNNJ) is running WordPress while the | other (Farmington) doesn't show any evidence of a dynamic CMS. | That suggests to me totally separate provenance. My guess would | be that two different contracts were awarded to two different | companies to build the websites, which would both be consistent | with common federal contracting behavior and a good idea from | an OPSEC perspective since it would minimize any similarity in | these "sting" websites. | bhartzer wrote: | The Google Analytics 'trick' (to identify all the sites someone | owns) has been around for quite a while. All you have to do is | use a code search engine like publicwww to search for the snippet | of code or the analytics ID. | | It's not just the Google Analytics ID or GTM Id, you can also use | the Adsense pub-id or just about anything else that you might | think sites have in common. When you start to also look at | backlinks and IP neighborhoods, things can get interesting, as | well. ___________________________________________________________________ (page generated 2020-04-24 23:00 UTC)
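A minimal sketch of the ID-matching trick described in the last comment, plus the "did the tracking ID change?" check suggested at the top of the thread. It assumes classic `UA-<account>-<property>` tracking IDs and 16-digit AdSense `pub-` IDs; the digit ranges in the patterns, the function names, and the sample hostnames are all assumptions for illustration, and the page sources are expected to have been fetched separately.

```python
import re
from collections import defaultdict

# Assumed shapes: UA-<4 to 10 digits>-<1 to 4 digits> and pub-<16 digits>.
UA_ID = re.compile(r"\bUA-\d{4,10}-\d{1,4}\b")
PUB_ID = re.compile(r"\bpub-\d{16}\b")


def extract_ids(html):
    """Return the set of analytics/ad identifiers found in a page's source."""
    return set(UA_ID.findall(html)) | set(PUB_ID.findall(html))


def group_by_shared_id(pages):
    """Map each identifier seen on two or more hosts to those hosts.

    `pages` is a dict of hostname -> HTML source.
    """
    seen = defaultdict(set)
    for host, html in pages.items():
        for ident in extract_ids(html):
            seen[ident].add(host)
    return {ident: sorted(hosts) for ident, hosts in seen.items() if len(hosts) > 1}


def changed_ids(old_html, new_html):
    """Compare two snapshots of one page and report identifier changes."""
    old, new = extract_ids(old_html), extract_ids(new_html)
    return {"removed": sorted(old - new), "added": sorted(new - old)}


if __name__ == "__main__":
    pages = {
        "a.test": "<script>ga('create', 'UA-1234567-1');</script>",
        "b.test": '_uacct = "UA-1234567-1";',
        "c.test": "<script>ga('create', 'UA-7654321-2');</script>",
    }
    print(group_by_shared_id(pages))
    print(changed_ids(pages["a.test"], pages["c.test"]))
```

As several commenters point out, a shared ID is only a hint: IDs get copied along with cloned page source and can be reused by anyone, so a match does not establish common ownership.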