[HN Gopher] Google Analytics alternative that protects your data...
___________________________________________________________________
Google Analytics alternative that protects your data and your customers' privacy

Author : doener
Score  : 165 points
Date   : 2023-05-07 08:57 UTC (14 hours ago)

(HTM) web link (matomo.org)
(TXT) w3m dump (matomo.org)

| jacooper wrote:
| Beware, Matomo by default isn't very privacy friendly; you will need the GDPR banner for any advanced features.
|
| If you want GDPR-compliant analytics you have to disable many of its flagship features, or use something else like Plausible, designed to work with no consent.

| viraptor wrote:
| You need a GDPR banner for sharing information with third parties. Why would you need one for self-hosted Matomo?

| jeroenhd wrote:
| You'll need consent for any data collection not essential to your site's functionality, even if you host stuff yourself.
|
| If all you do is record how often your pages are being visited then you're not collecting any PII and you don't need a banner, but if you're tracking visitors based on unique identifiers (cookies, IP addresses, etc.) you'll need to get consent first.

| jacooper wrote:
| It's not only that, it's also about the data collected.
|
| Real-time data like visitor data and heatmaps isn't allowed, and IP tracking isn't allowed either.
|
| This matters because Matomo can be very powerful, more powerful than Google Analytics.
|
| You can let it assign a unique ID to every visitor who visited a subdomain and logged in, so you can know exactly who each visitor is on all of your sites.
|
| https://matomo.org/faq/how-to/how-do-i-configure-matomo-with...

| BaudouinVH wrote:
| https://umami.is/ does the same, free tier available.

| marban wrote:
| C'mon there's like a thousand threads on these already.

| throwaway2056 wrote:
| In all these threads, there is never a project manager from a large establishment saying:
|
| - Thanks.
we will migrate - we did and it was <good/bad>
|
| More or less everyone is after $.

| gerenuk wrote:
| Usermaven.com does the same and covers product insights as well. Free tier is pretty generous (1M events per month).

| prithsr wrote:
| Thanks for sharing this! Just got it set up on one of my domains and very pleased.

| nologic01 wrote:
| Ideally the different self-hosted web stacks would have built-in analytics that would not have to hit the client with JavaScript. But they don't, or if they do, each has its own inconsistent approach to what data is collected and how it is presented. So the second best, if you care about your users' privacy (and, if applicable, your own commercial or institutional privacy), is something like Matomo.

| earth2mars wrote:
| Their usage of the word "On-Premise" instead of "on prem" or "on premises"!!!

| ezekg wrote:
| The terms on-premises, on-premise, and on-prem are synonymous within enterprise lingo.

| TechBro8615 wrote:
| The one that grinds my gears is "bottoms up" instead of "bottom up."

| colesantiago wrote:
| Let's all just stop tracking altogether.
|
| We don't need tracking at all, bloating up and slowing down websites.

| jeroenhd wrote:
| I wonder if someone's made an AdNauseam for tracking libraries yet. Going on the defensive clearly doesn't work; some more offensive action is required.
|
| Send a whole bunch of plausible events, pretending to click every link, changing your identifiers and stuff like resolution every time, making it impossible to determine what data is real and what isn't. Bonus points for leaving websites alone if they don't load tracking scripts until you've consented.
|
| We can't stop trackers, but we can try to make them useless. Even if they filter out such tracking they shouldn't be able to figure out what data was real, making their tracking attempts worthless.

| jhpacker wrote:
| I believe AdNauseam uses EasyList, so if it doesn't include the EasyPrivacy part of that (which contains the trackers) by default it seems like it would be easy to add.
|
| That said, I don't think this is an effective strategy at all. Safari has placed a big giant hole in tracking (like 20% of users) and lots of sites are still proceeding like nothing has changed. Google referrer spam was run at mega-scale, dumping billions (at least) of spam hits into millions of profiles, and didn't affect tracking efforts.
|
| A plugin run by .0001% of users or whatever that adds a bunch of slop to the numbers just makes more analysts pull out their hair rather than leading to change.

| nicbou wrote:
| This would be nice. I don't track users on my personal blog. I don't give a flying duck about what people do there.
|
| However, I make money from running a website which is really useful to a lot of people. I absolutely need to know what works and what doesn't. I can't write and edit in the dark, possibly missing by a mile what my readers really need. It would be like flying a plane without instruments.
|
| For instance, I see that a lot of people use the search to find a single guide, which should definitely be linked on the home page. Without basic tracking, I wouldn't even know which pages are important to my users.
|
| There are many gaping holes in your website that you could be completely blind to without a basic sense of what your users do on your website.
|
| I also caught many illegal copies of my website through referrer tracking. Three of them were phishing websites, and I got them shut down.
|
| So there are many legitimate reasons to have basic traffic counters, and you can have those while respecting your users' privacy and following the spirit and the letter of GDPR.

| ZacnyLos wrote:
| More alternatives: https://european-alternatives.eu/alternative-to/google-analy...

| iLoveOncall wrote:
| Half of the "alternatives" there are dead (website is offline) and the rest are not free. Those are hardly alternatives, more like band-aid solutions.

| paulcole wrote:
| A paid alternative to a free service is still an alternative.
|
| They may not be alternatives you like, but they are alternatives. And for many people a paid option may be better once they start looking at why the free thing is free.

| hobo_mark wrote:
| I'm going through the list at random (I'm in the market for such a service) and none of them appear offline so far.

| iLoveOncall wrote:
| Yeah you're right actually. From my phone most of them appeared offline but from my computer, on the same network, they're all fine.

| graftak wrote:
| This website is off to a bad start when the first item says
|
| > Because it does not use cookies there is no need to show cookie banner for this service.
|
| which is a blatant lie/misinformation. The 'cookie law' has nothing to do with the actual use of cookies.

| jhpacker wrote:
| To defend this site, that is the claim of the vendor, and I wouldn't expect a site that focuses on listing EU alternatives to critically evaluate a claim like that, which hasn't been explicitly nay-sayed by any regulatory agency. Plausible uses a visitor ID based upon a hashed + salted user agent plus IP address, where the salt is rotated daily. Whether consent is required for that is for each implementing site to decide, but I don't think the vendor claim is unreasonable.
|
| A similar (but better, IMHO) site that focuses just on analytics is: https://newmetrics.io/

| algustionesa wrote:
| I have reviewed many Google Analytics replacements in terms of features and capabilities. Matomo may be suitable for you, but its data presentation is not user-friendly. If you only need basic metrics to track, there are many alternatives that present analytics data more clearly.
For more information, see https://algustionesa.com/google-analytics-alternatives/.

| herunan wrote:
| Why is this at the top of HN?

| viraptor wrote:
| I'm guessing because of this:
|
| https://blog.google/products/marketingplatform/analytics/pre...
|
| > All standard Universal Analytics properties will stop processing new hits on July 1, 2023.

| EGreg wrote:
| So what is the main difference between Universal Analytics and Google Analytics 4?
|
| We currently use Google Analytics to understand how users move through our app. We also used Matomo (previously Piwik) 5 years ago.
|
| Now Google Analytics on iOS will stop working for users unless they update our app? It doesn't seem to say anything: https://developers.google.com/analytics/devguides/collection...

| devjab wrote:
| Analytics are widely used in communication departments in European enterprise, and where that previously was very often Google Analytics, it's now hard to use it because of Google's inability/unwillingness to change their enterprise targeting business model to be GDPR compliant. I'm not personally convinced you really need an analytics tool in most European communications departments. As long as saying something like that is akin to heresy, however, I think it's safe to say that a lot of people are interested in alternatives to Google Analytics.
|
| It's likely not just in Europe anymore. Privacy seems to be a trend that is on the increase everywhere. But as I understand it, things move to the top of HN if they are interesting to a lot of people, and privacy is interesting to a lot of people these years. Not just to the "nerds" either; at least I tend to see more and more discussion on it outside of tech circles. In the EU specifically you do have the very real "motivation" of dropping Google Analytics because using it puts you in the lovely area of breaking the law.

| haunter wrote:
| Google bad

| zichy wrote:
| It is really easy to protect everyone's privacy by not using advanced analytics platforms at all.

| jmduke wrote:
| A while back I built out a quick guide comparing all of these alternatives, because the core value prop was pretty similar and it was annoying to compare between pricing plans. (My personal vote goes to Fathom.)
|
| https://buttondown.email/comparison-guides/google-analytics-...

| berkle4455 wrote:
| Stay far away from Fathom. Bro culture bullshit at the worst. Don't believe a word they say.

| skilled wrote:
| Fathom is run by some goofy marketer who has openly slandered (on HN) other analytics products in this space. Sadly, can't support anyone who does that. They're not open-source either.

| 6ak74rfy wrote:
| Last I checked, Fathom's open source product hadn't been updated for a couple of years. So, I switched to Plausible which is more reasonably updated.

| graeme wrote:
| How does Plausible compare to Google's Universal Analytics? And are there any SEO effects?
|
| GA4 migration seems not aimed at general users, so I'm looking at alternatives. Ideally could import my data.

| gcanyon wrote:
| "GA's interface is complex and confusing, especially for basic use cases."
|
| As I said in another comment, it's been eight years since I used that accursed interface, and I'll be ready to try it again once the flashbacks go away.

| jhpacker wrote:
| Matomo is decent, but my main issue with it is the performance when run at any sort of scale. It's PHP/MySQL, which is nice for ease of self-hosting, but it means a lot of things need to be pre-calculated. Most of the newer and more performant GA alternatives out there are using things like ClickHouse.
|
| ClickHouse: Piwik PRO, Plausible, PostHog, Yandex, Cloudflare
|
| Snowflake: Amplitude, Piano, Snowplow
|
| SingleStore: Fathom
|
| I've written a book on the subject including evaluating the 15 most widely used options: https://gaalternatives.guide

| KronisLV wrote:
| > Matomo is decent, but my main issue with it is the performance when run at any sort of scale. It's PHP/MySQL, which is nice for ease of self-hosting, but it means a lot of things need to be pre-calculated.
|
| I've never actually run into performance issues, neither when using it in production professionally, nor for my self-hosted sites (with Matomo always running on-prem). I'd say the performance of PHP and MySQL/MariaDB is most likely decent as long as you don't go too far into specialized workloads, for example log aggregation/tracing; though even some APM solutions like Apache SkyWalking also support using traditional RDBMSes for this purpose: https://skywalking.apache.org/docs/main/v9.0.0/en/setup/back...
|
| That said, I can't help but wonder at what actual scale (number of logged events/second, given certain hardware) you'd run into issues. Luckily, because adding basic analytics is usually quite easy, testing this for your own workloads shouldn't be out of the question - then you can let the data speak for itself.

| jhpacker wrote:
| The performance issues aren't with the measurement requests but with reporting.
|
| When I eval'd it for my book last fall there were big delays in reporting while waiting for segments, and then also issues with custom reports. I think they have changed the default behavior to get around some of the former, but with MySQL it's always going to be tough for larger queries.
|
| (If there's any performance issue on the measurement side it has more to do with the JavaScript payload, because they include a lot in their standard JS bundle.)

| [deleted]

| preinheimer wrote:
| I wish some of the privacy-focused GA alternatives had SOC 2 reports, or ISO 27001. We're working towards our first SOC 2, which makes it hard to incorporate anything without one into our product.
|
| On prem is a lot of work, and not something I want to approach lightly.

| npace12 wrote:
| Why? Having gone through a few SOC 2s, I don't see any value in it other than it being a racket.

| ian0 wrote:
| Having gone through ISO 27001 and PCI DSS level 2 I kind of assumed all of these security-focused compliance standards are just that. Anyone have any exceptions?

| vlovich123 wrote:
| Yes, it's a huge racket that likely does little to solve the problems it was enacted to prevent. But have you tried making deals with large SOC 2 companies without your own certification?

| jhpacker wrote:
| Piwik PRO is SOC 2 certified.

| TekMol wrote:
| I tried Matomo.
|
| Self hosting is easy. That's a plus.
|
| I also like the interface. Took a while to get used to it but after that, I liked it even better than Google Analytics.
|
| But one problem that seems unsurmountable is that it tries to be clever. And while trying, it messes up your data.
|
| If you have pages with a parameter in the query string that is called "q", Matomo does not count those as pageviews. It tries to be clever and only count those as "searches". Probably because many site searches use a parameter "q" for what the user is searching for.
|
| Even if a page is a search result page, it should be counted as a pageview.
|
| The problem gets even worse when you have users bookmarking pages with a "q" parameter. Then things get really messy when you try to understand which pages users use, where they come from etc.
|
| I have searched a lot, but have found no way to disable this "cleverness". And no way to retroactively fix the data.
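
To picture the "q" behaviour described above, here is a minimal sketch. It is not Matomo's code: it only assumes a tracker that files any hit whose query string contains a configured site-search parameter under "search" instead of "pageview". The names `SEARCH_PARAMS` and `classify_hit` are hypothetical, chosen for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical configuration: query parameter names treated as site-search keywords.
SEARCH_PARAMS = frozenset({"q"})


def classify_hit(url: str, search_params=SEARCH_PARAMS) -> str:
    """Return "search" if the URL carries a configured search parameter, else "pageview"."""
    # keep_blank_values=True so that "?q=" still counts as carrying the parameter.
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    if any(name in search_params for name in query):
        return "search"
    return "pageview"


if __name__ == "__main__":
    # A bookmarked page that happens to use "q" is filed as a search, not a pageview.
    print(classify_hit("https://example.com/items?q=blue+shoes"))
    print(classify_hit("https://example.com/items?page=2"))
    # With an empty parameter set, every hit is an ordinary pageview.
    print(classify_hit("https://example.com/items?q=blue+shoes", search_params=frozenset()))
```

Under this assumption the outcome depends entirely on the configured parameter set, which is why a site that uses "q" for something other than search sees its pageview numbers skewed.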

| ethor wrote:
| Just disable website search or change the search parameter inside website settings to stop Matomo from interpreting the 'q' parameter as search.

| ehnto wrote:
| It is a bad default, but nice that it is configurable.

| SquareWheel wrote:
| That's an odd choice as WordPress, which is by far the most popular CMS, uses ?s= as a search query.
|
| I would expect those pages to be included in the data. They could offer some sort of segmentation if they think they can separate out searches, though.

| dfsl wrote:
| I have integrated my site with Matomo.
|
| The Matomo analytics are captured and stored on-premises on my server (nothing goes to the cloud).
|
| Performance is good with my configuration. You can see page performance for yourself by loading this page: https://freesoftware.life/how-to-install-kubuntu-23-04/

| riogordo2go wrote:
| I'm using the Matomo self-hosted version and like it overall. I love that you can track all outbound clicks without having to specifically add DOM elements to outbound links to make this possible. Unfortunately Matomo is blocked just like Google Analytics by every ad/tracking blocker. Doesn't matter if you host it yourself and only track global stats vs tracking users across the web like GA does. The only solution seems to be writing your own analytics.

| belorn wrote:
| At this point in history, tracking on the web is no longer a trusted activity where people can assume that the person behind the tracking is doing it for benevolent purposes. It's the same thing with email and spam, especially when attachments are involved.
|
| Writing your own analytics can give some additional benefits in that you are only collecting what you need while taking into consideration your users' needs. I expect however that in time browsers will block more and more by default, similar to how email clients and services have progressed in their arms race with spam.
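
As a rough illustration of "writing your own analytics" while collecting only what you need, the sketch below counts successful GET requests per path from web server access logs. It assumes log lines in Common Log Format; the regular expression, the function name `count_pageviews`, and the sample lines are illustrative assumptions, not taken from any tool mentioned in this thread. It stores no IP addresses, cookies, or user agents, and it makes no attempt to filter bots.

```python
import re
from collections import Counter

# Matches the request and status fields of a Common Log Format line, e.g.
# 203.0.113.7 - - [07/May/2023:08:57:00 +0000] "GET /guide HTTP/1.1" 200 5120
LOG_LINE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def count_pageviews(lines):
    """Count successful GET requests per path, ignoring query strings."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        if match["method"] != "GET" or match["status"] != "200":
            continue  # only count pages that were actually served
        path = match["path"].split("?", 1)[0]  # drop the query string
        counts[path] += 1
    return counts


if __name__ == "__main__":
    sample = [
        '203.0.113.7 - - [07/May/2023:08:57:00 +0000] "GET /guide?ref=hn HTTP/1.1" 200 5120',
        '203.0.113.8 - - [07/May/2023:08:58:00 +0000] "GET /guide HTTP/1.1" 200 5120',
        '203.0.113.9 - - [07/May/2023:08:59:00 +0000] "POST /login HTTP/1.1" 302 0',
        '203.0.113.9 - - [07/May/2023:09:00:00 +0000] "GET /missing HTTP/1.1" 404 312',
    ]
    for path, hits in count_pageviews(sample).most_common():
        print(path, hits)
```

Because this reads logs the server already writes, nothing runs in the visitor's browser, so there is nothing for a client-side blocker to block; the trade-off is that caching, CDNs, and bots distort the numbers.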

| teekert wrote:
| Is it also blocked when you don't even enable cookies? You lose some accuracy, but clients can't prevent ending up in your logs and they have to share some info with the server.

| chpatrick wrote:
| You can usually rename the tracker to something that's not on the blocklist.

| riogordo2go wrote:
| That used to work but current block filters analyse JS variables and URL parameters and are much harder to circumvent.

| jeroenhd wrote:
| Why spend time undermining people's preferences?

| RHSeeger wrote:
| If I build a web site, and it is my preference to know what pages get clicks on what elements (presumably, so I can make my site better)... whose preference gets priority; mine or my users'? It's not as black and white as your question makes it sound.

| gumby wrote:
| The users have the ultimate authority whether you like it or not: they don't have to read your whole page, they don't have to look at that image (or even load it), they don't even have to go to your site if their friends tell them not to.
|
| It's like going to pee when an ad appeared on TV back when TV was a thing. The broadcaster and advertiser had no control.
|
| I am sympathetic to your desire (I assume _your_ desire comes from a good place),* but at the end of the day I think we want to live in a world where the people are the important part.
|
| * in my experience the best sales people really do believe the prospective customer _does_ want what they are selling, be it pantyhose, homeopathic drugs, or specially formulated window washing fluid.

| gregmac wrote:
| It kind of _is_ black and white, from a technology point of view.
|
| You, the website owner, can control what your server does in response to HTTP requests a client makes. You control what data is sent, and under what conditions you'll send that data (i.e.: presence of a valid session cookie, correct username/password, cryptographically signed request, etc).
|
| I, the user owning a computer, get to control what my computer does. I run a web browser, and can choose what happens in response to data your site sends me via HTTP.
|
| Most notably, your site can send some JavaScript, but my computer doesn't _have_ to run it. My computer can also selectively block what it does, including limiting its access to initiate web requests to other sites.
|
| Anything beyond this is artificial, such as laws like DMCA or CFAA.

| RHSeeger wrote:
| Your response seems to completely miss the point of the thread you're replying to. The discussion in question was, effectively
|
| >>> You can write your own code to gather statistics
|
| >> You should respect your user's desires and not gather statistics
|
| > The users aren't the only ones with desires
|
| Sure, whether or not you "can" do it is black and white (and a game of whack-a-mole many times), but whether or not you "should" do it is very much a gray area.

| EGreg wrote:
| I don't get it, how can they stop you from recording this on your own server?
|
| Are you talking about CNAME cloaking? Pretty sure Apple only cares if one specific server gets all the CNAMEs. It doesn't block CNAMEs in general.

| RHSeeger wrote:
| I thought that was the whole point of what was being said; that things like metrics (what on the page gets clicked on) are getting blocked. Bear in mind, I'm not just talking about what pages get loaded. There's more to "clicked on the page" than just page loading.

| jhpacker wrote:
| ITP now also degrades first-party server-set cookies to 7 days where the first part of the IPs doesn't match. So if you're using CNAMEs for your measurement and you have a.a.x.x and b.b.x.x it will downgrade.

| EGreg wrote:
| Link?

| jhpacker wrote:
| https://github.com/WebKit/WebKit/pull/5347

| riogordo2go wrote:
| Because I think most people who use something like uBlock don't want to see ads or have their privacy violated by being followed around the web using third-party trackers.
|
| A site owner observing some general, anonymized stats like visitor and page count, which outbound links are clicked, OS, screen size, time on page and what have you is quite different. I understand a blocker must go all the way and cannot distinguish between these cases. Hence my effort to find an alternative.

| nolok wrote:
| Most people who are against trackers are not against the website they visit getting valuable information about which page they use or not, or the order in which they use each page to figure out which paths work or not, etc ...
|
| They are against the website choosing to not pay for it and instead getting it for free in exchange for giving all that data to a 3rd party (like GA / Google), who then uses it for its own purpose.
|
| Doesn't mean no people are against that first scenario too, but then they better not make an account, visit several pages in a row on the same website or want to use a cart, or essentially anything beyond a static website.
|
| Both scenarios are widely different, and convincing people on both sides (even both extremes) of that line that the line doesn't exist is one of the greatest and most successful tricks tracking companies have played.

| gumby wrote:
| An ad/tracking blocker _could_ discriminate between privacy-protecting trackers and spyware, but it would not be worth the time in practice.
|
| Such a distinction would need an option and have to be on by default. Most people use the "out of the box" config, so only a few people (like me) would enable honest tracking.
|
| The blockers would have to keep up with this option to make sure the thing they allow hadn't switched to evil mode.
|
| And so on.
Basically another case where bad actors like Google poisoned the well.

| soared wrote:
| > tracking users across the web like GA does
|
| What does this mean?

| riogordo2go wrote:
| At least in prior Google Analytics versions, a third-party cookie was used, giving the possibility to link you to every site that implements Google Analytics. But Google explicitly states not to do this, so you are correct in calling me out here.

| jhpacker wrote:
| GA4 still uses the DoubleClick cookie. It also encourages the use of Google Signals and runs measurement requests off of the main google.com domain to help it track users based upon their Google login.

| nolok wrote:
| If site A and site B both use GA, then GA tracks them across both internally for their stats (and it helps Google in figuring out the same user has interest A and interest B).
|
| Matomo promises to not do the same link across properties on their cloud-hosted version.

| stoicjumbotron wrote:
| Thoughts on Microsoft Clarity? https://clarity.microsoft.com

| victor106 wrote:
| I think we learnt enough when big tech offers something for "free" and when they call it "absolutely" free it just means you are absolutely the product.
|
| So thanks but no thanks

| hodgesrm wrote:
| Clarity is awesome--the metrics and the way it combines a visualization of our site with user session data is amazing. It shows you the actual locations on the page that users visit as well as the path they follow to get there. The insights are far more actionable than Google Analytics from my experience. (We use both.)
|
| p.s., Under the covers Clarity runs on ClickHouse.

| gcanyon wrote:
| I just want analytics that don't require a Ph.D. in obscure user interfaces to get anything out of them. TBF, I haven't used GA in 8 years, maybe it's gotten better -- but I still have flashbacks.

| pastage wrote:
| Matomo is not trivial to run on prem; there is lots of stuff that does not work on larger installs unless you do lots of manual optimization, and what those optimizations are is not obvious. The problems only show after some time, when you have to redo reports for multiyear periods, or handle a hug of death.
|
| That said, people love analytics; it is a powerful tool.

| viraptor wrote:
| Any links? I'm assuming you mean something more than the periodic rollups?

| RobotToaster wrote:
| Worth noting that it seems to be only "open core"; there's a bunch of paywalled features that I presume aren't open source. https://matomo.org/pricing/

| johndhi wrote:
| I'm not an engineer. Can someone please specifically explain to me how this protects data and privacy more than Google?
|
| Does it use cookies and browser storage?

| jenadine wrote:
| One thing is that it doesn't send any data about your users to Google.

| hrpnk wrote:
| Rudderstack claims to be a GA alternative [1] and accepts server-side data, allowing this to be a 1st-party integration skipping the consent complexity. Any thoughts on this one? It also made it to the Thoughtworks Tech Radar [2].
|
| [1] https://www.rudderstack.com/replace-google-analytics-4-guide...
| [2] https://www.thoughtworks.com/en-us/radar/platforms/ruddersta...

| encoderer wrote:
| We (cronitor.io) have a really great all-in-one solution to analytics and website monitoring with a generous free tier.
|
| https://cronitor.io/real-user-monitoring

| jpalomaki wrote:
| You can also run Matomo without the tracking JavaScript and instead feed in log files [1]. This works with the CloudFront log files (and many others).
|
| [1] https://matomo.org/faq/general/requirements-for-log-analytic...

| IanCal wrote:
| > Google Analytics alternative that protects your data and your customers' privacy
|
| It's not your data, this is data about me.
|
| > Your customers will love you because their valuable personal data is protected.
|
| I guarantee you your customers will not love you for tracking them.

| igor47 wrote:
| Your customers might love you for making a better website, and this is hard to do without feedback, of which analytics is one kind.

| nmstoker wrote:
| Haven't used it in six years but back in the days it was Piwik it was ideal: easy to set up locally, a good range of features and a friendly community (v. responsive to an upgrade issue we experienced but apart from that everything worked exactly as expected).

| acidburnNSA wrote:
| Back in my day it was awstats. Still works great. I have 18 years of data.
|
| https://www.awstats.org/

| jhpacker wrote:
| Loved AWStats! Still can be useful -- but bots, client side caching, CDNs, and did I mention bots..? have made the data hard to rely on for much. A while ago I switched from AWStats to GoAccess (https://goaccess.io/) for this kind of thing. I prefer its interface, and it's way way faster to churn through big log files (C vs. Perl).
___________________________________________________________________
(page generated 2023-05-07 23:00 UTC)