[HN Gopher] Lightweight Alternatives to Google Analytics
       ___________________________________________________________________
        
       Lightweight Alternatives to Google Analytics
        
       Author : Tomte
       Score  : 603 points
       Date   : 2020-06-18 07:56 UTC (15 hours ago)
        
 (HTM) web link (lwn.net)
 (TXT) w3m dump (lwn.net)
        
       | mobilio wrote:
       | Analytics isn't just counting pageviews.
       | 
       | GA also counts - social interaction, events, Ecommerce and many
       | more.
        
       | netcan wrote:
       | Tangential but...
       | 
       | "Analytics" is rarely useful or unuseful because of the tool.
       | These tools need to be treated as data collection, not reporting.
       | 
       | If your goal is to inform certain decisions, track success or
       | identify problems... a spreadsheet (or napkin) is usually where
       | that happens.
       | 
       | Say you do analysis systematically, make a list of questions and
       | use your tools to answer them... usually you find that the tool
       | itself doesn't matter much, and GA doesn't answer most of your
       | questions out-of-the-box anyway.
       | 
       | Say you want a "funnel." That usually consists of a handful of
       | data points. GA usually doesn't have them by default, without
       | tinkering with configuration, etc. Decide what they are beforehand.
       | Understand them. Use GA (or whatever) to get the data.
       | 
       | Finding the tool for the job is much easier once you know what
       | the job _is_. GA is extremely noisy, bombarding users with half-
       | accurate, half-understood reports.
        
       | mauserng wrote:
       | Piwik PRO is also worth checking out. Our functionality matches
       | GA. Our analytics is more focused on user privacy and is in line
       | with privacy laws like GDPR, CCPA or industry regulations like
       | HIPAA or EBA guidelines.
       | 
       | Check out a comprehensive comparison of GA, GA360 & Piwik PRO at:
       | https://piwik.pro/blog/piwik-pro-vs-google-analytics-compreh...
        
         | mauserng wrote:
         | Actually I meant sharing this link to the comparison table:
         | https://piwik.pro/piwik-pro-vs-google-analytics-vs-ga360/
        
       | komali2 wrote:
       | Goat counter's been good and doesn't fingerprint.
        
         | JackWritesCode wrote:
         | Yes it does -
         | https://github.com/zgoat/goatcounter/blob/master/docs/sessio...
        
       | huhtenberg wrote:
       | If you decide to migrate off GA, there's very little reason to
       | not use self-hosted analytics.
       | 
       | The only case when you'd get better analytics from a _service_ is
       | exactly a GA-like setup that can track people as they go from one
       | website to another. That is, the real value of an analytics
       | service is derived directly from its ability to invade people's
       | privacy, at scale.
       | 
       | Granted, migrating to another service is usually simpler, but it
       | offers NO insights into the traffic that you can't get from
       | parsing server logs and in-page pingbacks. You do however get a
       | 3rd party dependency and a subscription fee.
        
         | bryanrasmussen wrote:
         | >The only case when you'd get better analytics from a _service_
         | is exactly a GA-like setup that can track people as they go
         | from one website to another.
         | 
         | I was once making a service that provided cross site widgets
         | for companies to embed. Obviously it was beneficial to track
         | people as they go from one website to another, but at that
         | point it was beneficial to do it with our own service.
        
         | bttrfl wrote:
         | Server logs only tell you about things that happen on your
         | server. If you are using JavaScript it's likely there are
         | plenty of events that might be valuable to you that never leave
         | a trace in your logs.
         | 
         | For example, if you validate forms with JS you might want to
         | track form submissions and validation errors.
        
           | huhtenberg wrote:
           | _... and in-page pingbacks_
        
         | lmkg wrote:
         | > that can track people as they go from one website to another
         | 
         | Note that even in Google Analytics, this requires extra set-up,
         | has limitations, and tends to be pretty fragile in practice. GA
         | identifies users by a first-party cookie and tracking cross-
         | site visits requires decorating links with cookie values.
         | 
         | If you're interested just in aggregate traffic from one of your
         | sites to another, rather than something that requires full-path
         | analysis (like marketing attribution), then you can get that from
         | looking at referrers. This should be more-or-less equally
         | available in GA and server logs.
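The referrer approach described above can be sketched against a server log. This is a minimal illustration, assuming the server writes the common "combined" access log format; the function name `referrer_hosts`, the sample lines, and the host names are hypothetical and not taken from the thread.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumed layout (combined log format):
# ... "METHOD path protocol" status bytes "referrer" "user agent"
LINE_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def referrer_hosts(lines, own_host):
    """Count hits per referring host, skipping empty and same-site referrers."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # line does not look like the assumed format
        referrer = match.group("referrer")
        if referrer in ("", "-"):
            continue  # no referrer was sent
        host = urlsplit(referrer).hostname
        if host and host != own_host:
            counts[host] += 1
    return counts


# Hypothetical sample data, only for illustration.
SAMPLE = [
    '203.0.113.5 - - [18/Jun/2020:07:56:00 +0000] "GET /post HTTP/1.1" 200 512 "https://blog.example.org/a" "Mozilla/5.0"',
    '203.0.113.6 - - [18/Jun/2020:07:57:00 +0000] "GET /post HTTP/1.1" 200 512 "https://blog.example.org/b" "Mozilla/5.0"',
    '203.0.113.7 - - [18/Jun/2020:07:58:00 +0000] "GET / HTTP/1.1" 200 128 "-" "curl/7.68.0"',
    '203.0.113.8 - - [18/Jun/2020:07:59:00 +0000] "GET /about HTTP/1.1" 200 256 "https://shop.example.com/" "Mozilla/5.0"',
]

if __name__ == "__main__":
    for host, hits in referrer_hosts(SAMPLE, "www.example.com").most_common():
        print(host, hits)
```

This only gives aggregate site-to-site traffic; it says nothing about the full path of an individual visitor, which is the limitation the comment points out.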
        
         | jedimastert wrote:
         | > If you decide to migrate off GA, there's very little reason
         | to not use self-hosted analytics.
         | 
         | My personal domain[0] was taken by domain squatters (forgotten
         | bill in debit card shuffle, bought up within seconds of expire)
         | so for now I have to host on github.io. Thoughts on an
         | analytics service?
         | 
         | [0]: http://www.aarontag.com/
        
         | elondaits wrote:
         | My reason is the server not being able to handle the traffic.
         | We used Piwik but I couldn't trust it'd be able to handle big
         | eventual spikes of traffic (which the site itself could, being
         | static and on a CDN) or that it wouldn't slow the site down (if
         | I remember correctly I had the option to call piwik
         | asynchronously and not slow down the site, but at risk that
         | it'd be less accurate if people closed the window / navigated
         | to another page quickly.).
         | 
         | Of course you can run your own analytics on AWS or similar and
         | have no issues with handling traffic, but that means higher
         | costs / difficulty in setting up and maintaining it.
        
       | dynamite-ready wrote:
       | I often read posts stating a great deal of discouragement for
       | self developed analytic solutions. I've recently developed a home
       | rolled solution for a product I'll hopefully launch soon.
       | 
       | There's no UI for my signals application, but it should give me
       | raw access to all the GA metrics people typically look for
       | (pages, referrers, user agents, etc). Storage could be a pain
       | point. Compute might also turn out to be, but if that ever
       | becomes a problem, then I'll probably have much more to think
       | about...
       | 
       | What am I missing here?
        
         | JackWritesCode wrote:
         | Opportunity cost. Hacker News has some of the best developers
         | in the world reading it. $100-500 / hourly rates. Why would
         | they spend their time building & maintaining something when
         | they could pay $140 / year for a top tier, privacy-focused
         | analytics product like Fathom, where both founders work full
         | time on it?
         | 
         | Then you have redundancy of data. How are they backing up
         | historical data? Are they running with failovers? How will
         | their analytics do in the event that they get a hug of death
         | from Hacker News or Reddit? There are so many factors to
         | consider.
         | 
         | I don't think we should discourage people from rolling their
         | own if they enjoy it. Heck, I've built things that existed.
         | That's how we get better. I'm just sharing why a lot of
         | developers won't roll their own.
        
       | nuccy wrote:
       | I'm honestly curious, are all the analytics tools, which rely on
       | making third party queries, still efficient with extensive use of
       | adblocking these days?
       | 
       | If not, then logs of webservers are the only 100% reliable place
       | (if available of course), so old-style tools like awstats,
       | Webalizer, etc [1] should have a rise in popularity again.
       | 
       | [1] https://en.wikipedia.org/wiki/List_of_web_analytics_software
        
         | JackWritesCode wrote:
         | Server side Analytics are far from reliable. Go and try
         | Netlify, then you'll see how unreliable they really are.
         | 
         | For us nerds, we can always block things using DNS level
         | blocking but Fathom's custom domain feature has done really
         | well for the majority: https://usefathom.com/blog/custom-
         | domains-embed-code
        
           | pkalinowski wrote:
           | Good adblockers do DNS lookup to see if subdomain points to
           | tracking server anyway.
           | 
           | Only completely self-hosted (your domain, your tracking
           | server) solutions are resilient to adblocking
        
             | JackWritesCode wrote:
             | For now
        
             | nuccy wrote:
             | Interesting, then as a workaround trackers may create an
             | image with pseudo-random src pointing to
             | pageXXXwebsiteYYYclientZZZ.ad.yoursite.com (yoursite.com is
             | the site which serves the actual content), while asking the
             | owner to point NS record for ad.yoursite.com to IP of their
             | DNS server. So HTTP/S request or DNS request can reach them
             | anyway. Obviously DNS caching will prevent them from
             | knowing how many times this particular page was accessed by
             | this particular client, but at least they will know that it
             | was accessed at least once.
        
             | Carpetsmoker wrote:
             | You can set up a proxy, which isn't too hard (and operating
             | a proxy is a lot simpler than operating an analytics
             | service)
        
         | elcomet wrote:
         | The issue is that people often use free reverse proxies like
         | cloudflare, that do caching, so not all requests reach the
         | original server.
         | 
         | In this case, the source of truth is cloudflare's loadbalancer,
         | but you have to pay them to get full analytics.
        
       | steviedotboston wrote:
       | Until one of these alternatives is completely free like Google
       | Analytics is I don't see a massive shift happening. I'm a web
       | developer and there's no way I'd convince my clients to pay
       | $20/month for something that Google offers for free.
        
         | JackWritesCode wrote:
         | There's a huge shift happening. People are realizing that
         | Google Analytics isn't "free". You're sending data to a company
         | that has a huge amount of privacy scandals.
         | 
         | We have lots of agencies who use us, and their clients may
         | differ from yours, but here's what's helped them:
         | https://usefathom.com/blog/switch
        
           | FalconSensei wrote:
           | > There's a huge shift happening. People are realizing that
           | Google Analytics isn't "free".
           | 
           | Just be aware that, while that is true for HN, and some
           | subreddits, that still doesn't apply to most people. It's the
           | same as saying, based on comments on HN, that people are
           | moving away from Chrome when its market share is not going
           | down.
        
       | AdriaanvRossum wrote:
       | Thanks for mentioning Simple Analytics [1]. We are at this point
       | indeed only cloud based. We believe we need to make a business
       | case/profit first before putting a lot of extra work in an open
       | source version and maybe failing with the business. It's a dream
       | to make it open source, but not at this time.
       | 
       | We are very firm on our values. We will never sell your data. We
       | have many ways to get your raw data out of our system (API,
       | download links, ...).
       | 
       | Our collection script [2] is open source and today we are also
       | adding source maps to our public scripts. Open source does not
       | guarantee that a business runs that same software as their cloud
       | based option. We are looking into services that can validate what
       | we collect on our servers. We never collect any IPs or personal
       | data [3].
       | 
       | Great to see more products that care about privacy, I hope they
       | will really care and commit to their values for a long time.
       | 
       | [1] https://simpleanalytics.com
       | 
       | [2] https://github.com/simpleanalytics/scripts
       | 
       | [3] https://docs.simpleanalytics.com/what-we-collect
        
       | rickette wrote:
       | https://count.ly also looks pretty neat as a self hosted
       | solution. Anyone experience with that?
        
       | dclusin wrote:
       | I use GoAccess. It's an offline access.log analytics engine. One
       | feature it has is to generate a static site from its db. I have an
       | hourly cron script that picks up the last hour's logs and
       | generates a static site. You can see it in action at
       | https://www.clusin.com/analytics/
       | 
       | 1 - https://github.com/allinurl/goaccess
        
         | mimimi31 wrote:
         | I've tried GoAccess in the past, but I remember the
         | documentation not being very thorough on certain topics like
         | the database, websocket connection, or log syntax. So it was a
         | bit of a pain to set up.
         | 
         | It also had some weird quirks like generating duplicate entries
         | or randomly failing to parse some log lines (you seem to have
         | quite a few of those "failed requests" yourself by the way).
         | 
         | There also doesn't seem to be a good way to display statistics
         | for multiple virtual hosts. Even if you change your log format
         | to include the host, you just get an additional table in the
         | dashboard, but still can't look at the other metrics for each
         | host separately. You'd have to run multiple GoAccess instances
         | to achieve that.
        
           | dclusin wrote:
           | Yeah I definitely had to open some issues to understand how
           | it works. I have multiple virtual servers as well and wasn't
           | able to get it to break out links by virtual server.
           | 
           | I figured it's fine for my needs since I literally have
           | nothing on my domains. I could see it being frustrating for
           | power users.
        
       | severak_cz wrote:
       | I have my own solution inspired by plausible.io and Simple
       | analytics. See https://tildegit.org/severak/millions
       | 
       | I used Matomo before, but simple dashboard in style of
       | plausible.io is more useful for me. I have a little traffic on my
       | sites.
        
       | JackWritesCode wrote:
       | Thanks for all the kind words about https://usefathom.com. We've
       | been in business since 2018 and both founders now work full time
       | on it.
       | 
       | We're fully bootstrapped, actively rejecting millions of dollars
       | in venture capital, and we are sustainable. That is the key. We
       | are priced fairly, and at a level that allows us to ensure the
       | longevity of our business.
       | 
       | We are used by governments, small businesses, multi billion
       | dollar companies and individuals. Everyone cares about privacy
       | and legal teams love us.
       | 
       | We are fully GDPR compliant and use zero cookies. A lot of people
       | have read our article on cookie-free tracking, but that article
       | is outdated now. We're also launching a new method over the next
       | few weeks which is game changing, which we'll blog about.
       | 
       | Our infrastructure is highly available and runs across multiple
       | servers. We don't run our services from a single VPS, and have
       | availability in multiple availability zones for everything. We
       | pay premiums for our infrastructure because we take our customers
       | data very seriously.
       | 
       | We allow you to set-up a custom domain in less than 2 minutes,
       | comfortably passing ad-blockers. Or if that's not your cup of
       | tea, you can enable honor-DNT and respect ad-blockers.
       | 
       | We are built to handle billions of pageviews a month, we've
       | poured hundreds (thousands?) of hours into refining our
       | aggregation script, and we're the leading solution on the market.
       | 
       | Don't forget, we also offer unlimited uptime monitoring as part
       | of your plan, sending alerts by SMS, Telegram, email and Slack.
       | 
       | Finally, we run a popular podcast called Above Board, where we
       | talk about business & privacy.
       | 
       | If you haven't already checked us out, you should.
        
         | PhilippGille wrote:
         | > We are fully GDPR compliant and use zero cookies.
         | 
         | From the GitHub repo [1]:
         | 
         | > At present, Fathom Analytics Lite is not PECR compliant due
         | to the fact that it uses an anonymous cookie. Our PRO version
         | is PECR compliant, and we'll be making changes to this codebase
         | some time in the future to make it compliant.
         | 
         | The open source version seems to be lagging behind and might
         | generally be treated without much love, given that their
         | website doesn't even link to it (which is understandable from a
         | business point of view, but doesn't inspire much confidence in
         | its future maintenance).
         | 
         | [1]:
         | https://github.com/usefathom/fathom/blob/69baac5c4a4d96880a2...
        
           | JackWritesCode wrote:
           | Our goal is long term sustainability of privacy-focused
           | analytics. We're achieving that. We tried to do it with the
           | OS codebase, and the original tech guy left the project. The
           | MRR was around $1,300 between 2 people after many months.
           | 
           | When we focused on building a business, we were able to
           | become a much more viable competitor to Google Analytics.
           | 
           | The reason we haven't written off Fathom Lite is because
           | we've always had plans to come back to it this year and put
           | out an update. Will we be launching new features? No. Will we
           | be fixing bugs and ensuring it's a solid product for
           | individuals? Absolutely.
        
       | css wrote:
       | I have always used AWStats [0] and never thought I needed more
       | information than that.
       | 
       | [0]: https://www.awstats.org/
        
         | acidburnNSA wrote:
         | Same. Plus I have 15 years of continuous browsable data now. I
         | check it all the time and it tells me what I want to know.
        
       | r3trohack3r wrote:
       | We launched https://everytwoyears.org today. It was my first
       | project where I felt analytics was necessary, but also a moral
       | quandary. For personal reasons, I'm very against PII big data.
       | For project reasons, the project is literally about stopping mass
       | surveillance so shipping a tool like Google Analytics was firmly
       | off the table.
       | 
       | I went with https://app.usefathom.com which tracks _aggregate
       | anonymized_ data.
       | 
       | They have the option to self host, but I'm sending them money to
       | support the project. With today's launch, I'm really happy with
       | the product. Will continue using it.
        
       | wprapido wrote:
       | A happy Matomo user
        
       | PStamatiou wrote:
       | Been using Fathom for a few months now (after migrating away from
       | Gaug.es which I've had for at least 6-7 years prior but they got
       | acquired by some random company that has zero support and makes
       | no improvements) and have been loving it so far. They're really
       | responsive and always improving things. I like that I can cname
       | the tracking script to my domain
        
         | paulcpederson wrote:
         | Also using fathom and really like it! Not having to put up a
         | GDPR tracking popup is very nice, I find those quite annoying.
        
         | JackWritesCode wrote:
         | Thanks so much, really appreciate the comment here :)
        
       | buro9 wrote:
       | I've recently been pondering whether to create a log receiver
       | that will produce Prometheus metrics as well as logs for Loki.
       | 
       | Why?
       | 
       | Because there are several open source projects that if joined up
       | in a relatively simple way would provide a full RUM / Analytics
       | solution.
       | 
       | For collecting the analytics: Akamai Boomerang, which is a
       | descendant of Yahoo tooling https://github.com/akamai/boomerang
       | and BSD licensed
       | 
       | Then insert a collector that will produce Prometheus metrics and
       | write log lines. The metrics will provide the timing information
       | and in many ways will be richer than Google Analytics, and the
       | log lines will provide the potentially high cardinality of string
       | based data like user agents, URIs, etc and Grafana supports
       | PromQL queries against logs such that you can gain metrics from
       | the log lines too.
       | 
       | Then add in a free Grafana Cloud
       | https://grafana.com/products/cloud/ and configure Prometheus to
       | scrape from the collector, and Loki to consume the logs.
       | 
       | This is an end-to-end cloud hosted RUM / Analytics solution that
       | for a single user would be free and one can even add alerting.
       | 
       | The missing link is that collector, to consume the Boomerang
       | output and produce Prometheus metrics and log lines for Loki.
       | 
       | All of this is open source and can be self hosted, the only piece
       | you would have to host today would be that custom collector to
       | receive the Boomerang requests.
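The missing collector described above could look roughly like this. It is a sketch under stated assumptions, not a finished component: the beacon is assumed to arrive as a query string carrying `u` (page URL) and `t_done` (load time in milliseconds), the class name `BeaconCollector` and the metric names are made up for illustration, and the HTTP server, the Prometheus scrape endpoint, and the log shipping to Loki are left out.

```python
import json
from urllib.parse import parse_qs


class BeaconCollector:
    """Turns beacon query strings into counters plus JSON log lines."""

    def __init__(self):
        self.beacons = 0        # number of beacons received
        self.load_ms_sum = 0.0  # sum of reported load times, in ms
        self.load_ms_count = 0  # beacons that carried a usable load time
        self.log_lines = []     # one JSON line per beacon, for a log shipper

    def ingest(self, query_string, user_agent=""):
        # parse_qs returns lists; keep the first value of each parameter.
        params = {k: v[0] for k, v in parse_qs(query_string).items()}
        self.beacons += 1
        # "t_done" and "u" are assumed parameter names (see the note above).
        if "t_done" in params:
            try:
                self.load_ms_sum += float(params["t_done"])
                self.load_ms_count += 1
            except ValueError:
                pass  # ignore malformed timings rather than fail the request
        # High-cardinality strings go to the log line, not into metric labels.
        self.log_lines.append(json.dumps(
            {"url": params.get("u", ""), "user_agent": user_agent},
            sort_keys=True,
        ))

    def render_metrics(self):
        # Prometheus text exposition format: "# TYPE" line, then "name value".
        return (
            "# TYPE rum_beacons_total counter\n"
            f"rum_beacons_total {self.beacons}\n"
            "# TYPE rum_load_ms summary\n"
            f"rum_load_ms_sum {self.load_ms_sum}\n"
            f"rum_load_ms_count {self.load_ms_count}\n"
        )


if __name__ == "__main__":
    collector = BeaconCollector()
    collector.ingest("u=https%3A%2F%2Fexample.com%2F&t_done=1200", "Mozilla/5.0")
    print(collector.render_metrics())
    print(collector.log_lines[0])
```

Keeping URLs and user agents out of the metric labels follows the split the comment proposes: timing data as metrics, string data as log lines.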
        
       | ahstilde wrote:
       | The best alternative to Google Analytics is Parse.ly [1]. It's
       | privacy-conscious, reliable, and user-friendly. Most importantly,
       | it gives you important engagement metrics instead of vanity
       | metrics.
       | 
       | [1]https://parse.ly/overview
        
         | XCSme wrote:
         | If you are related to parse.ly: it's really confusing that
         | clicking "pricing" doesn't show the pricing first.
        
       | pcmaffey wrote:
       | You can always roll your own basic analytics, eg
       | https://www.pcmaffey.com/roll-your-own-analytics
       | 
       | Doing so is extremely helpful to understanding what events and
       | data you actually need for your use case.
        
       | dougblackjr wrote:
       | Engauge Analytics is a good alternative, and privacy focused:
       | https://engaugeanalytics.com/
        
       | krlx wrote:
       | I too wished for a simpler, less invasive but still Javascript
       | based, esthetically pleasing and free analytics solution (I am a
       | cheap student) : https://github.com/Karalix/feu-analytics
       | 
       | The Firebase free tier seemed perfect for my use case. It is far
       | from being perfect, but Good Enough For Me(tm)
        
       | jbrooksuk wrote:
       | I've been happily using Fathom Analytics: https://usefathom.com
       | and I have zero complaints.
       | 
       | No tracking. Privacy focused. Lightweight. You embed from your
       | own domain. They even do site monitoring now!
        
         | lhdj wrote:
         | My company switched to Fathom from GA about 4 days ago.
         | 
         | We build privacy software so it felt slightly hypocritical to
         | use a privacy-intrusive service like GA. So far so good.
         | 
         | I went from 0 to Fathom in under 20 mins and for our _basic_
         | requirements it works really well .
         | 
         | Good job Fathom team :)
        
           | JackWritesCode wrote:
           | Thanks so much, glad you had such a good experience :)
        
             | ghawkescs wrote:
             | When I try to view your demo page I get 'Secure Connection
             | Failed' every time. Firefox 77.0.1 on Windows 10.
        
               | JackWritesCode wrote:
               | Strange. Fully valid certificate. Try hard refreshing a
               | few times.
        
               | ghawkescs wrote:
               | Still no luck, this is the URL
               | https://app.usefathom.com/share/lsqyv/pjrvs. Did not load
               | in Chrome either.
        
               | JackWritesCode wrote:
               | So strange. I've run it through multiple checkers and all
               | of them are valid. No other issues reported, just this
               | one.
        
         | spockz wrote:
         | From the site:
         | 
         | > Our on-demand, auto-scaling servers will never slow your site
         | down. Our tracker file is served via our super-fast CDN, with
         | endpoints located around the world to ensure fast page loads.
         | 
         | This suggests that this solution is not self hosted. Is there a
         | solution like this which is really self hosted? This service is
         | one small change away from actually tracking.
         | 
         | Edit: Piwik/Matomo[1] appears to be the most mature one. [1]:
         | https://matomo.org/
        
           | XCSme wrote:
           | I am also building something similar: https://usertrack.net/
           | 
           | I think the main differences compared to Matomo is that it's
           | simpler (less features), but provides for much cheaper some
           | of their premium features (heatmaps, session recordings).
           | 
           | Let me know if you have any questions about userTrack or any
           | suggestions! :)
        
           | tutuca wrote:
           | Fathom is open source https://github.com/usefathom/fathom
        
             | m90 wrote:
             | Apparently this repo contains "Fathom Lite", a (from a
             | codebase perspective) unrelated predecessor of what is
             | currently being sold as a SaaS.
        
               | GordonS wrote:
               | And it seems that Fathom Lite misses one of the main
               | selling points of Fathom - cookieless tracking.
        
               | JackWritesCode wrote:
               | Yes - Fathom Lite would need a good refactor to not use
               | cookies.
        
             | tedivm wrote:
             | That's the old project- they have decided not to open
             | source the new one.
             | 
             | The open source project is barely maintained at this point-
             | they update the readme and get the occasional pull
             | request, but it's not really being developed.
             | 
             | I unfortunately switched to Fathom back when they were
             | telling people they were committed to open source, so now
             | I'm looking to migrate off to something a bit more
             | trustworthy.
        
               | JackWritesCode wrote:
               | You've been saying this since last year. If I can help
               | you migrate off of Fathom Lite to something else, please
               | let me know.
        
           | JackWritesCode wrote:
           | Fathom Lite is self-hosted. Lots of people start off self-
           | hosting but it's typically useful for people with low
           | traffic, or for people whose time is worth less than money,
           | or even those who enjoy it. Because you have to maintain
           | anything you self-host. We like to cater for both.
        
         | GordonS wrote:
         | > No tracking
         | 
         | Personally, I think that Fathom strikes a good balance between
         | privacy and usability, but it does still use tracking (or at
         | least it did when I was looking at it a few weeks back) - the
         | difference is that it uses fingerprinting instead of cookies. I
         | think it's implemented in a privacy-focused way, but it does
         | look like they are ignoring some of the EU ePrivacy guidance,
         | which explicitly states that consent should be obtained before
         | using fingerprinting, even if PII can't be reverse-engineered
         | from the fingerprint.
         | 
         | As I say, I think their implementation makes a lot of sense,
         | and even as a privacy advocate myself I think those particular
         | pieces of ePrivacy guidance focused on fingerprinting are
         | excessive. But the EU doesn't seem to agree.
        
           | JackWritesCode wrote:
           | We're not ignoring the guidance, it's just such a grey area
           | when it comes to PECR / ePrivacy. Even the ICO's guidance, it
           | talks about "cookie-like" technology. Our technology isn't
           | cookie-like. And our processing isn't cookie-like either.
           | We've had lawyers look at our documentation and all of them
           | have said it's a grey area.
           | 
           | You'll know this but some people reading might not: Under
           | GDPR, there are multiple legal bases for processing and we
           | rely on legitimate interest. PECR / ePrivacy is the grey area
           | for us and other services.
           | 
           | Having said all of this, we're fortunately moving away from
           | requiring any compliance at all... by avoiding the
           | complexities altogether. We're rolling a refactor to our
           | data collector over the next few weeks, and we won't have to
           | have these conversations about grey areas anymore :) We've
           | hired a top-tier privacy consultant and are going to be
           | deploying a huge update, putting us at the top of the list
           | for compliant analytics. Every single privacy-focused
           | analytics service is in a grey area right now (some think
           | they're not but they are). We will be the first to move out
           | of this GDPR / ePrivacy grey area dance.
           | 
           | As you say, you see the logic behind the implementation we
           | had, but we're dealing with politicians who don't understand
           | the difference between Google Analytics and privacy-focused
           | analytics. And that's fine, the work they've done has led to
           | better privacy for everyone, so we appreciate them.
        
             | GordonS wrote:
             | > We're not ignoring the guidance, it's just such a grey
             | area when it comes to PECR / ePrivacy. Even the ICO's
             | guidance, it talks about "cookie-like" technology. Our
             | technology isn't cookie-like. And our processing isn't
             | cookie-like either. We've had lawyers look at our
             | documentation and all of them have said it's a grey area.
             | 
             | That sounds like you are trying to pick and choose the bits
             | you want to hear :)
             | 
             | There have been several amendments since the original
             | ePrivacy guidance. There is at least one such directive
             | that is very explicit about fingerprinting specifically. It
             | doesn't use ambiguous language, it states clearly that
             | consent is required for fingerprinting.
             | 
             | As I said, I personally think it's just bonkers, and I
             | think your service is absolutely in the spirit of the
             | ePrivacy rules. But you can't say the rules on
             | fingerprinting are not clear.
             | 
             | I'm keen to see what you've got coming, as the only way I
             | see to avoid consent is not to associate identifiers with
             | users at all - so each page hit would be a completely
             | independent object. Can you say anything about your plans
             | here?
        
               | JackWritesCode wrote:
               | Like I say, we've had lawyers review our docs. Even the
               | term "fingerprinting" has more nuance to it.
               | Fingerprinting is used as a way to attempt to set a
               | permanent cookie / identify an individual, and their
               | actions. We don't do this.
               | 
               | And we definitely agree that it's bonkers.
               | 
               | I can't say anything here until we've got our press
               | release out.
        
               | GordonS wrote:
               | > Like I say, we've had lawyers review our docs. Even the
               | term "fingerprinting" has more nuance to it.
               | Fingerprinting is used as a way to attempt to set a
               | permanent cookie / identify an individual, and their
               | actions. We don't do this.
               | 
               | Ouch, I kind of wish you hadn't said that, because it
               | sounds like you're straying dangerously close to weasel
               | words and deliberately incorrect interpretations. Sorry
               | if that sounds harsh, but what I've read is very clear.
               | 
               | As before I like your solution, and I think it's
               | absolutely in the spirit of privacy. But the guidance is
               | really clear here, and gives examples of fingerprinting.
               | Nobody said a fingerprint has to be a _permanent_
               | identifier; as far as I recall, Fathom does use
               | fingerprinting to identify individuals, so that a
               | sequence of page views can be attributed to a single
               | visitor. I understand that those fingerprints include a
               | timestamp, and so are only valid for some time (2 hours,
               | or whatever it is).
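The time-limited identifier GordonS describes can be sketched roughly as follows. This is an illustration only, not Fathom's actual implementation: the function name `visitor_key`, the inputs being hashed, the secret, and the two-hour window are all assumptions (the thread itself only says "2 hours, or whatever it is").

```python
import hashlib
import hmac

# Assumed validity window; the thread does not confirm the real value.
WINDOW_SECONDS = 2 * 60 * 60


def visitor_key(secret: bytes, site_id: str, ip: str, user_agent: str, now: float) -> str:
    """Return an identifier that stays stable only within one time window."""
    # Integer index of the current window; it changes every WINDOW_SECONDS.
    window = int(now // WINDOW_SECONDS)
    message = f"{window}|{site_id}|{ip}|{user_agent}".encode("utf-8")
    # Keyed hash, so the raw IP and user agent are not stored directly.
    return hmac.new(secret, message, hashlib.sha256).hexdigest()
```

Two page views from the same browser inside one window map to the same key, and the key changes once the window rolls over. As GordonS points out, an identifier like this still attributes a sequence of page views to one visitor, so the sketch does not settle the consent question either way.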
        
               | JackWritesCode wrote:
               | Thanks for your input here, Gordon. It doesn't sound
               | harsh at all, you clearly care about privacy regulations
               | and you're trying to help. Ultimately, we had moved based
               | on conversations with lawyers. But as I say, we are
               | rolling out changes this week & next, so it doesn't
               | matter what we think about the regulation :) And thanks
               | again for the challenge.
        
               | GordonS wrote:
               | Thanks for the debate, and I'll be looking out for what
               | you've got coming next!
        
         | Longwelwind wrote:
         | I've been using a similar tool: https://simpleanalytics.com/.
         | 
         | I wish they'd offer more plans between the first 2 cheapest,
         | though. My open-source project is hitting the basic plan limits
         | and the next offer is too expensive for me.
        
           | JackWritesCode wrote:
           | SA is good, and Adrian prices their services responsibly.
           | Fathom charges $24 / month as our 2nd tier and I do believe
           | Adrian should offer a middle tier too. But I know nothing
           | about his business behind the scenes, so I can't comment.
           | Ultimately, you can be confident that he prices his service
           | to be sustainable, which we really respect.
        
         | remux wrote:
         | Some weeks ago I discovered fathom and I am fully satisfied
         | with it.
        
           | JackWritesCode wrote:
           | So glad you love it :)
        
         | JackWritesCode wrote:
         | Thanks James. You wait till we launch V3 ;)
        
           | ksec wrote:
           | What's new for V3?
        
             | JackWritesCode wrote:
             | We're not announcing anything just yet but it'll be our
             | best release to date
        
               | clairity wrote:
               | > "We're not announcing anything just yet but it'll be
               | our best release to date"
               | 
               | not a knock on you or fathom, but it seems like you're in
               | fast-response sales mode here (which is totally fine)...
               | the above is a particularly empty statement. why would
               | any next release not be the best to date?
               | 
               | maybe say it should be an exciting release, which is
               | similarly anticipatory without being meaningless sales-
               | speak.
        
               | JackWritesCode wrote:
               | Good point, heh. We're going to be improving speed of
               | aggregation, real time dashboard, page / ref level
               | metrics, more advanced goals and various other pieces.
        
               | clairity wrote:
               | thanks! in a background thread, i'm on the lookout for a
               | privacy-focused analytics offering. it's for small
               | personal things for now, so leaning toward simple and
               | free, but who knows what the future holds.
        
           | zabana wrote:
           | I have zero use for your product but I just want to say that
           | I love your website ! From the minimal design to the clear
           | and concise copy, to the signup process. It's all
           | frictionless and smooth, you nailed it :)
        
             | JackWritesCode wrote:
             | That means a lot, thank you. We've been running Fathom
             | since 2018, and we've put a lot of thought into the user
             | experience :)
        
       | mhw wrote:
       | If your app is already built with Rails, adding
       | https://github.com/ankane/ahoy is pretty simple. Combine with
       | https://github.com/ankane/blazer and you can build a reasonable
       | set of reports as well.
       | 
       | It's pretty simple to extend too: I've added basic client-side
       | (JavaScript) error reporting on top of it, and I'm thinking about
       | using it for Content Security Policy reporting too.
        
         | etewiah wrote:
         | Yeah, ahoy is pretty awesome! In fact everything by ankane is
         | inkanely great - I have no idea how he manages to be so
         | productive....
        
       | mgreenleaf wrote:
       | Another shameless plug, if you are just using it for finding out
       | where visitors are coming from and page hits, I wrote
       | https://geo-yak.com for that. Doubles as an ip geolocation API.
       | 
       | I'm putting the finishing touches on a `tag=XXX` parameter that
       | allows you to record a tag (like a pageid), and then filter the
       | maps by it (not publicly documented yet, but will be in the next
       | couple weeks).
        
       | franky47 wrote:
       | I'm working on an end-to-end encrypted analytics SaaS, to try and
       | solve the "putting your eggs in someone else's basket" problem,
       | while offering something simpler than self-hosting.
       | 
       | I'm collecting feedback and looking to open it for beta in the
       | next couple of weeks, but there's already a preview signup link
       | in the newsletter, where I share my progress on building the
       | platform on a weekly basis.
       | 
       | https://chiffre.io
        
         | m90 wrote:
         | Out of interest, if you say you honor DNT, how exactly do you
         | handle browsers that do not allow setting this as a user
         | preference anymore (Firefox, Safari)?
        
           | franky47 wrote:
           | Firefox's settings (as of v77) still allow you to set (or
           | unset) DNT, and I honor that. For Safari unfortunately, Apple
           | chose to remove this feature because it could be used for
           | fingerprinting, so there is not much to be done here.
        
             | m90 wrote:
             | Interesting, did Firefox revert their decision towards DNT?
             | I remember being confronted with the behavior of it sending
             | DNT headers no matter what, which must have been something
             | around spring 2019.
        
               | franky47 wrote:
               | Possibly, I'm not using the DNT header though, because of
               | the encryption, I need to know about DNT before the
               | analytics data is even sent. I use navigator.doNotTrack,
               | and where it's set I only encrypt and send a minimal
               | visit count event.
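franky47's check runs in the browser via navigator.doNotTrack. For a collector that does see the request, a comparable decision can be made from the DNT request header that m90 mentions. The sketch below shows that server-side variant, not franky47's code; the function names and event shapes are hypothetical.

```python
from typing import Mapping, Optional


def dnt_enabled(headers: Mapping[str, str]) -> bool:
    """True when the request carries 'DNT: 1' (header names are case-insensitive)."""
    for name, value in headers.items():
        if name.lower() == "dnt":
            return value.strip() == "1"
    return False


def build_event(headers: Mapping[str, str], path: str, referrer: Optional[str]) -> dict:
    """Return a minimal visit-count event under DNT, a fuller page view otherwise."""
    if dnt_enabled(headers):
        # Only count the visit; record nothing about the page or its origin.
        return {"type": "visit"}
    return {"type": "pageview", "path": path, "referrer": referrer}
```

A missing header is treated as "no preference expressed", which is one possible policy; a stricter collector could treat absence the same as opt-out.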
        
               | m90 wrote:
               | I tried to do the exact same thing and had to stop
               | considering DNT as it would essentially collect Chrome
               | only back then. Good thing they have reverted this I
               | guess. DNT makes a lot of sense as a concept still.
        
         | JackWritesCode wrote:
         | We've been building this as an option too. It's a cool feature
         | that a lot of people will like. Good luck
        
       | TomGullen wrote:
       | I still can't see any solid reasons why a site owner would not
       | use GA.
       | 
       | Other products:
       | 
       | - Objectively lack features
       | 
       | - Potentially incur extra costs in money/time
       | 
       | - May be a small barrier in m&a
       | 
       | - May carry additional risks/attack vectors if self hosted
       | 
       | Trying to wean off big tech is commendable, but likely
       | detrimental to a business.
       | 
       | Relatively high risk, low reward.
       | 
       | I'm happy to have my mind changed. I can see a case for user
       | hostility, but most sites I imagine don't have an audience
       | sensitive to this at the moment anyway.
       | 
       | From an ideological standpoint, other cloud stat tracking
       | services would only function if not many people used them. And I
       | would also imagine feature creep would be inevitable and lead
       | to them becoming an inferior version of GA.
        
         | elondaits wrote:
         | GDPR compliance.
         | 
         | GDPR is the European privacy law. It protects European citizens
         | so it applies not only to European companies but any company
         | that does business in Europe (having offices or
         | advertising/selling there).
         | 
         | Google does not give much assurance regarding their GDPR
         | compliance... their text on that subject is mostly CYA and then
         | they make it your responsibility to decide how to use it in
         | compliance (if at all possible).
         | 
         | The GDPR gives you a small window to count visitors through
         | cookies as long as all private information (even IP) is
         | anonymized... OR you can go do a more traditional tracking with
         | their explicit agreement. This last use case is completely
         | useless in terms of visitor statistics, but analytics companies
         | sometimes dare suggest it (as in "this is the way to do things
         | right... so our product is compliant and it's not our
         | responsibility if you break the law").
         | 
         | That aside, I run international non-profit sites and GA is a
         | bad look... and with good reason: Using social network sharing
         | buttons, GA, CDNs, etc. gives too much power to track people to
         | a few companies.
        
           | tannhaeuser wrote:
           | > _GA is a bad look_
           | 
           | Totally agree, but are there "acceptable" CDNs, like unpkg?
           | What about Google Fonts?
           | 
           | > _The GDPR gives you a small window to count visitors
           | through cookies as long as all private information (even IP)
           | is anonymized._
           | 
           | If it's not too much to ask, could you expand on that a bit,
           | or share a link? I guess you mean it's ok to send a browser
           | fingerprint for unique visitor stats without having to ask
           | for permission, but I'm not aware of any legal debate let
           | alone court decision with respect to that.
           | 
           | Edit: obviously I can't read ("through cookies"), but cookies
           | for unique visitor counts aren't "functional" are they, so my
           | interpretation is that those cookies need consent; I'd love
           | to hear otherwise though
        
           | M2Ys4U wrote:
           | >GDPR is the European privacy law. It protects European
           | citizens
           | 
           | The GDPR does not discriminate based on citizenship.
           | 
           | It applies if the organisation providing the service is in
           | the EU/EEA* OR if the user of the service is in the EU/EEA*
           | (to the extent that the data reference their activity in the
           | EU/EEA*).
           | 
           | * And the UK, but thanks to Brexit there's a parallel UK
           | GDPR in place so... take that into account.
        
         | XCSme wrote:
         | Some issues with GA versus going self-hosted:
         | 
         | - Privacy of your users: for a specific user, Google knows all
         | the websites he visits.
         | 
         | - Privacy of your data: if Google knows the visitors of most
         | websites, your competitors can leverage that advantage (using
         | Google Ads for example) to steal your potential customers.
         | 
         | - Google Analytics is bloated and slow (both in terms of the
         | tracking script and the dashboard UI, where it takes several
         | seconds for each graph/page to load).
         | 
         | - You don't own your data: at any point Google can, even though
         | it is unlikely to, block your account (for breaking the ToS of
         | some other service of theirs) and you lose all your data.
         | 
         | - If everyone uses GA, it will become (already is) an analytics
         | monopoly, which has many other drawbacks (lack of innovation
         | for example).
         | 
         | I do think that for the average user, using GA might be fine
         | because it's free, easy to set-up and does its job. That is
         | unless they care about all the possible consequences.
        
         | Doctor_Fegg wrote:
         | If your site actively competes with a Google product, you might
         | not want to give them access to your user data.
        
       | epoch_100 wrote:
       | It's great to see more alternatives to GA, and to see those
       | alternatives getting attention.
       | 
       | For those interested, one other FOSS analytics tool is Shynet
       | [0]. Modern, privacy-friendly, and detailed web analytics that
       | works without cookies or JS. It also looks pretty slick.
       | Disclosure: I'm a maintainer.
       | 
       | [0] https://github.com/milesmcc/shynet
        
       | dsalzman wrote:
       | I switched to GoatCounter for my personal blog and it's more than
       | capable. All I want is pageviews with timestamps per page and
       | referrer info.
        
       | FalconSensei wrote:
       | Didn't know about GoatCounter. I think it's the only free hosted
       | alternative that I saw.
       | 
       | Having a small static blog hosted on GithubPages, GA was the only
       | option for me. (Not going to pay for analytics while my blog has
       | like, 10 visits a week)
        
         | abelaer wrote:
         | I just installed goatcounter on my githubPages Jekyll page. 2
         | minutes of work, works great.
        
       | cpuguy83 wrote:
       | Maybe stop spying on people?
       | 
       | EU be like you have to put up this banner to tell them you are
       | spying... it's super annoying and everyone hates it... and you be
       | like "sure, I love me some spying".
        
         | dynamite-ready wrote:
         | How do you define 'spying'? It's near impossible to invent or
         | discover almost anything, without something tangible to
         | observe.
         | 
         | I suppose the purpose of why certain collections of data are
         | put together is where all the anger rightfully comes from, but
         | calling for a complete armistice on all (even innocuous) forms
         | of data collection is a little churlish.
         | 
         | You do it in microcosm too, you know. How many pictures do you
         | have on your phone, that contain people you don't know?
        
       | cfitz wrote:
       | If you're using Ruby on Rails, perhaps consider the "Ahoy Matey"
       | AKA "Ahoy" free and open source gem created by Instacart [1].
       | 
       | I've used it in my personal projects and have never had any
       | issues. It's great to have no vendor lock-in and full ownership
       | of user metric data. If you go this route, please be responsible
       | with the data and follow all relevant regulations & guidelines
       | (ex: GDPR) regarding its storage and usage.
       | 
       | [1]: https://github.com/ankane/ahoy
        
       | srg0 wrote:
       | TIL about European Union Public License:
       | 
       | https://joinup.ec.europa.eu/collection/eupl/introduction-eup...
       | 
       | OSI-certified, copyleft, non-viral, GPL-compatible, SaaS-aware,
       | multilingual
        
         | Hitton wrote:
         | I'm kinda confused about that, because
         | https://www.gnu.org/licenses/license-list.html#EUPL-1.2 seems
         | to say that it's possible to relicense the source to GPL which
         | would go directly against Goatcounter's author who apparently
         | wanted AGPL-ish license without ideological fluff.
        
           | Carpetsmoker wrote:
           | GoatCounter author here: yeah, that's not perfect; this also
           | came up in the HN discussion for the article a while ago[1].
           | I've been in touch with one of the authors of the EUPL since
           | and the short of it is that they don't really think it's an
           | issue.
           | 
           | I've thought about this for quite some time, and decided I'll
           | use a slightly modified version of the EUPL which removes GPL
           | from the compatible license appendix. Just haven't gotten
           | around to that for no reason in particular.
           | 
           | [1]: https://news.ycombinator.com/item?id=21914245
        
         | swyx wrote:
         | how can it be both copyleft and non viral? isn't virality a
         | definitive feature of copyleft?
        
           | icebraining wrote:
           | No, see LGPL and MPL for example.
        
           | contravariant wrote:
           | > it has no "viral effect" in case of linking
        
             | sc11 wrote:
             | Neither does the GPL or any other licence under European
             | laws.
        
               | contravariant wrote:
               | That's interesting, wouldn't that mean that the licenses
               | themselves aren't in some sense 'legal' in the EU? How
               | does the GPL prevent this from invalidating the whole
               | license?
        
               | icebraining wrote:
               | That's surprising, considering the virality is part of
               | the license text, not the law; does the law prohibit that
               | clause?
        
               | sc11 wrote:
               | Here's a good explanation:
               | https://joinup.ec.europa.eu/collection/eupl/news/why-
               | viral-l...
               | 
               | The short answer is that there are certain protections
               | that ensure interoperability, and that linking to
               | software does not make it a derivative work.
        
               | twic wrote:
               | That's an interesting analysis.
               | 
               | That directive is usually understood to be about reverse
               | engineering in order to build compatible software: "to
               | obtain the necessary information to achieve the
               | interoperability of an independently created program with
               | other programs" being a key bit.
               | 
               | It's not immediately clear to me - a programmer but not a
               | lawyer - that this has any bearing on whether linking
               | creates a derivative work.
               | 
               | Have any other experts, or courts, weighed in on whether
               | this analysis is sound?
        
       | XCSme wrote:
       | I can also add mine, even though more complex, it's still
       | lightweight: https://usertrack.net
       | 
       | I tried bringing together the most useful analytics features
       | (user segments, heatmaps, session recordings, tags/events) in a
       | self-hosted platform with simple UI. A/B testing feature is also
       | coming soon. I built the platform with the optimal use-case being
       | improving conversion rates on landing pages.
       | 
       | My goal now is to prove and teach (even to non-technical users)
       | that self-hosting is easy nowadays when you can create a VPS
       | running your desired software in just a few clicks.
       | 
       | I would love to hear some criticism or why you wouldn't want to
       | try something like this.
        
       | xrd wrote:
       | Has anyone used any of these with a proxy to avoid ad blocker
       | blocking? What I mean is, I installed matomo and then saw my ad
       | blocker blocked it. Is there a way to make any of these work by
       | proxying through the same domain as the site, so those analytics
       | requests look just like all other ajax requests?
       | 
       | I was surprised matomo wasn't listed here. Does anyone know if
       | that was intentional? Seems like it fits the criteria of the post
       | and the goals of open source.
        
         | JackWritesCode wrote:
         | Absolutely. We recommend custom domains
         | (https://usefathom.com/support/custom-domains) but you could
         | proxy through to the collector, no problem. It's a great way of
         | doing it
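The same-domain proxying xrd asks about can be sketched with the Python standard library alone. This is a rough illustration under stated assumptions, not a recommendation from Fathom or Matomo: the collector URL, the `/stats/` prefix, and the names `upstream_url` and `ProxyHandler` are all hypothetical, and a production setup would more likely use a reverse proxy in the web server itself.

```python
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler
from typing import Optional

COLLECTOR = "https://collector.example.com"  # hypothetical analytics collector
PREFIX = "/stats/"                           # hypothetical first-party path


def upstream_url(path: str) -> Optional[str]:
    """Map a first-party request path to the collector URL, or None if not ours."""
    if not path.startswith(PREFIX):
        return None
    # Everything after the prefix, including any query string, is passed through.
    return COLLECTOR + "/" + path[len(PREFIX):]


class ProxyHandler(BaseHTTPRequestHandler):
    """Forward GET requests under PREFIX to the collector and relay the reply."""

    def do_GET(self) -> None:
        target = upstream_url(self.path)
        if target is None:
            self.send_error(404)
            return
        request = urllib.request.Request(
            target, headers={"User-Agent": self.headers.get("User-Agent", "")}
        )
        try:
            with urllib.request.urlopen(request, timeout=5) as upstream:
                status = upstream.status
                content_type = upstream.headers.get(
                    "Content-Type", "application/octet-stream"
                )
                body = upstream.read()
        except urllib.error.URLError:
            # Covers connection failures and HTTP error statuses from upstream.
            self.send_error(502)
            return
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

The handler can be served with `HTTPServer(("127.0.0.1", 8080), ProxyHandler).serve_forever()` from `http.server`. Because the collector then sees the proxy's address rather than the visitor's, whether and how to forward the client IP is a separate decision that this sketch leaves out.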
        
       | nofunsir wrote:
       | Here's the most lightweight alternative to Google Analytics:
       | 
       | Don't use analytics. You really don't need it. No. You really
       | don't. No, No. I promise you. Just stop.
       | 
       | All tracking is evil. All ads (except those inside a store for a
       | product inside the same store) are evil.
        
       | pachico wrote:
       | We run our own analytics solution based on a js library, a small
       | go app and ClickHouse for data aggregation. With a very cheap and
       | small setup you can handle hundreds of millions of events per
       | day.
        
       | sjwright wrote:
       | I run a reasonably large website and about two years ago it
       | dawned on me that I _never checked Google Analytics._ It was
       | completely useless. It wasn't telling me anything useful. I also
       | knew that it was marginally user hostile (or at least perceived
       | as such) and affecting page performance, even if only slightly.
       | 
       | Removing it felt momentous and insane. But in November 2018 I
       | finally plucked up the courage and removed it. The crazy thing
       | is, until this article appeared on the top of Hacker News
       | reminded me, I had completely forgotten that I had removed it.
       | Far from the world ending, it turned out to be the most
       | inconsequential thing imaginable.
       | 
       | (I remember poring over web server logs in Analog and AWStats
       | 15+ years ago. Now I honestly can't remember why. I think it was
       | some combination of vanity... and because everyone else was doing
       | it. I suspect for most web developers GA was just the natural
       | evolution of that muscle memory.)
        
         | JackWritesCode wrote:
         | GA and AWStats are both awful products for a lot of people. For
         | us, we check our Fathom dashboard daily to see referrers and
         | popular content. And virality (right now we can see a ton of
         | traffic coming from HN). When I used GA, I never checked it.
        
           | sjwright wrote:
           | I've looked at many reporting tools, most of them are
           | probably great for corporate/enterprise stuff.
           | 
           | I'm self-employed, so I have no boss or shareholders that
           | need pretty reports with bar charts. In my case my site is
           | deeply database driven and I can build engagement statistics
           | directly from real data using complex SQL queries.
           | 
           | And while there's only a few such 'reports' that I check
           | regularly, most of them are temporally incongruous--I think
           | that's how you'd describe it--in that they look at what
           | happened in the past contextualised by what's known in the
           | present. (E.g. tracking engagements from new/irregular users,
           | while they were new/irregular users, but which subsequently
           | became regular users.)
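One way to read sjwright's "temporally incongruous" report is as a query that selects past events using a fact only known later. The sketch below uses SQLite with an invented two-table schema (`users.became_regular_at`, `events.occurred_at`); the schema, the sample rows, and the function names are assumptions for illustration, not sjwright's actual database.

```python
import sqlite3


def early_engagements(conn: sqlite3.Connection) -> list:
    """Count events made by now-regular users before they became regular."""
    query = """
        SELECT e.user_id, COUNT(*) AS early_events
        FROM events AS e
        JOIN users AS u ON u.id = e.user_id
        WHERE u.became_regular_at IS NOT NULL        -- known only in the present
          AND e.occurred_at < u.became_regular_at    -- but filters the past
        GROUP BY e.user_id
        ORDER BY e.user_id
    """
    return conn.execute(query).fetchall()


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, became_regular_at TEXT);
        CREATE TABLE events (user_id INTEGER, occurred_at TEXT);
        INSERT INTO users VALUES (1, '2020-03-01'), (2, NULL);
        INSERT INTO events VALUES
            (1, '2020-01-10'), (1, '2020-02-20'), (1, '2020-04-05'),
            (2, '2020-01-15');
    """)
    return conn
```

Dates are stored as ISO 8601 strings, so plain string comparison orders them correctly. A page-view counter that never sees the user table cannot produce this kind of report, which seems to be the point being made.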
        
             | JackWritesCode wrote:
             | Well that's a different story then. It sounds like you
             | measure things your own way, so I agree that analytics are
             | pointless in your edge case.
             | 
             | For us, we have generated a lot of revenue by measuring
             | what works and what doesn't. That's why analytics are worth
             | it for a lot of people.
        
           | zoomablemind wrote:
           | > GA and AWStats are both awful products for a lot of people.
           | 
           | Just wonder, what's awful about AWStats?
           | 
           | Sure it's dated, and "analog" in a way that it's log-based,
           | not JS. But it does not send the tracking to the third party,
           | can be used offline.
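The log-based approach zoomablemind describes needs no script on the page at all. As a minimal sketch, assuming the server writes the common "combined" access log format, page views and referrers can be tallied like this; it is not AWStats' own parser, and the names `LINE` and `summarize` are made up for the example.

```python
import re
from collections import Counter
from typing import Iterable, Tuple

# Combined log format: host ident user [time] "request" status bytes "referrer" "agent"
LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)" "[^"]*"$'
)


def summarize(lines: Iterable[str]) -> Tuple[Counter, Counter]:
    """Count successful GET page views per path, and external referrers."""
    pages: Counter = Counter()
    referrers: Counter = Counter()
    for line in lines:
        match = LINE.match(line.strip())
        # Skip unparseable lines, non-GET requests, and non-200 responses.
        if match is None or match["method"] != "GET" or match["status"] != "200":
            continue
        pages[match["path"]] += 1
        if match["referrer"] not in ("", "-"):
            referrers[match["referrer"]] += 1
    return pages, referrers
```

The pattern is deliberately simple: lines whose user agent or referrer contains escaped quotes will not match and are skipped, and bots are counted like any other client.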
        
           | geerlingguy wrote:
           | Fathom being so quick to load and simple to use, I glance at
           | it here and there. I had given up on trying to find a good
           | 'light' way to navigate Analytics.
        
       | gorkemcetin wrote:
       | Countly [1] is another open source alternative to Google
       | Analytics - suggest you try it on Digital Ocean [2] or deploy on
       | your own [3].
       | 
       | It is self hosted, has support for desktop apps, mobile apps and
       | web apps at the same time.
       | 
       | [1] https://count.ly
       | 
       | [2] https://marketplace.digitalocean.com/apps/countly-analytics
       | 
       | [3] https://github.com/countly/countly-server
        
         | mrpeker wrote:
         | Some of my friends switched from GA to Countly. They are very
         | satisfied, and I am thinking of using it in my next project.
        
         | scoutt wrote:
         | Interesting. Thanks. But I am not so sure about disabling
         | SELinux:
         | 
         | > Disable SELinux on Red Hat or CentOS if it has been enabled.
         | Countly may not work on a server where SELinux is enabled. In
         | order to disable SELinux, run "setenforce 0".
         | 
         | https://support.count.ly/hc/en-us/articles/360036862332-Inst...
        
           | snuxoll wrote:
           | Unfortunately common on projects like these. Instead of
           | guiding admins on how to properly configure SELinux it's
           | easiest to just throw your hands up and say "disable it".
        
       | ksec wrote:
       | It's nice to see Plausible gaining traction. Here is a blog post
       | [1] about how they were asked about using it on sites with tens
       | or hundreds of millions of page views.
       | 
       | I am wondering if HN is interested in hosting analytics like
       | Plausible that are open for us to see. Sometimes I do wonder how
       | many page views HN gets per day, where we are all from, etc. For
       | example, on the Plausible demo site, 35% are using macOS, but
       | only 15% use Safari.
       | 
       | [1] https://plausible.io/blog/april-2020-recap
        
       | Kjeldahl wrote:
       | I was recently looking for a good tool that supports both web
       | site analytics and app analytics (custom events, typically pushed
       | by SPAs). I looked at GA, Amplitude and finally Matomo (which I
       | ended up with). GA and Amplitude either did not offer or made it
       | hard to work down to the micro level, essentially tracking known
       | individual users down to the singular event level. Matomo makes
       | this easy, although it certainly looks a bit dated compared to
       | the competition. And the free parts are somewhat limited (you
       | need to buy stuff or hosting).
       | 
       | I would have thought that there would be several decent packages
       | offering www + app analytics by now, but as I wrote, options were
       | quite limited. Some of the options mentioned in the subject here
       | look like good options for just website analytics, but I'm not
       | seeing much as far as "app analytics" (custom events) goes.
        
         | srrr wrote:
         | There are many packages listed at
         | https://github.com/onurakpolat/awesome-analytics . Heap is an
         | example of macro+micro+web+app.
        
           | Kjeldahl wrote:
           | Thanks for the tip. One of Heap's selling points seems to be
           | that tracking events "manually" is over, everything is
           | automatic. That might work if all "work" is defined as "stuff
           | users do". For other types of "work" (calculation pipelines,
           | job delegation etc) I'm sure being able to "micro manage"
           | events can be useful. But sure, my use case might be
           | different.
        
         | JackWritesCode wrote:
         | Fewer companies are focusing on user level tracking, as it's an
         | invasion of privacy and compliance doesn't allow it
        
           | Kjeldahl wrote:
           | And I'm sure that makes sense if you have lots of users and
           | low revenue per customer. If your use case is the opposite,
           | tracking individual usage becomes more important. At least
           | until you have lots of those users. After that, who cares! :P
        
           | srrr wrote:
           | Fewer companies are focusing on user level tracking because
           | one single user is not a meaningful statistical group.
           | 
           | Companies focusing on user level tracking today provide a
           | different set of tools than one might be used to, and that
           | can of course be compliant, see https://www.hotjar.com/.
        
       | Fileformat wrote:
       | I have been using GA for my side projects but have been unhappy
       | with Google's direction on privacy, so I started researching
       | others. There are just _so_ many: I think a lot of developers
       | (including myself) think it is easy to do & start rolling their
       | own & then try to productize it.
       | 
       | Here is my research:
       | https://til.marcuse.info/webmaster/alt-analytics.html
       | 
       | I ended up going with GoatCounter.
        
       | Bogdanp wrote:
       | Shameless plug: I wrote and use nemea[0] for all my stuff.
       | 
       | [0]: https://github.com/Bogdanp/nemea
        
       | bad_user wrote:
       | I have my own self-hosted Matomo instance [1].
       | 
       | Via Docker & docker-compose it's quite easy to install and keep
       | up to date and Matomo is open source, well maintained, very well
       | behaved and pretty hands off.
       | 
       | And I configured it on my websites with cookies turned off [2]
       | and with IP anonymization [3]. In such an instance you don't need
       | consent, or even a cookie banner, because you're not dropping
       | cookies, or collecting personal info. Profiling visitors is no
       | longer possible, but you still get valuable data on visits.
       | 
       | Note that if you want to self-host Matomo, you don't need more
       | than a VPS with 1 GB of RAM (even less but let's assume
       | significant traffic) so it's cheap to self host too.
       | 
       | And I disagree with another commenter here saying Analytics is
       | just for vanity. That's not true -- even for a personal blog
       | analytics are useful to see which articles are still being
       | visited and thus need to be kept up to date, or in case content
       | is deprecated, the least you could do is to put up a warning.
       | 
       | And if you write that blog with a purpose (e.g. promoting
       | yourself or your projects) then you need to get a sense of how
       | well your articles are received. You can't do marketing without a
       | feedback loop.
       | 
       | [1] https://matomo.org/
       | 
       | [2] https://matomo.org/faq/general/faq_157/
       | 
       | [3] https://matomo.org/docs/privacy/
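
As a rough illustration of what IP anonymization means in practice, here is a minimal Python sketch that zeroes the low-order bits of an address before it is stored. This is not Matomo's code: the function name `anonymize_ip` and the defaults of masking 16 bits for IPv4 and 80 bits for IPv6 are assumptions made for this example.

```python
import ipaddress


def anonymize_ip(address: str, v4_masked_bits: int = 16, v6_masked_bits: int = 80) -> str:
    """Return the address with its low-order bits set to zero.

    The default bit counts are illustrative assumptions, not a claim
    about what any particular analytics product does.
    """
    ip = ipaddress.ip_address(address)
    masked_bits = v4_masked_bits if ip.version == 4 else v6_masked_bits
    prefix_len = ip.max_prefixlen - masked_bits
    # strict=False lets us pass a host address and get its enclosing network.
    network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    return str(network.network_address)
```

With the defaults, only the first two bytes of an IPv4 address survive, so many visitors collapse onto the same stored value and a single stored address no longer points at one connection.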
        
         | boromi wrote:
         | How do you actually host this? Do I need a VPS or is it more
         | like a Heroku thing?
        
           | johnchristopher wrote:
           | You can easily set it up on a VPS with docker, check github
           | for instructions.
        
             | boromi wrote:
             | Right, I found the official docker image
             | https://github.com/matomo-org/docker
             | 
             | But I was hoping for a simple DIY guide for setting
             | everything up? I mean with Google Analytics a dummy can set
             | it up very quickly. I know it'll take more work with
             | matomo, but I need a little more detail than just "use
             | docker".
             | 
             | I know I need a VPS, something like Digital Ocean.
             | 
             | I know I need the matomo docker image.
        
               | johnchristopher wrote:
               | Sorry for misleading you. You don't need docker to run
               | matomo (docker makes it convenient to install matomo -
               | especially if you are running other containers on your
               | server - but there are shortcomings, mainly the container
               | setup).
               | 
               | Matomo is just a set of php files. Upload them to any php
               | hosting with a mysql database, in a matomo or
               | matomoanalytics folder and point your browser to your
               | domain name/matomo/ and the install setup should begin.
               | 
               | If you have 0 experience with docker and just need
               | matomo then forget about docker and start from the php
               | file with a standard php/mysql host.
        
               | XCSme wrote:
               | I am not familiar with matomo, but aren't those
               | instructions enough? You go to DO, create new droplet
               | from that image, and done? I assume once you have it
               | installed you will get more info on how to add the
               | tracker on your site from their interface?
        
               | snuxoll wrote:
               | Matomo needs a database server (MySQL) and a way to
               | execute cronjobs - that's about it. There's some gotchas
               | if you're running in a HA setup or if the database runs
               | on a separate server.
               | 
               | I have a full deployment of Matomo in Kubernetes on
               | gitlab [1] if anyone would find it useful (includes
               | correct settings for running in multiple pods).
               | 
               | 1: https://gitlab.com/pcgamingwiki/webanalytics/-/tree/master/
        
               | xorcist wrote:
               | I haven't used the docker image, but Matomo is PHP in its
               | simplest form.
               | 
               | You unzip it. Click through a setup wizard. Done.
               | 
               | (I also run in readonly noexec in an fpm chroot but
               | that's not necessary.) I set it up for a couple of
               | clients since the Piwik days and it's been pretty much
               | set and forget, apart from the occasional upgrades.
               | 
               | There is plenty of advanced functionality which probably
               | few people understand and use. For my personal projects I
               | am fine with log analytics which I mostly use goaccess
               | for.
        
         | Carpetsmoker wrote:
         | > And I disagree with another commenter here saying Analytics
         | is just for vanity. That's not true -- even for a personal blog
         | analytics are useful to see which articles are still being
         | visited and thus need to be kept up to date, or in case content
         | is deprecated, the least you could do is to put up a warning.
         | 
         | Some examples: I maintained a Vim ChangeLog for a while (which
         | is quite some work), and turned out no one was reading that, so
         | ... why bother?
         | 
         | In another case, I wrote an article about "how to detect
         | automatically generated emails" and I thought it wasn't
         | actually that interesting and no one read it so considered
         | archiving it, but turned out quite a few people end up there
         | through Google searches etc. and I ended up updating it instead
         | of archiving it, as it was clearly useful to people.
        
         | thomasahle wrote:
         | > And I configured it on my websites with cookies turned off
         | [2] and with IP anonymization [3].
         | 
         | Do you have a way to filter out your own visits in this case?
         | On small pages I find that my own clicks and events during
         | testing contaminate the statistics.
        
         | GordonS wrote:
         | > And I configured it on my websites with cookies turned off
         | [2] and with IP anonymization [3]. In such an instance you
         | don't need consent, or even a cookie banner, because you're not
         | dropping cookies, or collecting personal info. Profiling
         | visitors is no longer possible, but you still get valuable data
         | on visits.
         | 
         | Does this mean each page hit cannot be linked to any other? For
         | example, can I see that a visitor viewed a particular sequence
         | of pages?
        
           | isiahl wrote:
           | Just spitballing here, but could you use the Referrer header
           | to track sequences of pages?
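
A minimal Python sketch of the referrer idea, assuming each logged hit is reduced to a `(url, referrer)` pair in arrival order. The function name `chain_pageviews` and the input shape are made up for this example. Without cookies or IPs this is only a heuristic: when two visitors sit on the same page, the next hit is attached to whichever path reached that page first.

```python
from collections import defaultdict


def chain_pageviews(hits):
    """Group (url, referrer) hits into plausible page sequences.

    A hit extends the oldest open path that currently ends at its
    referrer; otherwise it starts a new path.
    """
    paths = []                       # every reconstructed path
    open_paths = defaultdict(list)   # last url -> paths currently ending there
    for url, referrer in hits:
        candidates = open_paths.get(referrer)
        if candidates:
            path = candidates.pop(0)  # oldest path ending at the referrer
        else:
            path = []
            paths.append(path)
        path.append(url)
        open_paths[url].append(path)
    return paths
```

For aggregate questions ("how many people go from the front page to pricing?") the ambiguity matters little, since you only count transitions; for per-visitor journeys it does.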
        
         | mritchie712 wrote:
         | Our SaaS runs on Google App Engine and sending the logs to
         | BigQuery only takes a couple clicks[1]. From there you can
         | write SQL to summarize the data by referrer, page viewed, etc.
         | Here[0] is a starting point, though you'll need to update the
         | `WHERE` clause so it works for your use case.
         | 
         | You get IP and user agent in those logs if you want to roughly
         | track visit to conversion metrics.
         | 
         | 0 - https://gist.github.com/mike-seekwell/83ac75c82a943e287a7abe...
         | 
         | 1 -
         | https://cloud.google.com/appengine/docs/standard/python/logs
        
         | chrismorgan wrote:
         | I self-hosted Matomo for a year and a half (and took over the
         | AUR package for it and improved it in the process). It was no
         | trouble to run, but I ended up uninstalling it late last year,
         | for a few reasons: its interface is painfully slow (and that's
         | nothing to do with my 1GB/1 vCPU VPS--I've interacted with a
         | decent-sized instance at innocraft.cloud and it was similar),
         | and I seldom looked at it, and I couldn't think of any way in
         | which anything I found in the analytics would change my
         | behaviour, and server-side analytics are good enough (better in
         | some ways, worse in others), and I value speed. So all up, I
         | figured: why am I slowing all my users down with this 50KB of
         | JavaScript (of which I frankly need less than 1KB), and why am
         | I keeping this software going?
         | 
         | So now I pull out GoAccess (which reads the server logs) from
         | time to time. I find that my Atom feed is the vast majority of
         | _traffic_ to my site, which Matomo couldn't tell me. I should
         | implement pagination on the feed and see if that helps. (Or
         | limit the number of items in the feed, but conceptually I
         | rather like everything being accessible from the feed. Wonder
         | how many feed readers support pagination?)
        
           | bad_user wrote:
           | My websites are behind Cloudflare. If not Cloudflare then I'd
           | use another CDN. Therefore I don't have logs.
           | 
           | Also I disagree about the slowness.
           | 
           | The script is loaded asynchronously, it does not block the page
           | and I measure my loading times, which are really good
           | actually. Just did a measurement and my front-page loads in
           | 271 ms and this includes all network requests, including
           | Matomo.
           | 
           | I don't think this is a real concern, but rather a premature
           | optimization. If GoAccess works for you, great, but that's
           | not something I can use due to CDN.
        
             | markdown wrote:
             | > Just did a measurement and my front-page loads in 271 ms
             | and this includes all network requests
             | 
             | Is your audience just the people in your locality, or the
             | entire world?
        
             | chrismorgan wrote:
             | The painfully slow interface I'm speaking of is Matomo's
             | app, the part you as the site administrator look at; not
             | piwik.js or whatever they call it now.
             | 
             | But since you've raised the script part, 50KB of JS loaded
             | from a new host is perhaps a surprising amount of work,
             | especially on slower devices. I find the difference between
             | running no JavaScript at all and running Matomo's client
             | script, _even asynchronously_, to be _easily_ visible.
        
           | forgotmypw17 wrote:
           | Is pagination supported in feeds?
           | 
           | It would be very useful. Like you said, it's nice when
           | everything is accessible from the feed.
        
             | chrismorgan wrote:
             | <link rel="next" href="..."/>
             | 
             | Deliberate semantics were defined for this as part of
             | AtomPub, https://tools.ietf.org/html/rfc5023#section-10.1
             | (before that, it made sense that it would mean this because
             | of the relations registry, but nothing had been defined).
             | It's clearly applicable to Atom syndication in general, but
             | it's definitely more useful to AtomPub. I have no idea how
             | wide client support is.
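
To make the `rel="next"` idea concrete, here is a small Python sketch of a client walking a paginated Atom feed. The `fetch` argument is a hypothetical callable supplied by the caller that returns the XML text for a URL; the function name is also made up for this example, and relative `href` values are not resolved here.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"


def iter_entry_titles(first_url, fetch):
    """Yield entry titles page by page, following <link rel="next">."""
    url, seen = first_url, set()
    while url and url not in seen:   # the seen set guards against link loops
        seen.add(url)
        feed = ET.fromstring(fetch(url))
        for entry in feed.iter(f"{ATOM_NS}entry"):
            yield entry.findtext(f"{ATOM_NS}title", default="")
        # Only feed-level links are considered, not links inside entries.
        url = next(
            (link.get("href") for link in feed.findall(f"{ATOM_NS}link")
             if link.get("rel") == "next"),
            None,
        )
```

A reader that ignores `rel="next"` simply sees the first page, so paginating should degrade gracefully for clients that don't support it.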
        
           | the_gipsy wrote:
           | Exact same reasoning here. I was burdening my users with
           | slower load times, for something that didn't ever impact my
           | "product" decisions. For any meaningful analysis, I always
           | pulled up server side logs of things.
           | 
           | So I changed to GoAccess too. I don't check it too often,
           | just when I want to see the impact of some spam/publicity
           | posting around.
        
       | djsumdog wrote:
       | I briefly tried Matomo, but didn't want the Javascript component
       | and really just wanted log analysis. It's okay at log analysis,
       | but it doesn't really shine unless you do javascript live
       | tracking.
       | 
       | So I disabled it and went back to awstats. I've been using
       | awstats for over a decade, and for my personal site and projects,
       | it pretty much gives me the majority of the data I really care
       | about.
       | 
       | I might look at shipping more complex nginx json logs to
       | logstash/elastic search, but then I'd need to visualize them in
       | Kibana and that just seems like a lot of heavyweight containers
       | to run for stats I don't really need.
        
       | darekkay wrote:
       | Related submissions:
       | 
       | - https://news.ycombinator.com/item?id=19883876
       | 
       | - https://news.ycombinator.com/item?id=21890027
       | 
       | - https://news.ycombinator.com/item?id=22813168
       | 
       | - https://news.ycombinator.com/item?id=23411047
        
       | runxel wrote:
       | Still no real alternative when on a Github page, or am I missing
       | something?
       | 
       | It's not that I really _need_ statistics, but sometimes it would
       | be nice to know if there is even _anything_ going on or you're
       | just screaming into a void.
       | 
       | But I refuse to spam visitors of my pages with GA.
        
         | zoomablemind wrote:
         | The GH pages are still in a repo, so the repo's Insights:
         | Traffic can show some stats on visits.
         | 
         | https://github.blog/2014-01-07-introducing-github-traffic-an...
        
       | jwr wrote:
       | I turned off Google Analytics, because I realized that it doesn't
       | actually report any useful or actionable data, just vanity
       | metrics, and many of them of dubious quality.
       | 
       | I run a SaaS and what matters for me is paid subscriptions.
       | "Visits" (even if by humans, which is hard to tell) really do not
       | matter much. Yes, I do want to increase conversion rates, and run
       | bandit experiments, but I'm better off doing that myself.
       | 
       | What also matters are search terms, but Google's search console
       | (or tools, or whatever it's called this week) provides that.
       | 
       | Turning off Google Analytics was hard to do psychologically --
       | the Fear Of Missing Out is strong. But it turns out I'm not
       | missing out on anything, except some dubious vanity data. And I'm
       | making the web a better place in the process.
        
         | srrr wrote:
         | Analytics may give you the number of paid subscriptions but that
         | is not the reason it exists. Modern analytics systems are built
         | to give you a statistical view of all events/interactions
         | leading up to a conversion or, more importantly, to a conversion
         | that did not happen. With this information it is possible to
         | optimize all touchpoints a user has with your services.
         | 
         | It is easy to track all aspects of subscriptions including
         | recurring revenue. The data quality depends on the data you
         | send to google analytics and is not "dubious" but your own
         | responsibility. And google analytics is really good in
         | separating bots from human visits.
         | 
         | If google analytics did only report vanity metrics to you, you
         | most likely did not use it the right way. Maybe you missed the
         | segmentation tools to find groups of users for whom your
         | service did not work out?
        
           | jwr wrote:
           | > google analytics is really good in separating bots from
           | human visits
           | 
           | That certainly wasn't my experience.
           | 
           | But more generally: I optimize touchpoints using automated
           | Bernoulli bandits with multiple variants. And I track the
           | metrics that really matter (like signups, MRR, churn, etc)
           | very, very carefully. My point was that just adding GA
           | doesn't bring much value, and makes the web worse.
           | 
           | Another example of how page visits don't matter: it's easy to
           | get a huge spike of HN users clicking through. But if my SaaS
           | has no relevance to HN users (except as a technical
           | curiosity), this doesn't matter at all. It won't change my
           | revenue, so it's irrelevant.
           | 
           | Unless you run a site with ads, focusing on page visits
           | doesn't make sense: it's like measuring the performance of a
           | supermarket and getting excited about increasing traffic on a
           | nearby highway. Could it influence your sales? Possibly. Is
           | it actionable? Nope.
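
For readers who haven't met Bernoulli bandits: each variant either converts or doesn't, and the bandit shifts traffic toward variants that convert more often. The comment above doesn't say which strategy is used; the Python sketch below assumes Thompson sampling with a uniform Beta(1, 1) prior, and the class and method names are made up for this example.

```python
import random


class BernoulliBandit:
    """Thompson sampling over variants with yes/no (conversion) outcomes."""

    def __init__(self, variants, seed=None):
        # Per variant: [alpha, beta] of a Beta posterior, starting at Beta(1, 1).
        self.stats = {v: [1, 1] for v in variants}
        self.rng = random.Random(seed)

    def choose(self):
        # Draw one plausible conversion rate per variant, show the best draw.
        samples = {v: self.rng.betavariate(a, b) for v, (a, b) in self.stats.items()}
        return max(samples, key=samples.get)

    def record(self, variant, converted):
        # A conversion bumps alpha, a non-conversion bumps beta.
        self.stats[variant][0 if converted else 1] += 1
```

Usage is a loop of `choose()` when a visitor arrives and `record()` once you know whether they signed up. Unlike a fixed 50/50 A/B split, losing variants get shown less and less as evidence accumulates.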
        
             | jerven wrote:
             | GA used to be good in separating the bots from the humans,
             | but now I don't think so. I see a huge number of visitors
             | from Dublin that seem to be AWS sourced bots (and Ashburn,
             | both have bounce rates like no other city). Millions of
             | visits, from a place where we would expect thousands.
        
               | cosmie wrote:
               | GA's "Filter bots" setting simply applies the IAB bot
               | filtering list[1]. The resources required to operate a
               | page rendering bot[2] used to be a high enough bar that
               | the ones doing so were large actors and would make their
               | way onto that list pretty quickly.
               | 
               | The IAB list is still a helpful baseline (if overpriced
               | if you want to lease the list itself[3]), but all it's
               | doing is applying suppression based on things like known
               | bot IP ranges and user agents. It's far less effective
               | than it used to be since it's so incredibly easy and
               | cheap nowadays for anyone with an interest to spin up a
               | rendering bot. Now you've got to supplement it actively
               | with your own set of filters and heuristics if you really
               | want to get rid of bot traffic polluting your data.
               | 
               | [1] https://iabtechlab.com/software/iababc-international-
               | spiders...
               | 
               | [2] Scraping bots were cheap and easy to run before, but
               | didn't render the GA javascript code so never showed up
               | in analytics. It's only bots that use something like a
               | headless browser to render the page that show up in GA,
               | and those have only become commoditized and cheap/easy
               | relatively recently.
               | 
               | [3] Google applies the IAB list to your traffic for free
               | if you check the setting for it, but if you want to use
               | the IAB list yourself you have to pay $4k - $14k annually
               | to lease it from IAB.
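
A minimal Python sketch of the kind of suppression described above: matching hits against known bot user-agent markers and IP ranges. The marker tuple and the network list are illustrative placeholders chosen for this example (the range is a reserved documentation block), not the IAB list, and `is_probable_bot` is a made-up name.

```python
import ipaddress

# Placeholder values for illustration only; a real list is far longer.
BOT_UA_MARKERS = ("bot", "spider", "crawler", "headlesschrome")
BOT_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def is_probable_bot(ip: str, user_agent: str) -> bool:
    """True if the user agent or source address matches a known bot pattern."""
    ua = user_agent.lower()
    if any(marker in ua for marker in BOT_UA_MARKERS):
        return True
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BOT_NETWORKS)
```

This also shows why list-based filtering has become weaker: a headless browser on a fresh cloud IP with a stock browser user agent passes both checks, which is where your own heuristics (bounce rate per city, request timing, and so on) have to take over.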
        
         | timothy-quinn wrote:
         | So far GA is answering two important questions for me - which
         | marketing strategies are actually working (because it's hard to
         | tell when you've got multiple going at once), and also making
         | sure my marketing is actually hitting the geo-regions I need it
         | to.
         | 
         | That said though once I know which marketing tools are
         | effective, there's nothing more that GA does that CloudFlare
         | couldn't just tell me anyway (i.e. am I getting more or less
         | traffic) and I'll probably drop it as it's one less dashboard
         | to look at - like you said that conversion to subscriber _is_
         | the ultimate metric for success.
        
         | mrweasel wrote:
         | > I realized that it doesn't actually report any useful or
         | actionable data
         | 
         | The actionable part never occurred to me, but makes so much
         | sense. What action could anyone really take, based on the data
         | presented by Google Analytics? Off the top of my head I can't
         | actually think of anything you couldn't easily get from server
         | logs.
        
           | tyingq wrote:
           | Well, you don't specifically need GA for it, but tracking
           | referrers to sales, for example...is often actionable. Similar
           | for a funnel view to see where visitors drop off. Especially
           | combined with some A/B testing.
        
           | halflings wrote:
           | Examples I saw in a website I launched last year:
           | 
           | * High bounce rate showed that people did not find the content
           | engaging / not what they were looking for, confirming a
           | hypothesis I had; this was particularly true for certain
           | sections of the website.
           | 
           | * Time spent on pages was another metric that was helpful to
           | find problems in some sections.
           | 
           | * Easy slicing of traffic sources by referrer and country,
           | showed me that most of my "good" traffic was coming from
           | facebook, so I invested more time there.
        
             | srrr wrote:
             | To optimize the bounce rate metric even further a funnel
             | can be created and measured. This lets you see bounces for
             | each part of the funnel up to your conversion(s). It is
             | often really valuable to see how a change on your website
             | improves one part of the funnel but impairs another one,
             | and then to think about the reason why this happens.
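
A funnel of this kind can be sketched in a few lines of Python. The assumptions here: each session is already available as an ordered list of page paths, and the funnel is a fixed list of steps that must be hit in order (other pages in between are allowed). The function names are made up for this example.

```python
def funnel_counts(sessions, steps):
    """Count how many sessions reached each funnel step, in order."""
    reached = [0] * len(steps)
    for pages in sessions:
        position = 0  # index of the next step this session still has to hit
        for page in pages:
            if position < len(steps) and page == steps[position]:
                reached[position] += 1
                position += 1
    return reached


def drop_off_rates(reached):
    """Share of sessions lost between each pair of consecutive steps."""
    return [
        (before - after) / before if before else 0.0
        for before, after in zip(reached, reached[1:])
    ]
```

Comparing the drop-off list before and after a site change is what shows one step improving while another gets worse.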
        
           | srrr wrote:
           | There are millions of possible answers to this question...
           | Examples:
           | 
           | - Conversion rate grouped by browser and browser version to
           | find browser specific bugs.
           | 
           | - Total revenue per user per marketing channel / search
           | keyword / ... to optimize budget allocation.
           | 
           | - Revenue by mobile OS version share to decide testing
           | procedures to not optimize for users that don't contribute to
           | your bottom line.
           | 
           | - Client side loading times per user location and provider to
           | optimize infrastructure placement.
           | 
           | - ...
        
           | chaosite wrote:
           | One of the reasons Google Analytics (and logging solutions
           | like it) is popular is that people don't have access to
           | server logs.
        
             | f0rfun wrote:
             | You mean unpopular?
        
               | mtmail wrote:
               | Popular. Because adding lines of javascript is much
               | easier than processing server log files.
        
             | JackWritesCode wrote:
             | Server side logs aren't an accurate way of doing analytics
             | and bring up compliance challenges (storing access logs and
             | using them for purposes other than security etc.)
        
               | hackernewsn00b wrote:
               | Can you talk a little bit more about why server side logs
               | aren't accurate for analytics? I was planning on doing
               | basic analytics (pageviews mostly) using Cloudfront
               | access logs stored in an S3 bucket and queried through
               | Athena for a site I'm working on which uses the AWS
               | stack.
               | 
               | I know Cloudfront logs can sometimes drop, but is there a
               | more important reason you're talking about?
        
               | XCSme wrote:
               | I think with server-side only it is harder to filter bots
               | and crawlers, and to get accurate bounce rates or session
               | length times.
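
For the basic pageview counting discussed in this subthread, a server-log approach can be as small as the Python sketch below. It assumes access logs in the common/combined log format used by Apache and nginx (CloudFront logs use a different, tab-separated layout and would need their own parser). The regex, the asset suffix list and the function name are choices made for this example.

```python
import re
from collections import Counter

# Matches the start of a common/combined log format line.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)
ASSET_SUFFIXES = (".css", ".js", ".png", ".jpg", ".ico", ".svg")


def count_pageviews(lines):
    """Count successful GET requests per path, ignoring static assets."""
    views = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        path = match["path"].split("?", 1)[0]  # drop the query string
        if (match["method"] == "GET" and match["status"] == "200"
                and not path.endswith(ASSET_SUFFIXES)):
            views[path] += 1
    return views
```

Note what is missing: nothing here separates bots from people, and there is no notion of a session, which is exactly the accuracy gap being pointed out.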
        
           | TomGullen wrote:
           | I find time spent on page is a great way to measure
           | performance of redesigns on page. There are numerous
           | actionable points.
        
             | 72deluxe wrote:
             | Hook into onbeforeunload and call your own javascript to
             | post back to your own site to let you know when they've
             | left.
        
               | TomGullen wrote:
               | I know how I'd do this, but have no reason to do it.
        
             | jwr wrote:
             | It might be, assuming two things:
             | 
             | 1) That you actually care about this metric. I don't, I do
             | not get paid by the number of minutes spent on pages, I get
             | paid by the number of signed-up subscribers who use my
             | software to make their workflow easier. I can (and prefer
             | to) use bandit testing to measure the performance of
             | redesigns.
             | 
             | 2) that it can be reliably measured, which I don't think it
             | can.
        
               | Arnt wrote:
               | Tom Gilb once said that "everything can be measured so
               | accurately that the result is better than having no
               | measurement at all" or words to that effect. I'm sure he
               | phrased it better. In this case, some users aren't
               | measured, which drags down the accuracy of the
               | measurement.
               | 
               | Even if the numbers are off by quite a few per cent, I
               | can easily see how some site operators might benefit from
               | knowing e.g. that visitors close one tutorial much
               | quicker than the others.
        
               | TomGullen wrote:
               | We are a SaaS company also, and time spent reading
               | manual/tutorial pages is important to us.
               | 
               | With regards to point two, being reliably measured sounds
               | to me like perfection is the enemy of adequate. Perhaps
               | in low volumes you can't measure certain stats like this
               | reliably but in large volumes I think it's useful.
        
               | srrr wrote:
               | As long as your measurement is a random sample everything
               | is okay. Even if it is not, it is much more information
               | than you had before. You just need to keep it in mind
               | when evaluating conclusions. We are not talking about
               | drug trials here and no one dies if the measurement is not
               | 100% accurate.
               | 
               | I am able to measure everything 100% accurately. But this
               | is really really expensive. It's a trade-off.
        
             | XCSme wrote:
             | Unless your page is google.com, where more time spent on
             | page means the experience is worse, not better.
        
               | TomGullen wrote:
               | Context is important, and it's one metric amongst others
               | that should be used.
               | 
               | The point is, there are clearly actionable points GA can
               | offer. It's disingenuous to think there are not.
        
               | XCSme wrote:
               | I agree, saying that there are no actionable points in GA
               | is like saying analytics in general are useless.
        
               | boomlinde wrote:
               | Counterexample: I search for information on a subject. I
               | get many interesting and relevant results and therefore
               | spend more time on the page.
        
               | XCSme wrote:
               | This makes it even more obvious that a longer or shorter
               | session length is not clearly better or worse, it depends
               | on a lot of other things.
        
         | bonestamp2 wrote:
         | The main product I work on does not use google analytics, but
         | we do use mixpanel to see what our paid users are actually
         | doing with the SaaS. We actually don't care who each user is,
         | we just want the aggregate data. We believe this data is
         | important for retaining paid subscriptions and attracting new
         | ones. Let me explain.
         | 
         | There's one feature that 100% of our subscribers use. I mean,
         | it's the main thing we do and everyone who subscribes needs
         | that function. Even without analytics, we know that we have to
         | continually make that feature faster and smarter to stay ahead
         | of our competition. We know that if we fall behind our
         | competitors, our subscriptions will dwindle.
         | 
         | But, then we have a bunch of other features that help customers
         | solve some edge case problems around that main thing. It's very
         | important for us to know which of those functions are being
         | tried, used, and reused (or not). Not all customers use these
         | features, but some are tools we have that nobody else in our
         | space does. So, having insights on which ones are getting
         | traction and which ones need improvement help us spend our
         | marketing and engineering time better to attract new customers
         | and retain our existing ones.
         | 
         | I can't imagine not having any analytics. I feel like we need
         | them to continually make small course corrections that ensure
         | we're providing the best value to our existing customers.
        
           | GordonS wrote:
           | > We actually don't care who each user is, we just want the
           | aggregate data.
           | 
           | > It's very important for us to know which of those functions
           | are being tried, used, and reused (or not)
           | 
           | Aren't these 2 at odds with each other, or else how can you
           | tell when the same person re-uses a feature? Surely you need
           | some kind of user identifier for that?
        
             | bonestamp2 wrote:
             | Yes, good point, I should have been more clear. You're
             | right that we track unique users, but we do so in abstract.
             | I suppose with some work we could pull together enough data
             | from different sources to determine who a particular user
             | was in a particular session. I just meant that we don't do
             | that and it's not easy for us to do that because we don't
             | care about that type of data. We only care about what each
             | abstract user does in the sense that we want to know the
             | aggregate of how many users did that thing.
             | 
             | Additionally, unlike google analytics, we do support the
             | browser's "do not track" flag. So if a user doesn't want to
             | be tracked at all, we completely respect that.
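
A minimal sketch of the idea described above, assuming events arrive as (user_id, feature, dnt_header) tuples. The function names, the tuple shape, and the salt are hypothetical and not taken from the commenter's product. Users are replaced by opaque tokens so that reuse can still be counted per feature, and anything sent with "DNT: 1" is dropped before counting.

```python
import hashlib
from collections import Counter


def pseudonym(user_id: str, salt: str) -> str:
    """Replace a real user id with an opaque, salted token."""
    return hashlib.sha256((salt + user_id).encode()).hexdigest()[:16]


def feature_usage(events, salt="example-salt"):
    """Return {feature: (users_who_tried, users_who_reused)}.

    `events` is an iterable of (user_id, feature, dnt_header) tuples;
    this shape is an assumption made for illustration.
    """
    uses = Counter()
    for user_id, feature, dnt in events:
        if dnt == "1":  # the browser sent "DNT: 1": record nothing
            continue
        uses[(pseudonym(user_id, salt), feature)] += 1

    tried, reused = Counter(), Counter()
    for (_, feature), count in uses.items():
        tried[feature] += 1
        if count > 1:  # the same abstract user came back to the feature
            reused[feature] += 1
    return {feature: (tried[feature], reused[feature]) for feature in tried}
```

Only the aggregate pairs leave the function; the per-user tokens exist just long enough to tell a second use from a first one.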
        
         | moron4hire wrote:
         | This was my experience as well. Early on, when I was still
         | learning, it helped me learn the importance of extremely low
         | Time-To-First-Paint. But now, I just default to good designs
         | that are easy to read. There's nothing new to be learned that
         | GA could provide insight on.
         | 
         | Also, early on, I found the Referrer tracking to be useful for
         | discovering the reach of my projects and to help me get into
         | conversations with users on other sites to help them with using
         | my software. But that feature of GA eventually became useless
         | when Google did nothing to address Referrer spam.
        
         | nofunsir wrote:
         | Good for you! This is the right method. Analytics -- in the
         | sense of watching mouse clicks and reference urls -- are only
         | trying to close the feedback loop FAR sooner than when it
         | actually matters.
        
         | chrispauley wrote:
         | I feel the same way.. mostly. I want one thing from GA and that
         | is user flow. I've first hand seen this help a startup run
         | landing page experiments and very quickly improve their
         | conversion rates. (The startup was selling a luxury consumer
         | product)
         | 
         | This won't work for everyone and will be of little value to
         | many. However that tool just doesn't seem to exist on most if
         | not all of the lightweight alternatives. Matomo is the only
         | alternative I have seen implement this feature, though those
         | with more experience with the alternatives will hopefully show
         | me I am wrong on that.
        
         | drunkpotato wrote:
         | I also don't think google analytics is that useful. It's one of
         | many possible tools for user tracking, and far from the best
         | one. Though it's "free" so I guess that's why it's more
         | popular than much better tools.
         | 
         | But the fundamental problem is that analytics provides data and
         | information, when what people want is knowledge and wisdom. But
         | you need to do the work to get it. No amount of analytics is
         | going to tell you that your Facebook ads are underperforming
         | their potential because your buy button is hidden by an overlay
         | in the built-in Facebook browser, especially if Facebook is
         | your best performing channel overall. Analytics can't tell you
         | what's _not_ there. There's no substitute for having developers
         | and marketers actually purchase your product, themselves, on
         | the channels your customers use. And too many companies don't
         | do that.
         | 
         | Just a hypothesis I have, that most e-commerce companies are
         | leaving millions to tens of millions of dollars on the table by
         | not giving all their employees a corporate credit card and
         | having them purchase their product on it, say, once a month.
         | They're losing out on far more than they'd end up paying for
         | the few instances of fraud.
        
           | westicecoast32 wrote:
           | For a personal project everyone basically ignored every post
           | I made about it. I used server logs to generate analytics and
           | noticed people hit the first page and left (I assume they
           | spent seconds and left).
           | 
           | I then broke up the main page into many smaller pages and
           | noticed people still leaving right away, but my
           | documentation page got more overall clicks since the main
           | page didn't massively overload them.
           | 
           | I guess analytics don't give you answers, but if you know
           | which pages people tend to click on and what happens when
           | you make a page simpler or heavier, you can figure out a
           | solution.
           | 
           | However I didn't end up finding a solution and I'm planning
           | to rewrite my project. I have a better idea how I should
           | introduce it to people next time
        
         | jayd16 wrote:
         | Do you not find value in adding events to your pages so you
         | have user funnels for conversions? Or do you think bots cause
         | too much noise for this?
        
       | yagodragon wrote:
       | I'd like to ditch google analytics for a new small side project
       | I'm building. I live in a small country in Europe and for me the
       | most important feature of these alternatives is the cookieless
       | tracking and the lightweight scripts. However, the pricing is too
       | steep for a project that won't gain more than a few thousand
       | users.
       | 
       | Fathom analytics and simple analytics cost ~100$/year.
       | 
       | Plausible costs ~50$
       | 
       | I really liked and almost settled with plausible but I just saw
       | goatcounter right now. It's free for personal / open source
       | projects. That's so nice for small projects like many people here
       | are building.
        
         | sleepyhead wrote:
         | Fathom and Plausible are both open source:
         | 
         | https://github.com/usefathom/fathom
         | https://github.com/plausible/analytics
        
         | woudsma wrote:
         | You should check out Matomo (formerly Piwik), which is a free
         | self-hosted GA alternative. I'm very satisfied with it so far.
         | It disables tracking by default - I think, but this is
         | configurable in the Matomo dashboard.
        
       | jgillich wrote:
       | I recently started using Kindmetrics, a very simple analytics
       | tool written in Crystal. Looks very similar to Plausible
       | actually, it may have been inspired by it.
       | 
       | https://kindmetrics.io/
       | 
       | https://github.com/kindmetrics/kindmetrics
        
       | neilsimp1 wrote:
       | Off topic, but I used to run a website that had Google Analytics.
       | This site and domain are now 100% down and have been for over a
       | year.
       | 
       | I _still_ get monthly emails from Google about the analytics for
       | this website. Apparently it's getting 200-300 visitors per month
       | still. I have replied back to Google via email about this several
       | times but never heard any reply. I wonder what site they are
       | tracking?
        
         | TomGullen wrote:
         | IIRC the measurement protocol can be used to send fake traffic
         | by bad actors.
        
         | infinitelurker wrote:
         | Ghost/spam traffic is a real problem for GA. UA codes are
         | public and can be targeted with spam referrals or simply
         | randomly hit (especially for UA codes that end in -1).
         | 
         | Filtering spam and getting useful data on GA is a never ending
         | job that Google keeps making harder. (re removal of Service
         | Provider / Network Domain [1])
         | 
         | [1]: https://support.google.com/analytics/thread/27808046?hl=en
        
         | TomAnthony wrote:
         | It is quite possible to take a GA tracking code for one site
         | and put it on another site. This has happened to me quite a lot
         | where people have lifted content or copied code from my site.
         | You can see the hostnames in GA (you have to dig for it), which
         | could explain this.
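
A rough sketch of the hostname check mentioned above: keep only the hits recorded for hostnames you actually serve. The record shape (a dict with a "hostname" key) and the allow-list are assumptions for illustration, not a real GA export format.

```python
# Hypothetical: the hostnames your site is actually served from.
ALLOWED_HOSTS = {"example.com", "www.example.com"}


def own_hits(hits):
    """Keep only hits whose hostname is on the allow-list.

    `hits` is an iterable of dicts; a missing or empty hostname is
    treated as foreign traffic and dropped.
    """
    return [
        hit for hit in hits
        if (hit.get("hostname") or "").lower() in ALLOWED_HOSTS
    ]
```

An allow-list is used rather than a block-list because copied tracking codes and ghost traffic can report any hostname at all.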
        
       | marvinblum wrote:
       | Does anyone have experience with passive fingerprinting? I
       | thought about implementing it into our Go backend as a middleware
       | of some kind and track that way. I haven't found anything like it
       | so far, but it would be ideal to track without cookies.
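
One commonly described cookieless approach is to hash a few request attributes together with a salt that is rotated regularly, so a visitor can be recognised within a day but not across days. The sketch below is in Python rather than Go for brevity; the function name and the choice of inputs are assumptions, and it is not taken from any particular product. As the reply below points out, this kind of identifier may still fall under data protection law.

```python
import hashlib


def visitor_id(ip: str, user_agent: str, site: str, daily_salt: str) -> str:
    """Derive a short-lived visitor token without setting a cookie.

    The salt is assumed to be regenerated (and the old one discarded)
    every day, so tokens from different days cannot be linked.
    """
    raw = "|".join((daily_salt, site, ip, user_agent))
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```

In a middleware this would be called once per request, with only the resulting token stored.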
        
         | propelol wrote:
         | Device fingerprinting falls under European data protection
         | laws, so might as well just use cookies if you're going that
         | path.
        
           | marvinblum wrote:
           | That's really unfortunate. Well, at least it's easier to
           | implement a cookie solution then.
        
         | Carpetsmoker wrote:
         | I did a write-up of solutions that I'm aware of here:
         | https://github.com/zgoat/goatcounter/blob/master/docs/sessio...
         | - happy to add more if anyone knows of any.
        
           | marvinblum wrote:
           | Thank you!
        
           | JackWritesCode wrote:
           | The Fathom one isn't accurate and references an old article.
           | I'll submit a PR in a few weeks if I remember :)
        
       | ent101 wrote:
       | A lot of people may not care about this, but Google Analytics
       | (and other 3rd party, hosted analytics platforms) are very
       | important when trying to sell your website. Basically, it allows
       | the buyer to access reliable historical data about your website
       | which in turn makes it easier to arrive at a valuation.
        
       | pantulis wrote:
       | Of course you can now do Google Analytics server-side with GTM:
       | https://www.optimics.cz/what-is-gtm-server-side-tracking-and...
       | 
       | Have GCP? Hit this button and you get your docker container in
       | your Kubernetes cluster doing all the stuff for you, pretty
       | awesome.
        
       | ilovefood wrote:
       | I've recently whipped up my own self-hosted analytics solution
       | [0] based on SQLite, Bash and Metabase. It's all self hosted,
       | easy to install and very flexible with regards to the queries you
       | can write and display. Metabase comes with a lot of cool features
       | for display, live reload and other cool stuff. :)
       | 
       | [0]: https://funnybretzel.com/self-hosted-analytics-using-
       | sqlite-...
        
         | zubspace wrote:
         | Thanks for the tutorial. Looks interesting.
         | 
         | I'm a fan of GoAccess. Unfortunately the queries are pre-made
         | and there are nearly no options whatsoever. You can't (yet)
         | filter by date for example.
         | 
         | One thing I realized is that on small sites, like my blog, an
         | overwhelming amount of traffic comes from search engines or
         | bots which are looking for vulnerabilities. Filtering them out
         | takes a lot of time in any self-hosted or self-made solution.
        
           | leephillips wrote:
           | I use GoAccess too. If I want to filter by date (for example)
           | I just run the log file through sed before feeding it to
           | GoAccess.
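
The same pre-filtering idea can be sketched in Python instead of sed, assuming the log uses the common/combined format with timestamps like [18/Jun/2020:07:56:00 +0000]. The function name is hypothetical; the surviving lines can then be fed to GoAccess just as the sed output would be.

```python
import re
from datetime import date, datetime

# Matches the day part of a common-log-format timestamp.
TIMESTAMP = re.compile(r"\[(\d{2}/\w{3}/\d{4}):")


def lines_between(lines, start: date, end: date):
    """Yield only the log lines dated from `start` to `end` inclusive."""
    for line in lines:
        match = TIMESTAMP.search(line)
        if not match:
            continue  # skip lines without a recognisable timestamp
        day = datetime.strptime(match.group(1), "%d/%b/%Y").date()
        if start <= day <= end:
            yield line
```

Month names are parsed with "%b", which assumes an English (or C) locale.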
        
         | wolfhumble wrote:
         | Looks like a nice combo that can be used for other projects as
         | well; thanks for the writeup! :-)
        
         | dimovich wrote:
         | Thank you for the writeup. Always nice to see new applications
         | of Metabase. Will try it out.
        
       | ckotso wrote:
       | Snowplow is also an option. It's an open-source data collection
       | solution that, unlike GA, gives you full ownership of your event-
       | level data and the freedom to define your own data structures.
       | Not exactly what you'd call 'lightweight' but quite a few
       | Snowplow users/customers have come from GA for the level of
       | flexibility and control they can have over their data sets.
       | 
       | (Full disclosure: I work for Snowplow Analytics)
       | 
       | - https://github.com/snowplow/snowplow
       | 
       | - https://snowplowanalytics.com/
        
         | ValentineC wrote:
         | I've set up the Snowplow collector and tracker on some of my
         | sites because that part is _very_ straightforward (and the
         | tutorial on the wiki is great), but I've never gotten past
         | those steps to analyse the data collected.
         | 
         | Is there a highly-opinionated tutorial that shows how one can
         | get some vanity metrics out from Snowplow?
        
       | h4kor wrote:
       | If you only need page views, for example for a personal blog, try
       | goaccess (https://goaccess.io/).
       | 
       | It uses your server logs for simple analytics.
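
To illustrate what "server logs for simple analytics" means in practice, here is a small sketch that counts successful page views per path from combined-format access log lines. It is not how GoAccess works internally; the regular expression and function name are assumptions for illustration.

```python
import re
from collections import Counter

# Captures the path and status code of a GET request line.
REQUEST = re.compile(r'"GET (\S+) [^"]*" (\d{3}) ')


def page_views(lines):
    """Count successful (HTTP 200) GET requests per path."""
    views = Counter()
    for line in lines:
        match = REQUEST.search(line)
        if match and match.group(2) == "200":
            path = match.group(1).split("?")[0]  # ignore query strings
            views[path] += 1
    return views
```

No JavaScript or cookie is involved: everything comes from lines the web server already writes.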
        
         | JackWritesCode wrote:
         | Just be sure to check with your legal team regarding using web
         | logs for analytics purposes
        
         | FalconSensei wrote:
         | But then you can't use it on Github Pages, right?
        
       | limeblack wrote:
       | For static web pages I like https://disqus.com. It has comment
       | and minor analytics support.
        
       | Clex wrote:
       | Google Analytics has a "bot filtering" option that works pretty
       | well (even though it's not perfect). Do the alternatives also
       | have similar features? There is a lot of automated traffic on the
       | internet.
        
         | jka wrote:
         | It looks like GoatCounter does; it uses a library,
         | https://github.com/zgoat/isbot by the same author to detect
         | bots.
         | 
         | Detecting and deterring spam and sketchy behaviour while using
         | open source software could be an interesting technical problem
         | area.
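
As a very rough illustration of the problem area, the simplest form of bot filtering is a User-Agent heuristic. The marker list below is an assumption made for this sketch and is far from complete; it is not how the isbot library linked above works.

```python
# Hypothetical substrings that commonly appear in automated User-Agents.
BOT_MARKERS = ("bot", "crawl", "spider", "slurp", "curl", "wget",
               "python-requests")


def looks_like_bot(user_agent: str) -> bool:
    """Guess whether a request is automated from its User-Agent alone.

    A missing User-Agent is treated as automated traffic.
    """
    ua = (user_agent or "").lower()
    return not ua or any(marker in ua for marker in BOT_MARKERS)
```

Anything that fakes a browser User-Agent passes this check, which is why real filters tend to combine several signals.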
        
         | JackWritesCode wrote:
         | Of course. Fathom filters bots on the client side and then
         | looks for bot signs on the server. If it looks questionable, it
         | doesn't process it.
        
         | dx034 wrote:
         | Matomo has the same, works pretty well. Via plugin they can
         | also track bots to show that separately but they're filtered by
         | default.
        
           | VadimPR wrote:
           | I'm not sure if it's actually filtered - I think they're just
           | tracked and classified. At least that's what I'm seeing in my
           | instance.
        
       | devalnor wrote:
       | I use Ackee, also a simple and open source alternative
       | https://ackee.electerious.com/
        
         | MentallyRetired wrote:
         | Came here to share Ackee. I haven't used it yet but it looks
         | really nice.
        
         | jboynyc wrote:
         | Another alternative not listed in the article:
         | https://www.offen.dev/
        
       | kasbah wrote:
       | I recently had a discussion about the interface with the
       | Goatcounter developer [1]. Also put in a feature request with
       | Posthog [2]. Hadn't heard of Plausible, maybe that's the one for
       | me!
       | 
       | [1]: https://github.com/zgoat/goatcounter/issues/302
       | 
       | [2]: https://github.com/PostHog/posthog/issues/1020
        
         | Hoasi wrote:
         | > Hadn't heard of Plausible, maybe that's the one for me!
         | 
         | Plausible is pretty good, found it useful to monitor traffic
         | and usage for small projects.
        
       | juanre wrote:
       | I stick to GA because I fear that removing it could impact my
       | site's ranking. Does anyone know if that fear is unfounded?
        
         | pauljarvis wrote:
         | We actually looked into this:
         | 
         | https://usefathom.com/blog/google-analytics-seo
         | 
         | Funny as it sounds, using Fathom instead of GA could increase
         | your SEO rankings :)
        
       | joppy wrote:
       | What kind of server-side analytics are people using today, for
       | personal blogs and things? Projects like GoAccess which eat an
       | nginx log file and output some analytics seem like a nice middle
       | ground for those of us who want some feedback on how people are
       | using a website, without needing all the bells and whistles of
       | something more like Google Analytics (not to mention the fact
       | that it doesn't need any Javascript loaded or anything).
       | Personally I've found GoAccess pretty good, but the interface a
       | little difficult to use and understand, so I'm looking for
       | projects like it.
        
         | PaulRobinson wrote:
         | Server side was how it was always done back in the early days
         | of the web, and analog[0] was state of the art.
         | 
         | Around 1999/2000 there was a rise of ISPs needing to install
         | reverse proxy caches because the growth of consumer access
         | meant they were getting seriously contended on upstream access.
         | I was working at the time at a UK 0845 white label ISP called
         | Telinco (was behind Connect Free, Totalise, Current Bun and
         | other 0845 ISPs), and to my knowledge we were the first in the
         | UK to install a NetApp cache. It was the moment we realised
         | (by checking the logs to see if it was working), just how much
         | porn our customers were accessing.
         | 
         | Those caches blow server side analytics to pieces, because
         | frequently you wouldn't even know the user had hit the page.
         | What server side analytics was useful for is what we'd now call
         | Observability: they gave reasonable Latency, Error Rate and
         | Throughput metrics, which combined with some other system logs
         | might also give you a sense of Saturation.
         | 
         | As such, they were not too useful for marketing. Google
         | Analytics was the first product that allowed high fidelity
         | analytics even if reverse proxy caches (and even browser
         | caches), were all over the place.
         | 
         | And here we are. In a World where we are tightly surveilled by
         | corporate entities in order to try and get us to click on
         | things. Bit sad really.
         | 
         | I'd encourage people to think about what they need these
         | analytics for.
         | 
         | If it's marketing, you might just as well use GA: it's the
         | best product out there. We just need to lobby for better
         | regulation (at least GDPR and cookie setting popovers give us
         | choices on that regard now).
         | 
         | If you're stroking your ego, consider whether such an invasive
         | technology is worth the price, and if you need those numbers.
         | 
         | If you're making sure your infrastructure can handle the
         | traffic, use server side analytics alone. Parse your logs using
         | the huge number of tools out there able to do that in near-
         | realtime, and leave your users' browsers free of tracking
         | cookies and javascript.
         | 
         | [0] https://www.web42.com/analog/
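
The latency, error rate and throughput metrics mentioned above can be derived from nothing more than status codes and request durations. A minimal sketch, assuming each request has already been parsed into a (status_code, duration_seconds) pair; the function and key names are hypothetical.

```python
import math


def golden_signals(requests, window_seconds: float) -> dict:
    """Summarise a window of requests as throughput, error rate, latency.

    `requests` is a list of (status_code, duration_seconds) tuples
    observed during a window of `window_seconds`.
    """
    total = len(requests)
    if total == 0:
        return {"throughput_rps": 0.0, "error_rate": 0.0, "p95_latency_s": 0.0}

    errors = sum(1 for status, _ in requests if status >= 500)
    durations = sorted(duration for _, duration in requests)
    # Nearest-rank 95th percentile.
    p95 = durations[math.ceil(0.95 * total) - 1]
    return {
        "throughput_rps": total / window_seconds,
        "error_rate": errors / total,
        "p95_latency_s": p95,
    }
```

Only server errors (5xx) are counted as errors here; whether 4xx responses should count is a judgment call.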
        
           | Nextgrid wrote:
           | Caches are irrelevant now that the world has moved to HTTPS.
        
             | divbzero wrote:
             | There are no public caches with HTTPS, but there are still
             | private browser and CDN caches to contend with.
             | 
             | To ensure your origin server is hit on subsequent HTTPS
             | requests, you would still need to configure response
             | headers for cache validation [1] to be
             | 
             |     Cache-Control: no-cache
             | 
             | instead of
             | 
             |     Cache-Control: private
             | 
             | [1]: https://developer.mozilla.org/en-
             | US/docs/Web/HTTP/Headers/Ca...
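
A small sketch of what setting that header looks like on an origin server, using only Python's standard library. The handler and helper names are hypothetical. With "no-cache", caches may store the response but must check back with the origin before reusing it, so each view still reaches the server and its logs.

```python
from http.server import BaseHTTPRequestHandler


def page_headers(body: bytes) -> dict:
    """Response headers that force revalidation with the origin."""
    return {
        "Content-Type": "text/html; charset=utf-8",
        "Content-Length": str(len(body)),
        "Cache-Control": "no-cache",
    }


class NoCacheHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"<h1>hello</h1>"
        self.send_response(200)
        for name, value in page_headers(body).items():
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(body)

# To try it locally (not run here):
#   from http.server import HTTPServer
#   HTTPServer(("127.0.0.1", 8000), NoCacheHandler).serve_forever()
```

This sketch sends no ETag or Last-Modified validator, so a revalidation always results in a full response; a real server would normally add one so it can answer with 304 Not Modified.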
        
               | Nextgrid wrote:
               | Yes, but those are caches that you control, so when it
               | comes to analytics you would get the logs off them too in
               | order to get accurate metrics.
        
         | dig1 wrote:
         | Webalizer [1] can be an alternative. No fancy UI, but it gets
         | the job done. For anything heavier and serious, ELK [2].
         | 
         | [1] http://www.webalizer.org/
         | 
         | [2] https://www.elastic.co/what-is/elk-stack
        
         | stephenr wrote:
         | I've set up GoAccess for a client's site; the problem is it
         | doesn't have a great HA solution.
         | 
         | You either ship all your logs to one place (and hope that place
         | doesn't go offline) or ship your logs to multiple places and
         | hope both destinations are in sync. We've opted for #2 right
         | now (hint: it's not perfect) but it's made me think about
         | writing an alternative.
         | 
         | Rather than shipping all the logs all around, my plan is to
         | have each source (i.e. web server) run a process on its own
         | logs, and use something like Redis to store the aggregated
         | statistics.
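
The plan above can be sketched as two steps: each web server reduces its own log to per-path counts, then adds those counts to a shared store. In this sketch a plain dict stands in for Redis (where an increment command such as INCRBY would play the same role); the key layout and function names are assumptions for illustration.

```python
from collections import Counter


def aggregate_local(paths) -> Counter:
    """Run on each web server: reduce its own requests to counts."""
    return Counter(paths)


def merge_into(store: dict, day: str, counts: Counter) -> None:
    """Add one server's counts to the shared store.

    `store` is a dict standing in for a shared key-value store.
    """
    for path, count in counts.items():
        key = f"hits:{day}:{path}"
        store[key] = store.get(key, 0) + count
```

Because only increments are shipped, no server needs the others' raw logs. The sketch does not guard against the same log being merged twice, which a real implementation would have to handle.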
        
       | Cenk wrote:
       | I've been pretty happy with Matomo (formerly Piwik), especially
       | their non-cookie mode. But the interface is ugly, confusing, and
       | makes finding information much more difficult than Google
       | Analytics does.
       | 
       | Edit: One major thing I am unhappy with in Matomo is event
       | tracking. GA makes it much easier (in my experience) to track
       | conversions and events, and presents the data in a better way.
        
         | XCSme wrote:
         | Hi Cenk, I have been building a tool[0] similar to Matomo, but
         | the plan was to make the UI much simpler to use but also
         | provide more premium functionalities (heatmaps, session
         | recordings) for cheaper.
         | 
         | My idea was to focus everything on "segments". So for all the
         | data you can quickly create user segments and instantly filter
         | the data to see only stats for the users you want, or compare
         | stats between segments.
         | 
         | There is a public dashboard that you can check, I would love
         | some feedback if you have the time :)
         | 
         | [0]: https://usertrack.net/
        
         | VadimPR wrote:
         | I found the Matomo interface to be a breath of fresh air
         | compared to Google Analytics! As a non-power user, GA was too
         | heavy and enterprise-like. Matomo is much cleaner, simpler, and
         | more efficient for me to work with.
        
           | Cenk wrote:
           | I'm surprised to hear that! Are you using a different theme?
        
             | VadimPR wrote:
             | I don't think so - just the default. The light version of
             | https://themes.matomo.org/DarkTheme#preview.
        
               | Cenk wrote:
               | Thanks - I'll give it a try
        
       | pier25 wrote:
       | In my blog I started using Netlify's analytics which are server
       | based and cost $9 per month (up to a number of views) and I gotta
       | say they are extremely basic and lackluster. I paid for the
       | subscription for one month but I don't think I will keep on
       | paying.
       | 
       | Edit:
       | 
       | Also it doesn't seem to be tracking referrers correctly.
        
         | JackWritesCode wrote:
         | Yes, Netlify Analytics are bad. Try Fathom.
        
           | pier25 wrote:
           | I'm sure it's a better service but at $14 per month (or $140
           | per year) for a low traffic blog it's too expensive. I don't
           | need anything that fancy.
           | 
           | I moved to Netlify analytics from Goat counter because I
           | liked the idea of having server side analytics, but at $9 per
           | month these are extremely overpriced.
           | 
           | I think I will just go back to Goat counter.
        
       | tannhaeuser wrote:
       | I really hope an analytic genius can come up with a technique
       | (like differential privacy, but I'm no expert here) that would
       | give advertisers what they want (unique visitor counts, and very
       | few other metrics) to place ads on sites, yet doesn't give away
       | too much privacy, nor leads to enslavement under a single central
       | entity. I guess if something like that doesn't come along, then
       | only old school content-based ads (site sponsoring) without any
       | tracking can be considered ethical (or no-ads of course). The
       | argument against content-based ads was always that it doesn't
       | suffice to finance even web hosting let alone content production.
       | But with ad prices going to the bottom, I wonder if the figures
       | still add up in favour of targeted ads today.
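
For readers unfamiliar with differential privacy, its most basic building block is the Laplace mechanism: publish a count with random noise added, scaled so that any single visitor's presence is hidden. The sketch below is only an illustration of that mechanism under the assumption that one visitor changes the count by at most 1; it is not a complete scheme for ad measurement, and the function name is hypothetical.

```python
import random


def noisy_unique_visitors(true_count: int, epsilon: float, rng=random) -> float:
    """Return the count plus Laplace noise of scale 1/epsilon.

    Smaller epsilon means more noise and stronger privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    # The difference of two independent exponential samples with rate
    # epsilon follows a Laplace distribution with scale 1/epsilon.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise
```

The noise averages out to zero, so large aggregate figures stay useful while an individual visit cannot be confirmed from the published number.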
        
         | lmkg wrote:
         | Apple/Webkit already has a proposal along those lines.
         | 
         | https://webkit.org/blog/8943/privacy-preserving-ad-click-att...
        
       | gramakri wrote:
       | We have been using a self-hosted matomo for our company site for
       | years now (back from when it was called piwik). Highly recommend
       | it! The satisfaction you get out of not using any google product
       | is unsurpassed.
        
       ___________________________________________________________________
       (page generated 2020-06-18 23:00 UTC)