
How to set up Umami.is, a self-hosted analytics service


Disclaimer

I know that this article will be about touchy subjects, so let’s start with the important stuff:

With that out of the way, let’s continue.

[Image: Umami dashboard, a graph with blue columns on a white background showing my blog's hourly traffic]

Why move to self-hosted analytics

Since this blog started at the end of 2022, I have been using GoatCounter as my source of analytics.

It’s free, clean, and simple to use. GC offers a free hosted service, so getting started is only a matter of setting up an account and adding a small piece of HTML to your website. I still highly recommend it to anyone wanting a turn-key solution for their website. And they are donation-based; you can help them here.

The main issue I was having with GoatCounter was that every adblocker was cutting them out, as the JS script used for counting visits was calling a well-known, external service. I could clearly see in the nginx logs that I was getting much more traffic than GC recorded.

On the other hand, a self-hosted analytics service runs on, and communicates only with, your own server, so adblockers have no reason to block it unless they are specifically provided with the server’s domain.

And also, you know how it is with homelabbing and self-hosting, from time to time you want to check out different tools :) I could have switched to a self-hosted GC instance, but I wanted some change, and so I tried out Umami, and so far I have been very happy with it.

So let me share with you how I installed and configured Umami.

Installation

As for the installation, I went the way I usually go with such services, using Docker Compose to run them in containers.

(all terminal snippets taken from Umami docs)

git clone https://github.com/umami-software/umami.git
cd umami
docker-compose up -d

This will get you a database and the Umami app listening on port 3000. If you installed it on your VPS, you should be able to access it at http://<domain>:3000. It’s possible you will need to add a custom rule to your firewall to allow access to that port. Don’t worry, it’s only temporary.
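For example, on a server using ufw (an assumption on my part, adapt this to whatever firewall you run), temporarily opening the port looks like this:

sudo ufw allow 3000/tcp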

You can now visit Umami’s dashboard and set up a secure admin password.

Do not add a website yet! First we need to set up a proper way of accessing Umami on the server, and for that we need to have a subdomain and configure it in nginx.

Using a subdomain

Accessing Umami over a non-standard port and without HTTPS is far from a good solution. What I did was set up a subdomain with my domain provider and configure it in nginx.

To set up a subdomain, I opened the admin dashboard at the company where I keep all my domains, went to my main domain stfn.pl, and added a subdomain umami.stfn.pl. I pointed that subdomain to the same IP address as the main domain. It will be nginx’s responsibility to sort out the traffic.
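In the DNS zone this boils down to a single A record, something along these lines (the IP below is a documentation placeholder, use your server’s real address):

umami.stfn.pl.    IN    A    203.0.113.10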

Nginx reverse proxy config

I am using nginx as my HTTP server. Adhering to best practices, I created a separate configuration file at /etc/nginx/conf.d/umami.conf. I am using nginx’s “reverse proxy” functionality, in which it passes requests coming in for a specific domain to a service running in the backend, forwarding them from the standard HTTP port 80 to the different port used by the backend service.

The configuration is rather straightforward. server_name is the domain which this config file is monitoring. proxy_pass defines where the requests should be passed.

The proxy_set_header directives define which headers should be carried on to the backend. I added them because, without them, the country information would not reach Umami. The problem is described in this GitHub issue.

server {
    # respond to requests for the analytics subdomain
    server_name umami.stfn.pl;

    location / {
        # forward everything to the Umami container on port 3000
        proxy_pass http://localhost:3000;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "Upgrade";
        # pass the original host and visitor address through,
        # otherwise country information never reaches Umami
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-Proto https;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Host $host;
    }
}

Once you have the config file saved, it’s good practice to first test the new config using

sudo nginx -t
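On a healthy configuration, the output looks something like this:

nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful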

Nginx will point you to the offending line if there are any problems with the configuration file. If all is fine, all that is left is to restart nginx:

sudo systemctl restart nginx.service

Now Umami should be available at the subdomain you defined. Remove the firewall rule for port 3000, it won’t be needed anymore.
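If you added the ufw rule from earlier, removing it is the mirror image of adding it:

sudo ufw delete allow 3000/tcp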

The next part is setting up HTTPS. For that I am using Certbot. Setting up Certbot is very easy, everything is described on this single page.
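With the nginx plugin installed, requesting a certificate for the new subdomain is a single command (assuming the standard certbot nginx integration):

sudo certbot --nginx -d umami.stfn.pl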

Certbot will automatically update your nginx config, adding all of the required stanzas. Once Certbot does its thing, you will only be able to access your new subdomain over HTTPS.

Configure Umami for data collection

Configuring Umami for data collection is described here and here in the docs.
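In short: you add the website in the Umami dashboard, which generates a website ID, and then place the tracking script in your site’s <head>. With current Umami versions the snippet has this shape (the ID below is a placeholder):

<script defer src="https://umami.stfn.pl/script.js" data-website-id="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"></script>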

Once you go through those steps, you should start seeing traffic on your Umami dashboard. But there is one issue: any time you access your own website, your visit is counted as well, which skews the results. Fortunately, there are ways to mitigate this.

Disabling Umami for dev work and local machines

For my blog I am using Astro, whose JSX-like templates can differentiate between development and production builds using environment variables (import.meta.env, courtesy of Vite). To stop Umami from collecting data while I am developing my blog, I modified the tracking code:

<head>
	{
		/* render the tracking script only in production builds */
		import.meta.env.MODE === "production" && (
			<script
				defer
				src="link"
				data-website-id="xxx"
			/>
		)
	}
</head>

This way the script is only rendered when the site is built with yarn build.

To stop my own visits to my blog from being counted, I used the solution described in this GitHub issue. Setting a key in your browser’s local storage stops Umami from collecting data on a given website in that browser. To do so, run the code below in the dev-tools console while your site is open:

localStorage.setItem('umami.disabled', 1)
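To turn tracking back on later, remove the key:

localStorage.removeItem('umami.disabled')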

Bottom text

And that’s it! I am happy to have switched to self-hosted analytics. I feel that I am more in control of what is gathered and how, I can access the raw data in the database container, and I am not at the mercy of any external service.

There are of course downsides to self-hosting stuff. One is a visible increase in the server’s load, as it needs to run the required Docker containers. If you are using a very, very low-end machine for your website, that might become an issue, especially during traffic spikes. Another is that you are storing the data, so you are responsible for backups. Which can of course be seen either as a problem, or as an opportunity to practice backups and backup restoration. Depends on your viewpoint :)

Thanks for reading!

If you enjoyed this post, please consider helping me make new projects by supporting me on the following crowdfunding sites: