STFN

The decline of Google and rise of alternative searches as the source of traffic

The first part of the story is that, as I already wrote here and here, I am using self-hosted Umami as the analytics engine for this blog. I am using it because I am curious to know how many people visit my blog, and I like numbers and graphs. Of course, in this day and age, saying "people" is a stretch, because you can never be sure whether a visit comes from a human or a bot. Umami does filter out a lot of the automated traffic because it requires JS on the client to count the visit, but it is not foolproof.

The second part is that lately I've been into Data Science and ML. I am doing an Associate Data Scientist course at datacamp.com and combining it with exercises on Kaggle. Right now I am focusing on Pandas: sadly not the monochrome bears, but a library for data analysis and manipulation.

And the third part is that for the last few months I have been witnessing a steady decline in visits coming to my blog from Google and, on the other appendage, a rise in visitors finding my blog through other search engines. This seems to be in line with the fact that Google is pushing more and more AI into its search engine, to the point that it plans to no longer show search results as links, which in turn causes people to leave Google Search for other search engines.

This blog post is an attempt to connect those dots, and see in concrete numbers how the visits coming from various search engines have been changing for the last two years. Two years, because I have been using Umami since July 2024.

Here's how I did it.

Data Extraction

I am running Umami as a Docker Compose service on one of my VPSes. The service consists of the web worker container and a Postgres container running the database. I cannot, and don't want to, connect straight to that database, because it is not reachable from outside the Docker internal network, and should not be.

The first step was to create a dump of the database by running the command below on the VPS:

docker compose exec -t db pg_dumpall -U umami > dump.sql

This extracts the database into a .sql file, which I then copied to my laptop with scp.

I have a PostgreSQL database running in an LXC container in my Proxmox node in the homelab, which is easily reachable from my laptop. So I decided that the next step was to scp the dump file to the database LXC, restore the dump to that database, and use it to query the data. But first I had to create the user and the database in my local PostgreSQL.

su postgres
createuser --pwprompt umami
createdb -O umami umami
# providing host enforces password authentication
psql --username umami -h localhost -W umami < dump.sql 

Now I have a mirror of the Umami database in my local homelab to which I can easily connect and ingest the data.

Data Analysis

During the last P.I.W.O (Poznań Free Software Fest) I attended a lecture about Marimo, which aims to be the next-gen replacement for Jupyter Notebooks. The next day I installed it on my laptop and got instantly hooked. Marimo feels so nice, much more polished than Jupyter, and fits perfectly into my current Data Science plot arc. And so it has been the tool of choice for this investigation.

Here's how I did the analysis in Python and Pandas:

First, connecting to the database and fetching the data:

import pandas as pd
import psycopg2
from sqlalchemy import create_engine

host = "192.168.88.XXX"
port = 5432
dbname = "umami"
username = "umami"
pwd = "correcthorsebatterystaple"

# Build the connection URL from the variables defined above
engine = create_engine(
    f"postgresql+psycopg2://{username}:{pwd}@{host}:{port}/{dbname}",
    connect_args={"options": "-c client_encoding=utf8"},
)

visits = pd.read_sql("SELECT * FROM website_event where website_id = 'b28dd954-acaa-48b1-a1db-dd161dd35d98';", con=engine)

This is basic, self-explanatory Python. Import the packages and define the variables for the connection. Then use SQLAlchemy, a popular SQL toolkit and ORM, to do the actual connection handling, and finally load the table into a Pandas DataFrame. website_event is the table storing the visits to my blog, with each row being a single visit to a page. One thing of note is the WHERE clause: I have more than one website tracked in Umami, so I had to filter the visits to include this blog only.
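One thing to watch with a connection URL built from an f-string is that characters such as @ or / in the password would break it. The password above has none, so this is only a precaution. Here is a minimal sketch that percent-encodes the credentials first. The helper name build_postgres_url and the example password are made up for illustration.

```python
from urllib.parse import quote_plus


def build_postgres_url(username, password, host, port, dbname):
    """Return a SQLAlchemy-style URL with the credentials percent-encoded.

    Hypothetical helper, not part of SQLAlchemy or of the original notebook.
    """
    return (
        f"postgresql+psycopg2://{quote_plus(username)}:{quote_plus(password)}"
        f"@{host}:{port}/{dbname}"
    )


# Made-up password containing characters that are unsafe in a URL
url = build_postgres_url("umami", "p@ss/word", "localhost", 5432, "umami")
print(url)
```

The resulting string can be passed to create_engine in place of the hand-written f-string.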

And the analysis itself:

summary = (
    visits
    .groupby(pd.Grouper(key='created_at', freq='MS'))
    .agg(
        all_visits=("event_id", "count"),
        google_visits=('referrer_domain', lambda x: x.str.contains('google', case=False, na=False).sum()),
        ddg_visits=('referrer_domain', lambda x: x.str.contains('duckduckgo', case=False, na=False).sum()),
        bing_visits=('referrer_domain', lambda x: x.str.contains('bing', case=False, na=False).sum()),
        ecosia_visits=('referrer_domain', lambda x: x.str.contains('ecosia', case=False, na=False).sum()),
        qwant_visits=('referrer_domain', lambda x: x.str.contains('qwant', case=False, na=False).sum()),
        startpage_visits=('referrer_domain', lambda x: x.str.contains('startpage', case=False, na=False).sum()),
        kagi_visits=('referrer_domain', lambda x: x.str.contains('kagi', case=False, na=False).sum()),
        brave_visits=('referrer_domain', lambda x: x.str.contains('search.brave', case=False, na=False).sum())
    )
    .reset_index()
    .sort_values(by='created_at')
)

summary["non_google_visits"] = (
    summary["ddg_visits"] 
    + summary["bing_visits"]
    + summary["ecosia_visits"]
    + summary["qwant_visits"]
    + summary["startpage_visits"]
    + summary["kagi_visits"] 
    + summary["brave_visits"]
)
summary["ratio"] = summary["non_google_visits"] / summary["google_visits"] * 100

# drop the last row because it's for the current month, which has not finished yet
summary.drop(summary.tail(1).index, inplace=True)

final = summary[["created_at", "google_visits", "non_google_visits", "ratio"]]

This is where things get serious. First I take the dataframe and group it by months. Then I do the aggregation, collecting the counts of traffic coming from different search engines based on the referrer of the request.

The next step is to add a column with the sum of all visits coming from search engines other than Google.
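To see the grouping and aggregation in isolation, here is a small sketch of the same pattern on a handful of made-up rows. The toy frame and the count_matches helper are assumptions for illustration. Only the column names (event_id, created_at, referrer_domain) come from the code above.

```python
import pandas as pd

# Made-up rows standing in for website_event; not real data from the blog
toy = pd.DataFrame(
    {
        "event_id": [1, 2, 3, 4, 5, 6],
        "created_at": pd.to_datetime(
            ["2025-01-03", "2025-01-15", "2025-01-20",
             "2025-02-02", "2025-02-10", "2025-02-25"]
        ),
        "referrer_domain": [
            "google.com", "duckduckgo.com", None,
            "www.google.pl", "bing.com", "kagi.com",
        ],
    }
)


def count_matches(pattern):
    """Build an aggregator that counts referrers containing `pattern`."""
    return lambda s: s.str.contains(pattern, case=False, na=False).sum()


toy_summary = (
    toy.groupby(pd.Grouper(key="created_at", freq="MS"))  # one row per month
    .agg(
        all_visits=("event_id", "count"),
        google_visits=("referrer_domain", count_matches("google")),
        ddg_visits=("referrer_domain", count_matches("duckduckgo")),
        bing_visits=("referrer_domain", count_matches("bing")),
        kagi_visits=("referrer_domain", count_matches("kagi")),
    )
    .reset_index()
)

# Sum the non-Google engines row by row
toy_summary["non_google_visits"] = toy_summary[
    ["ddg_visits", "bing_visits", "kagi_visits"]
].sum(axis=1)

print(toy_summary)
```

In this toy frame January ends up with one Google and one non-Google referral, and February with one Google and two non-Google referrals. The visit without a referrer is counted in all_visits only.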

Finally, the metric I chose is non-Google search traffic expressed as a percentage of Google traffic.

It works like this: if in a given month I had the same number of visits from Google as from other search engines, the percentage is 100%. If for every 10 visits coming from Google I had four from other search engines, the result is 40%. Not sure if this is the correct way, probably not? I am just a software developer, I don't have a theoretical background in maths or statistics. If you know the industry standard for this, let me know! Nonetheless, this metric gave me the answer to my question, so I guess it did OK?
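As a quick check of the arithmetic, here is a sketch of the metric as a plain function, next to one possible alternative: the non-Google share of all search-engine visits. The alternative is only another way to slice the same counts, not a claim about what the industry standard is. Both function names are made up.

```python
def non_google_per_100_google(google, non_google):
    """Metric from the post: non-Google visits as a percentage of Google visits."""
    return 100 * non_google / google


def non_google_share(google, non_google):
    """Alternative (an assumption, not from the post): non-Google visits
    as a percentage of all search-engine visits."""
    return 100 * non_google / (google + non_google)


# The two examples from the text: equal traffic, and 4 visits per 10 from Google
print(non_google_per_100_google(10, 10))
print(non_google_per_100_google(10, 4))
print(round(non_google_share(10, 4), 1))
```

The first metric can go above 100% once other engines overtake Google, while the share always stays between 0% and 100%.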

Another cool thing about Marimo is that it has built-in tools for creating graphs from dataframes, and this is what I used to create the visualisations below.

And?

Is Google Search a declining source of traffic for my blog?

Graph showing the ratio of visits coming from Google vs other search engines; the graph shows that usage of non-Google search engines keeps growing.

Graph showing the number of monthly visits coming from Google; there is a rather clear downward trend with a single outlier month.

The answer is: I would say yes? The graph speaks for itself. In 2024, non-Google search engine traffic was around 10-15% of the number of visits from Google, and in the last few months it has been around 35%. The downward spike in October 2025 is an outlier: that month my blog post about the Orange Pi Zero 3 was promoted on Google Discover, and for two days I was getting a lot of visits from there.

The second graph is the monthly number of visits from Google, showing a downward trend, again with that single outlier month.
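One way to check that a single outlier month is not driving the picture is to smooth the series with a rolling median. This was not part of the original analysis. It is a minimal sketch on made-up numbers, with one artificial dip standing in for the outlier.

```python
import pandas as pd

# Made-up monthly ratios with one artificial dip; not the blog's real numbers
ratio = pd.Series(
    [12.0, 14.0, 13.0, 20.0, 5.0, 26.0, 30.0],
    index=pd.date_range("2025-01-01", periods=7, freq="MS"),
    name="ratio",
)

# A centred 3-month rolling median is not pulled down by one extreme month.
# The first and last months have no full window, so they come out as NaN.
smoothed = ratio.rolling(window=3, center=True).median()

print(pd.DataFrame({"ratio": ratio, "smoothed": smoothed}))
```

On the real data the same call could be applied to the ratio column of the summary dataframe.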

The bottom line

I hope that this is good news. The biggest strengths of the Internet are its decentralisation and egalitarianism, and the monopoly of Google Search has been the total opposite of those. I am happy to see that other search engines are growing in popularity and I hope this trend will continue. Web search should do what it says on the tin: search and provide links to websites. Replacing links with an AI-generated summary that has a high chance of distorting their content is just not the right way, and it is dangerous for the openness and freedom of the web.

P.S. Personally I use Startpage as my default search engine.

Thanks for reading!

