GA4 → Rybbit migration: self-host TCO, real event mapping, and the contingency plan if Rybbit goes away

By Lucas Brandao · São Paulo · verified 2026-05-05

Rybbit is the new arrival on the open-source analytics shelf: under twelve months old, AGPL-3.0, ClickHouse-backed, with a dashboard that copies GA4's information density rather than fighting it. That makes it the most "GA4-shaped" alternative I have tested. It also means Rybbit is the riskiest pick on maturity grounds, so this page front-loads two things the other migrations bury: the real Hetzner self-host TCO and the exit plan for the day Rybbit's GitHub goes quiet. Test stand below: a Hetzner CX22 running docker compose up, a Hugo site, 80,000 pageviews, three weeks of parallel data, and the schema dump I keep on a quarterly cron.

Figure 1. Rybbit migration timeline. Three-week parallel run because the platform is young — extra week buys you a second weekend cycle and a second Tuesday-morning peak before you commit. Quarterly hedge step is unique to Rybbit; the other migrations skip it.

Why teams move from GA4 to Rybbit (and who shouldn't)

Three triggers, all narrower than the Plausible or Matomo audience. The first is GA4-ergonomic burnout — engineers who liked Universal Analytics, never made peace with GA4's report-builder, and want a dashboard that shows pageviews in three clicks instead of seven. Rybbit's UI is the closest thing in the open-source space to "what GA4 would look like if Google had not over-engineered it."

The second is privacy without the lecture — a cookieless default that does not require a consent banner under current EDPB guidance, but with enough event flexibility to cover an e-commerce funnel. Plausible and Umami have the cookieless story too; Rybbit's distinctive contribution is keeping the event model close enough to GA4 that the migration spreadsheet is short.

The third is GA4 BigQuery cost. At 2 M events/month I was paying around $155/month for streaming exports plus storage. The Hetzner CX22 self-host runs €4.51/month, so the math is not subtle: the switch cuts about $1,840/yr at 2 M events/month.

If you are not in those buckets, save the project. Rybbit is the wrong move when you do under 1,000 visits/day (use Plausible Cloud or the umami.is free tier — the self-host overhead is not worth it), when you have a Looker Studio reporting pipeline you cannot rebuild (Rybbit has no native Looker connector), when AdSense or Google Ads attribution is load-bearing in your reports, or when your team has zero DevOps capacity to keep a Docker host alive. The "free as in beer" framing is misleading: you trade dollars for hours, and someone has to do the upgrades.

What Rybbit replaces in GA4, and what it doesn't

The honest matrix. Rybbit gets you most of the GA4 surface that engineers actually use, plus the parts Plausible omits.

Capability	GA4	Rybbit	Plausible	Notes
Pageviews / sessions	✓	✓	✓	Rybbit uses 30-min window, matches GA4
Custom events	✓	✓ JS API	✓ goal-based	`customEvent(name, properties)`
Funnels (multi-step)	✓	✓	Cloud paid only	built-in, no goal-creation step
Goals / conversions	✓	✓	✓	Rybbit goals derive from custom events
Revenue tracking	✓	✓	partial	last-click, no MTA model
License	proprietary	AGPL-3.0	AGPL-3.0	both copyleft, fork-friendly
Database	BigQuery	ClickHouse + Postgres	ClickHouse + Postgres	portable schema, dump/restore works
BigQuery export	✓	—	—	SQL-direct on ClickHouse instead
Looker Studio connector	✓	—	—	community-built only
AdSense / Ads attribution	✓	—	—	not on roadmap
Heatmaps / session replay	partial	—	—	none of the three OSS tools
Audiences / segments	✓	partial	—	filter-based, not stored cohorts

If your weekly workflow includes "schedule a Looker Studio email to the marketing director," Rybbit is not your tool — you would either build a Grafana dashboard against ClickHouse or stay on GA4. If your funnel reporting needs multi-touch attribution across paid and organic, the same caveat applies. Rybbit is for teams whose analytics question is "what happened on the site this week," not "what was the assisted-conversion path of the August campaign."

Privacy, the cookie banner, and EU hosting

Rybbit's default mode is cookieless: no _ga, no fingerprint, no persistent identifier across days. Visitor identification is a daily-rotating salted hash of (IP + user-agent + site-id), which under the EDPB's December 2024 guidance on cookieless trackers does not require consent in most EU jurisdictions. The current Schrems II posture treats US-hosted analytics as a transfer risk for personal data; Rybbit self-hosted in Hetzner Falkenstein (FSN1) sidesteps that entirely.
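
To make the daily-rotating hash concrete, here is a minimal Node.js sketch. It is not Rybbit's actual implementation: the function names, the HMAC construction, and deriving the salt from a server secret plus the UTC date are all assumptions for illustration. A stricter design would generate a random salt each day and discard the old one.

```javascript
const { createHmac } = require('node:crypto');

// Hypothetical helper: derive a salt that changes once per UTC calendar day.
function dailySalt(serverSecret, date) {
  const day = date.toISOString().slice(0, 10); // e.g. "2026-05-05"
  return createHmac('sha256', serverSecret).update(day).digest();
}

// Hypothetical helper: hash (IP + user-agent + site-id) under today's salt.
// The same visitor gets the same id within a day and a new one the next day.
function visitorId({ ip, userAgent, siteId }, serverSecret, date = new Date()) {
  return createHmac('sha256', dailySalt(serverSecret, date))
    .update(`${ip}|${userAgent}|${siteId}`)
    .digest('hex');
}
```

Because the salt rotates, yesterday's id cannot be joined to today's, which is the property the "no persistent identifier across days" claim rests on.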

The pragmatic recipe I use: pick the Falkenstein region for the CX22, set the Caddy auto-TLS to your apex domain, set RYBBIT_PRIVACY_LEVEL=strict, and skip the consent banner. If your legal team insists on one anyway, Rybbit honours a global data-do-not-track attribute the same way Plausible and Umami do: flip it from your existing CMP and the script falls silent. A note on wording: none of this is "privacy-first" magic. Rybbit simply collects less data, so "less footprint" is the accurate framing, not a marketing posture.

Mapping GA4 events to Rybbit events and properties

This is where Rybbit differs most visibly from Plausible (UI-first goals) and Umami (four auto-events). Rybbit ships two primary tracking calls: pageview(), which fires automatically on script load, and customEvent(name, properties) for everything else. A small set of attribute-driven helpers covers outbound clicks and downloads.

Figure 2. GA4 events flow into two Rybbit primitives — automatic helpers for the four standard signals, plus customEvent() for everything else. Properties are flat key-value, capped at five per event. user_engagement has no Rybbit equivalent and gets dropped.

GA4 event	Rybbit call	Properties / notes
`page_view`	`pageview()` auto	fires on script load + History API hooks
`scroll`	auto-helper	default 90 % threshold, configurable
`click` (outbound)	auto-helper	matches external `<a href>`
`file_download`	auto-helper	extension list editable in dashboard
`purchase`	`customEvent('purchase', {…})`	value, currency, transaction_id, plan
`begin_checkout`	`customEvent('checkout_start')`	step 1 of funnel
`add_to_cart`	`customEvent('add_to_cart')`	flatten item list to JSON string
`video_progress`	`customEvent('video_25/50/75/100')`	one event per quartile, manual
`user_engagement`	—	dropped, no equivalent
`login` / `sign_up`	`customEvent`	method as property
scoped custom dim.	flatten to property	no user-scoped joins
session-scoped attr.	—	dropped, write event-scoped instead

Three categories of pain show up consistently: nested item arrays (e-commerce items[] has to be JSON-stringified into a single property), user-scoped dimensions (Rybbit has no user store, so plan-tier and lifetime-value have to ride along on every event that needs them), and engagement heuristics (the GA4 user_engagement 10-second-active timer has no clean port). Of 120 GA4 events I mapped on the test stand, 89 mapped cleanly via auto-helpers or one customEvent call, 24 needed manual code (the e-commerce funnel and video quartiles), and 7 were dropped: user_engagement, two session-scoped audience signals, and four custom-dimension chains that depended on user-store joins Rybbit does not offer. That 89/24/7 split sits between Fathom's 98/18/4 and Plausible's 87/23/10, and is roughly comparable to Umami's 92/21/7.

The before/after code for a purchase event:

// GA4 gtag.js call
gtag('event', 'purchase', {
  transaction_id: 'T_12345',
  value: 49.00,
  currency: 'USD',
  items: [{ item_id: 'sku_pro', item_name: 'Pro plan', price: 49.00 }]
});

// Rybbit equivalent — items[] flattened to a string property
window.rybbit.customEvent('purchase', {
  transaction_id: 'T_12345',
  value: 49,
  currency: 'USD',
  plan: 'pro',
  item_skus: 'sku_pro'
});

For a scroll event, the auto-helper covers it; you only call customEvent if you want non-default thresholds:

window.rybbit.customEvent('scroll_50', { url: location.pathname });
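
The same "fire once per threshold" logic serves both custom scroll depths and the manual video quartiles from the mapping table. A minimal sketch, assuming a hypothetical helper named makeThresholdTracker that is not part of Rybbit; the emit callback is where you would call window.rybbit.customEvent:

```javascript
// Hypothetical helper: calls emit(threshold) exactly once for each threshold,
// the first time the reported progress percentage reaches or passes it.
function makeThresholdTracker(thresholds, emit) {
  const pending = [...thresholds].sort((a, b) => a - b);
  return function onProgress(percent) {
    while (pending.length > 0 && percent >= pending[0]) {
      emit(pending.shift());
    }
  };
}

// Browser wiring (assumed, shown as comments so the sketch runs in Node):
// const onVideo = makeThresholdTracker([25, 50, 75, 100], (q) =>
//   window.rybbit.customEvent(`video_${q}`, { url: location.pathname }));
// video.addEventListener('timeupdate', () =>
//   onVideo((video.currentTime / video.duration) * 100));
```

Keeping the tracker free of DOM access makes it easy to unit-test and lets one function cover scroll and video.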

And for a click on a key CTA, the cleanest pattern is a data-attribute that the auto-helper picks up without any JS handler at all:

<button data-rybbit-event="cta_pricing_click" data-rybbit-prop-plan="pro">
  Buy Pro
</button>

Properties are flat key-value, capped at five per event. If your GA4 schema relies on event.parameter.user.plan_tier-style nesting, you flatten it on the way out and accept that the join story belongs in ClickHouse SQL, not in the dashboard.
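
The flattening step can be sketched as a small guard function. This is an illustration under assumptions: toFlatProps is a hypothetical helper, joining item ids with commas is one option among several (the mapping table suggests a JSON string instead), and the five-property cap is taken from the text above.

```javascript
const MAX_PROPS = 5; // per-event cap stated in this article

// Hypothetical helper: turn GA4-style params into flat Rybbit properties.
// items[] collapses into one comma-joined item_skus string; anything still
// nested, or more than MAX_PROPS keys, is rejected before it is sent.
function toFlatProps(ga4Params, extra = {}) {
  const { items = [], ...rest } = ga4Params;
  const flat = { ...rest, ...extra };
  if (items.length > 0) {
    flat.item_skus = items.map((item) => item.item_id).join(',');
  }
  for (const [key, value] of Object.entries(flat)) {
    if (value !== null && typeof value === 'object') {
      throw new TypeError(`property "${key}" is nested; flatten it first`);
    }
  }
  const count = Object.keys(flat).length;
  if (count > MAX_PROPS) {
    throw new RangeError(`${count} properties exceeds the cap of ${MAX_PROPS}`);
  }
  return flat;
}

// Browser usage (assumed):
// window.rybbit.customEvent('purchase', toFlatProps(ga4Params, { plan: 'pro' }));
```

Throwing instead of silently truncating keeps mapping mistakes visible during the parallel run.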

If your event list is custom, run our Event Mapping Wizard — paste a GA4 events export, get the matching Rybbit customEvent() snippet in two minutes.

Parallel-run setup: 14–21 days dual tagging

Both scripts fire client-side, sequentially. There is no server-side option in Rybbit at the time of writing, which simplifies the wiring but also means you cannot replay against a measurement-protocol endpoint the way you can with GA4. Practical setup: drop the Rybbit data-site-id snippet right after the existing gtag.js, deploy, watch both dashboards for 24 hours, then start the daily reconciliation log.

The expected gap during the parallel run: Rybbit has no consent gate by default, so on a site that runs a GA4 cookie banner you should expect +5 % to +12 % more pageviews in Rybbit. That is the share of users who declined the banner and now get counted. The same magnitude shows up for Plausible, Umami, and Fathom, so it is a banner artifact, not a Rybbit-specific quirk. The pattern is documented on the methodology page.

A reconciliation week-2 example from my stand:

Metric	GA4	Rybbit	Δ %	Status	Why
Pageviews	78,402	82,118	+4.74 %	yellow	banner declines
Sessions (visits)	49,210	52,084	+5.84 %	yellow	banner declines
Custom events	2,141	2,118	−1.07 %	green	mapping OK
Conversions (purchase)	312	309	−0.96 %	green	funnel intact
Revenue (USD)	$15,288	$15,141	−0.96 %	green	value property preserved

Figure 3. Five week-2 metrics on the tolerance axis. Custom events, conversions, and revenue land green — the value-bearing signals are intact. Pageviews and sessions sit yellow on the inflation side, exactly the banner-decline pattern documented in the methodology.
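
The Δ % and status columns in the week-2 table can be reproduced with a few lines. A minimal sketch: the function name and the tolerance bands (green within 2 %, yellow within 12 %, red beyond) are my assumptions for illustration, chosen to agree with the table; they are not quoted from the methodology page.

```javascript
// Hypothetical helper: percentage delta of Rybbit against the GA4 baseline,
// plus a traffic-light status based on assumed tolerance bands.
function reconcile(metric, ga4, rybbit, { green = 2, yellow = 12 } = {}) {
  const deltaPct = ((rybbit - ga4) / ga4) * 100;
  const abs = Math.abs(deltaPct);
  const status = abs <= green ? 'green' : abs <= yellow ? 'yellow' : 'red';
  return { metric, deltaPct: Number(deltaPct.toFixed(2)), status };
}

const week2 = [
  reconcile('Pageviews', 78402, 82118),
  reconcile('Sessions', 49210, 52084),
  reconcile('Custom events', 2141, 2118),
  reconcile('Conversions', 312, 309),
  reconcile('Revenue', 15288, 15141),
];
console.table(week2);
```

Run it daily against both exports and keep the output in the reconciliation log; a metric that drifts from green to yellow is the signal to recheck the mapping.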

Skip the spreadsheet — feed two CSVs into the Parallel-Run Validator and it flags red cells automatically.

Test stand: Hetzner Cloud CX22 (€4.51/mo, 2 vCPU x86, 4 GB RAM, Ubuntu 24.04 LTS) running Rybbit self-host via docker-compose (Rybbit + ClickHouse 24.x + Postgres 16 + Caddy reverse proxy with auto-TLS), against a Hugo 0.140 static site, 80 K pageviews over 3 weeks parallel-run April–May 2026, ~64 % EU traffic, GA4 baseline = production property no sampling. Both scripts client-side. Daily reconciliation. Raw CSVs at github.com/lucasbrandao/migrate-tests/run-058. Compose file at github.com/lucasbrandao/rybbit-stand.

Exporting GA4 history (and why Rybbit can't import it)

Plain version: Rybbit has no GA4 importer, no OAuth flow, no CSV uploader, and the maintainers have not signalled one is coming. Your live dashboard restarts at zero on cutover day. Plan around that, not against it.

Three paths for keeping the history accessible:

Path A: BigQuery archive (recommended). Before cutover, link your GA4 property to BigQuery if you have not already. The link exports events only from the day it is enabled and does not backfill older data, so turn it on as early as you can, then copy the accumulated export tables into one archive table and keep it as cold storage. You query it with SQL when an exec asks "how did Q2 2025 compare." Storage cost on BigQuery for a year of typical-publisher data is under $5/month. The SQL pattern:

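-- Archive 2024–2025 GA4 events to a long-term BQ table.
-- The export stores event_date as a 'YYYYMMDD' string, so parse it to a DATE
-- before using it as the partitioning column.
CREATE TABLE `proj.archive.ga4_events_2024_2025`
PARTITION BY event_date
AS
SELECT PARSE_DATE('%Y%m%d', event_date) AS event_date, event_name, user_pseudo_id,
       event_params, event_timestamp, geo, device
FROM `proj.analytics_NNNNNN.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20240101' AND '20251231';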

Path B: CSV dump to ClickHouse. If you want the history visible alongside Rybbit data, export GA4 events from BigQuery to CSV, normalize the schema to Rybbit's event table shape (created_at, site_id, session_id, url_path, referrer_domain, event_name, properties), and bulk-load by piping the file into clickhouse-client --query="INSERT INTO event FORMAT CSV" on stdin. Budget half a day including schema mapping. This is custom work; there is no plugin.

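Path C: leave it in GA4 read-only. Stop sending new events but do not delete the property. GA4 keeps event-level data for at most 14 months on a standard property, and the default setting is 2 months, so raise the retention setting before you rely on this path. After the window closes you have nothing, which is why Path A is the recommended one.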
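
For Path B, the normalization step could look like the sketch below. It assumes the BigQuery export was saved as JSON rows (one object per event, with event_params as an array of key/value records) and uses the column order listed under Path B. The function names, the choice of user_pseudo_id plus ga_session_id as the session id, and putting leftover parameters into the properties JSON are all illustrative assumptions; check SHOW CREATE TABLE event on your own install before loading anything.

```javascript
// Parameters that become dedicated columns rather than properties entries.
const COLUMN_PARAMS = new Set(['page_location', 'page_referrer', 'ga_session_id']);

// GA4 stores each parameter value in one of several typed fields.
function paramValue(param) {
  const v = param.value || {};
  return v.string_value ?? v.int_value ?? v.double_value ?? v.float_value ?? null;
}

// Quote a CSV field when needed, doubling embedded double quotes.
function csvField(value) {
  const s = value === null || value === undefined ? '' : String(value);
  return /[",\n\r]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
}

function parseUrl(raw) {
  try { return raw ? new URL(raw) : null; } catch { return null; }
}

// Hypothetical mapper: one GA4 export row in, one CSV line out.
function ga4RowToCsvLine(row, siteId) {
  const params = new Map((row.event_params || []).map((p) => [p.key, paramValue(p)]));
  // event_timestamp is microseconds since the Unix epoch (UTC).
  const createdAt = new Date(Number(row.event_timestamp) / 1000)
    .toISOString().slice(0, 19).replace('T', ' ');
  const page = parseUrl(params.get('page_location'));
  const referrer = parseUrl(params.get('page_referrer'));
  const properties = {};
  for (const [key, value] of params) {
    if (!COLUMN_PARAMS.has(key) && value !== null) properties[key] = value;
  }
  return [
    createdAt,
    siteId,
    `${row.user_pseudo_id}.${params.get('ga_session_id') ?? ''}`,
    page ? page.pathname : '',
    referrer ? referrer.hostname : '',
    row.event_name,
    JSON.stringify(properties),
  ].map(csvField).join(',');
}
```

Load a few hundred rows into a scratch table first and compare counts per event_name against BigQuery before touching the live table.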

Cutover, four things that always break, and the Rybbit-shutdown contingency plan

Rybbit's container surface is small but not trivial: four Docker services, one of them the Caddy reverse proxy. Four things have bitten me or the people I have helped migrate.

1. ClickHouse OOM on the 4 GB CX22 under traffic spikes. Default ClickHouse memory settings assume more RAM than the CX22 has. On a Tuesday-morning email-blast spike I watched the container OOM-kill itself twice. Fix: cap max_memory_usage_for_user at 2147483648 (2 GB) in the ClickHouse config and add a 2 GB swap file on the host. Permanent fix, five-minute change, do it on day one before you need it.

2. Caddy auto-TLS fails behind Cloudflare proxying. If your DNS is Cloudflare-proxied (orange cloud), Caddy's HTTP-01 challenge cannot reach the origin and TLS issuance silently fails. Either disable proxying for the analytics subdomain or switch Caddy to the DNS-01 challenge with a Cloudflare API token. The error message is unhelpful — you mostly notice when the dashboard 502s.

3. SPA route changes are not always auto-tracked. Same problem Plausible, Umami, and Fathom have. Rybbit listens to full-page loads and the History API hooks, but if your router does pushState in a non-standard wrapper, you have to call rybbit.pageview() manually inside the router's after-navigation hook (afterEach in Vue Router). Two-line fix; if you skip it, every visit shows as one pageview.

4. Hostname-pin drift after a domain change. The Rybbit site-id is bound to a hostname at create-time. If you migrate from www.example.com to example.com mid-test you will lose continuity unless you update the site config first. I learned this the hard way on run-052 — six hours of data went into the wrong bucket.
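
For item 3, the manual pageview hook can be sketched as follows. It assumes a router that exposes afterEach(callback) and passes a route object with a fullPath field, as Vue Router does; installPageviewHook is a hypothetical name, and the tracker argument stands in for window.rybbit.

```javascript
// Hypothetical helper: fire one pageview per distinct route change.
function installPageviewHook(router, tracker) {
  let lastPath = null;
  router.afterEach((to) => {
    if (to.fullPath === lastPath) return; // ignore repeated navigations to the same route
    lastPath = to.fullPath;
    tracker.pageview();
  });
}

// Browser usage (assumed):
// installPageviewHook(router, window.rybbit);
```

Watch the dashboard after deploying this: if the History API hooks also catch your router, route changes will be counted twice and the manual hook should come back out.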

Rybbit-shutdown contingency. The project is under twelve months old. The maintainer team is small. AGPL-3.0 means the source survives any company-side outcome, but the convenience of "they ship updates" does not. The hedge plan I run on quarterly cron:

Postgres metadata — pg_dump rybbit_pg | gzip > backups/pg-$(date +%F).sql.gz
ClickHouse events — clickhouse-client --query="SELECT * FROM event FORMAT Native" > backups/ch-$(date +%F).native
Schema export — clickhouse-client --query="SHOW CREATE TABLE event" stored in git alongside the compose file
Warm fallback — Plausible Cloud trial account on standby, script tag staged behind a feature flag, so a swap is one deploy

If Rybbit's GitHub goes silent for six months, you fork the last good release, keep it patched, and accept that you are now on the maintenance hook. If that prospect is unacceptable, the right call is Plausible or Matomo, both of which have multi-year track records and active commercial entities behind them. Picking Rybbit is a calculated bet that the upside (GA4-shaped UI, ClickHouse, AGPL) is worth the maturity risk. Make the bet with eyes open.

FAQ

Is Rybbit production-ready?

Honestly, with caveats. Rybbit is under twelve months old, AGPL-3.0, and has roughly four thousand self-hosted instances reported on GitHub at the time of writing. The codebase is small enough to read, the schema is portable, and the pace of releases has been weekly. I would ship it for a content site or an internal dashboard today; I would not ship it as the system of record for a payments funnel without a fallback tag still firing. The risk is not "it breaks" — it is "the maintainer team gets pulled onto something else and updates stop."

What does the self-host actually cost in real money?

On a Hetzner Cloud CX22 in Falkenstein the box itself is €4.51/month for 2 vCPU and 4 GB RAM. ClickHouse storage on the local SSD is part of the box; budget another €2/month for off-box backups to Hetzner Storage Box if you want them. Total: under €7/month for sites doing under roughly two million events per month. Against the GA4 BigQuery export bill at the same volume, that saves about $1,840/yr. Above 2 M events you want to step up to a CX32 (€8.34/mo, 8 GB RAM), which is still well under the GA4-equivalent line.

Can I keep my GA4 historical data after migrating to Rybbit?

Not natively. Rybbit has no GA4 importer and there is no plan to add one. The pragmatic path is to archive your GA4 history into BigQuery before GA4's retention window (14 months at most) expires, then query it as cold storage with SQL when needed. Your live Rybbit dashboard restarts at zero. Plan for that gap in your reporting deck. If you need the history visible inside Rybbit itself, you can dump GA4 events to CSV from BigQuery and bulk-load into the ClickHouse event table. That is half a day of custom work, with no plugin available.

Does Rybbit work with consent banners?

Yes, and on the default cookieless setting most teams skip the banner entirely. Rybbit hashes the visitor signal daily without a persistent identifier, which under the EDPB's December 2024 guidance does not require consent in most EU jurisdictions. If your legal team still wants a banner for documented reasons, Rybbit honours the global data-do-not-track attribute the same way Plausible and Umami do — flip it from your existing CMP and the script falls silent.

How long should I run parallel before cutting over?

Fourteen days for a content site, twenty-one if you have e-commerce or any week-of-month seasonality. The reason is variance, not Rybbit-specific risk: you need at least two business cycles to know whether a delta is real drift or just a Tuesday spike. I have seen cleaner reconciliation in week two than in week one nine times out of ten. For Rybbit specifically, the extra week buys you a second weekend cycle to confirm the cookieless pageview inflation is consistent rather than a launch artifact.

What if Rybbit gets acquired or shuts down?

This is the real risk and the reason this page has a section on it. Rybbit's data lives in ClickHouse and Postgres, both portable, both dump-and-restore. The exit plan is a pg_dump plus a clickhouse-client SELECT ... FORMAT Native dump redirected to a file, kept on a quarterly cron. AGPL-3.0 means anyone with the dump and the source tree can keep running it indefinitely as a fork. Practically, hedge by keeping a Plausible Cloud trial primed as a warm fallback so you can flip the script tag in one deploy if the project goes silent for six months. Picking Rybbit is a calculated bet: eyes open, exit primed.

Written by

Lucas Brandao

Analytics engineer · São Paulo · 11 years in data

Two Berlin SaaS migrations behind me. I write migrateanalytics.com as a public utility — no product, no affiliate, no consulting. All measurements are reproducible; raw data lives on GitHub.


v1 · 2026-05-05 · first publication. Test stand: Hetzner CX22 + Rybbit Docker, three weeks, 80 K pageviews.
