GA4 → Rybbit migration: self-host TCO, real event mapping, and the contingency plan if Rybbit goes away
Rybbit is the new arrival on the open-source analytics shelf: under twelve months old, AGPL-3.0, ClickHouse-backed, with a dashboard that copies GA4's information density rather than fighting it. That makes it the most "GA4-shaped" alternative I have tested. It also makes Rybbit the riskiest pick on maturity grounds, so this page front-loads two things the other migrations bury: the real Hetzner self-host TCO and the exit plan for the day Rybbit's GitHub goes quiet. Test stand below: a Hetzner CX22 running docker compose up, a Hugo site, 80,000 pageviews, three weeks of parallel data, and the schema dump I keep on a quarterly cron.
Why teams move from GA4 to Rybbit (and who shouldn't)
Three triggers, all narrower than the Plausible or Matomo audience. The first is GA4-ergonomic burnout: engineers who liked Universal Analytics, never made peace with GA4's report-builder, and want a dashboard that shows pageviews in three clicks instead of seven. Rybbit's UI is the closest thing in the open-source space to "what GA4 would look like if Google had not over-engineered it."
The second is privacy without the lecture: a cookieless default that does not require a consent banner under current EDPB guidance, but with enough event flexibility to cover an e-commerce funnel. Plausible and Umami have the cookieless story too; Rybbit's distinctive contribution is keeping the event model close enough to GA4 that the migration spreadsheet is short.
The third is GA4 BigQuery cost. At 2M events/month I was paying around $155/month for streaming exports plus storage. The Hetzner CX22 self-host runs €4.51/month, so the math is not subtle: the move cuts roughly $1,840/yr at 2M events.
If you are not in those buckets, skip the project. Rybbit is the wrong move when you do under 1,000 visits/day (use Plausible Cloud or the umami.is free tier, because the self-host overhead is not worth it), when you have a Looker Studio reporting pipeline you cannot rebuild (Rybbit has no native Looker connector), when AdSense or Google Ads attribution is load-bearing in your reports, or when your team has zero DevOps capacity to keep a Docker host alive. The "free as in beer" framing is misleading: you trade dollars for hours, and someone has to do the upgrades.
What Rybbit replaces in GA4, and what it doesn't
The honest matrix. Rybbit gets you most of the GA4 surface that engineers actually use, plus the parts Plausible omits.
| Capability | GA4 | Rybbit | Plausible | Notes |
|---|---|---|---|---|
| Pageviews / sessions | ✓ | ✓ | ✓ | Rybbit uses 30-min window, matches GA4 |
| Custom events | ✓ | ✓ JS API | ✓ goal-based | customEvent({name, properties}) |
| Funnels (multi-step) | ✓ | ✓ | Cloud paid only | built-in, no goal-creation step |
| Goals / conversions | ✓ | ✓ | ✓ | Rybbit goals derive from custom events |
| Revenue tracking | ✓ | ✓ | partial | last-click, no MTA model |
| License | proprietary | AGPL-3.0 | AGPL-3.0 | both copyleft, fork-friendly |
| Database | BigQuery | ClickHouse + Postgres | ClickHouse + Postgres | portable schema, dump/restore works |
| BigQuery export | ✓ | ✗ | ✗ | SQL-direct on ClickHouse instead |
| Looker Studio connector | ✓ | ✗ | ✗ | community-built only |
| AdSense / Ads attribution | ✓ | ✗ | ✗ | not on roadmap |
| Heatmaps / session replay | partial | ✗ | ✗ | none of the three OSS tools |
| Audiences / segments | β | partial | β | filter-based, not stored cohorts |
If your weekly workflow includes "schedule a Looker Studio email to the marketing director," Rybbit is not your tool: you would either build a Grafana dashboard against ClickHouse or stay on GA4. If your funnel reporting needs multi-touch attribution across paid and organic, the same caveat applies. Rybbit is for teams whose analytics question is "what happened on the site this week," not "what was the assisted-conversion path of the August campaign."
Privacy, the cookie banner, and EU hosting
Rybbit's default mode is cookieless: no _ga, no fingerprint, no persistent identifier across days. Visitor identification is a daily-rotating salted hash of (IP + user-agent + site-id), which under the EDPB's December 2024 guidance on cookieless trackers does not require consent in most EU jurisdictions. The current Schrems II posture treats US-hosted analytics as a transfer risk for personal data; Rybbit self-hosted in Hetzner Falkenstein (FSN1) sidesteps that entirely.
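To make the "daily-rotating salted hash" idea concrete, here is a minimal Node.js sketch of how such an identifier can be derived. It is an illustration of the general technique, not Rybbit's actual implementation: the function names, the in-memory salt store, and the choice of SHA-256 are all assumptions.

```javascript
// Sketch only: NOT Rybbit's real code. Names and SHA-256 are assumptions.
const { createHash, randomBytes } = require("node:crypto");

// Hypothetical in-memory salt store holding one random salt for the current UTC day.
const saltsByDay = new Map();

function dailySalt(date) {
  const day = date.toISOString().slice(0, 10); // UTC day, e.g. "2025-03-14"
  if (!saltsByDay.has(day)) {
    saltsByDay.clear(); // drop the previous salt so older hashes cannot be recomputed
    saltsByDay.set(day, randomBytes(16).toString("hex"));
  }
  return saltsByDay.get(day);
}

// Same visitor + same UTC day -> same id; a new day -> a new, unlinkable id.
function visitorId(ip, userAgent, siteId, date = new Date()) {
  return createHash("sha256")
    .update([dailySalt(date), ip, userAgent, siteId].join("|"))
    .digest("hex");
}
```

Because the salt is random and discarded at rotation, yesterday's identifiers cannot be joined to today's, which is the property the cookieless argument rests on.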
The pragmatic recipe I use: pick the Falkenstein region for the CX22, set the Caddy auto-TLS to your apex domain, set RYBBIT_PRIVACY_LEVEL=strict, and skip the consent banner. If your legal team insists on one anyway, Rybbit honours a global data-do-not-track attribute the same way Plausible and Umami do: flip it from your existing CMP and the script falls silent. A warning about an overused phrase: this is not "privacy-first" magic, it is just less data being collected. The right framing is "less footprint" rather than a marketing posture.
Mapping GA4 events to Rybbit events and properties
This is where Rybbit differs most visibly from Plausible (UI-first goals) and Umami (four auto-events). Rybbit ships two primary tracking calls (pageview(), automatic on script load, and customEvent({name, properties}) for everything else) plus a small set of attribute-driven helpers for outbound clicks and downloads.
| GA4 event | Rybbit call | Properties / notes |
|---|---|---|
| page_view | pageview() auto | fires on script load + History API hooks |
| scroll | auto-helper | default 90 % threshold, configurable |
| click (outbound) | auto-helper | matches external <a href> |
| file_download | auto-helper | extension list editable in dashboard |
| purchase | customEvent('purchase', {…}) | value, currency, transaction_id, plan |
| begin_checkout | customEvent('checkout_start') | step 1 of funnel |
| add_to_cart | customEvent('add_to_cart') | flatten item list to JSON string |
| video_progress | customEvent('video_25/50/75/100') | one event per quartile, manual |
| user_engagement | none | dropped, no equivalent |
| login / sign_up | customEvent | method as property |
| scoped custom dim. | flatten to property | no user-scoped joins |
| session-scoped attr. | none | dropped, write event-scoped instead |
Three categories of pain show up consistently: nested item arrays (e-commerce items[] has to be JSON-stringified into a single property), user-scoped dimensions (Rybbit has no user store, so plan-tier and lifetime-value have to ride along on every event that needs them), and engagement heuristics (the GA4 user_engagement 10-second-active timer has no clean port). Of 120 GA4 events I mapped on the test stand, 89 mapped cleanly via auto-helpers or one customEvent call, 24 needed manual code (the e-commerce funnel and video quartiles), and 7 were dropped: user_engagement, two session-scoped audience signals, and four custom-dimension chains that depended on user-store joins Rybbit does not offer. That ratio sits between Fathom's 98/18/4 and Plausible's 87/23/10, and is roughly comparable to Umami's 92/21/7.
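The video quartiles are a good example of what "manual code" means here. A minimal sketch, assuming the customEvent(name, properties) call shape used elsewhere on this page; createQuartileTracker and the video_id property are hypothetical names introduced for illustration:

```javascript
// Sketch: fire one custom event per quartile, at most once per video.
// `send` stands in for window.rybbit.customEvent as this article names it (assumption).
function createQuartileTracker(send, videoId) {
  const fired = new Set();
  return function onProgress(currentTime, duration) {
    if (!(duration > 0)) return; // metadata not loaded yet
    const pct = (currentTime / duration) * 100;
    for (const quartile of [25, 50, 75, 100]) {
      if (pct >= quartile && !fired.has(quartile)) {
        fired.add(quartile);
        send(`video_${quartile}`, { video_id: videoId });
      }
    }
  };
}
```

In the browser you would call the returned function from the video element's timeupdate listener, and once more from the ended listener with currentTime equal to duration, since timeupdate does not always land exactly on 100 %.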
The before/after calls for a purchase event:
// GA4 dataLayer push
gtag('event', 'purchase', {
transaction_id: 'T_12345',
value: 49.00,
currency: 'USD',
items: [{ item_id: 'sku_pro', item_name: 'Pro plan', price: 49.00 }]
});
// Rybbit equivalent: items[] flattened to a string property
window.rybbit.customEvent('purchase', {
transaction_id: 'T_12345',
value: 49,
currency: 'USD',
plan: 'pro',
item_skus: 'sku_pro'
});
For a scroll event, the auto-helper covers it; you only call customEvent if you want non-default thresholds:
window.rybbit.customEvent('scroll_50', { url: location.pathname });
And for a click on a key CTA, the cleanest pattern is a data-attribute that the auto-helper picks up without any JS handler at all:
<button data-rybbit-event="cta_pricing_click" data-rybbit-prop-plan="pro">
Buy Pro
</button>
Properties are flat key-value, capped at five per event. If your GA4 schema relies on event.parameter.user.plan_tier-style nesting, you flatten it on the way out and accept that the join story belongs in ClickHouse SQL, not in the dashboard.
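The flattening step can be centralised in one helper so the five-property cap is enforced before the event leaves the page. This is a sketch under the constraints stated above (flat key-value, five properties); toFlatProps and MAX_PROPS are hypothetical names, not part of any Rybbit API:

```javascript
// Sketch: flatten GA4-style params into flat key-value properties.
const MAX_PROPS = 5; // per-event cap as stated in this article

function toFlatProps(params, maxProps = MAX_PROPS) {
  const flat = {};
  for (const [key, value] of Object.entries(params)) {
    if (value === undefined || value === null) continue; // skip empty values
    // Arrays and nested objects (e.g. items[]) become a single JSON string.
    flat[key] = typeof value === "object" ? JSON.stringify(value) : value;
  }
  const keys = Object.keys(flat);
  if (keys.length > maxProps) {
    throw new Error(
      `${keys.length} properties exceed the cap of ${maxProps}: ${keys.join(", ")}`
    );
  }
  return flat;
}
```

Failing loudly on the sixth property is a deliberate choice: a thrown error in staging is easier to notice than a property that silently never shows up in the dashboard.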
Parallel-run setup: 14–21 days dual tagging
Both scripts fire client-side, sequentially. There is no server-side option in Rybbit at the time of writing, which simplifies the wiring but also means you cannot replay against a measurement-protocol endpoint the way you can with GA4. Practical setup: drop the Rybbit data-site-id snippet right after the existing gtag.js, deploy, watch both dashboards for 24 hours, then start the daily reconciliation log.
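For custom events, dual tagging is easier to keep honest if both trackers are fed from one call site. A minimal sketch, assuming the global gtag function and the window.rybbit.customEvent call as this page names it; trackBoth and the name map are hypothetical helpers:

```javascript
// Sketch: send one logical event to both trackers during the parallel run.
// Renames follow the mapping table in this article (begin_checkout -> checkout_start).
const RYBBIT_NAMES = new Map([["begin_checkout", "checkout_start"]]);

function trackBoth(ga4Name, props = {}, win = globalThis) {
  const sent = { ga4: false, rybbit: false };
  const rybbitName = RYBBIT_NAMES.get(ga4Name) ?? ga4Name;
  try {
    if (typeof win.gtag === "function") {
      win.gtag("event", ga4Name, props);
      sent.ga4 = true;
    }
  } catch (err) {
    // A failing tracker must never block the other one.
  }
  try {
    if (win.rybbit && typeof win.rybbit.customEvent === "function") {
      win.rybbit.customEvent(rybbitName, props);
      sent.rybbit = true;
    }
  } catch (err) {
    // Same rule as above.
  }
  return sent;
}
```

The returned object makes it easy to log which side was missing on a given page, which is usually the first question when the reconciliation numbers drift.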
The expected gap on cutover day: Rybbit has no consent gate by default, so on a site that previously ran a GA4 cookie banner you should expect +5 % to +12 % more pageviews in Rybbit. That is the percentage of users who declined the banner and now get counted. The same magnitude shows up for Plausible, Umami, and Fathom, so it is a banner artifact, not a Rybbit-specific quirk. Documented on the methodology page.
A reconciliation week-2 example from my stand:
| Metric | GA4 | Rybbit | Δ % | Status | Why |
|---|---|---|---|---|---|
| Pageviews | 78,402 | 82,118 | +4.74 % | yellow | banner declines |
| Sessions (visits) | 49,210 | 52,084 | +5.84 % | yellow | banner declines |
| Custom events | 2,141 | 2,118 | −1.07 % | green | mapping OK |
| Conversions (purchase) | 312 | 309 | −0.96 % | green | funnel intact |
| Revenue (USD) | $15,288 | $15,141 | −0.96 % | green | value-prop preserved |
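The Δ % column is plain relative difference against GA4, which is easy to script for the daily log. In this sketch the green/yellow/red cut-offs (2 % and 10 %) are my assumption chosen to reproduce the statuses in the table; the article does not state the exact thresholds.

```javascript
// Sketch: compute the delta and a traffic-light status for each metric.
function pctDelta(ga4, rybbit) {
  if (ga4 === 0) return rybbit === 0 ? 0 : Infinity;
  return ((rybbit - ga4) / ga4) * 100;
}

// Thresholds are assumptions, not values taken from the article.
function statusFor(deltaPct, greenMax = 2, yellowMax = 10) {
  const abs = Math.abs(deltaPct);
  if (abs <= greenMax) return "green";
  if (abs <= yellowMax) return "yellow";
  return "red";
}

// Week-2 figures from the table above.
const week2 = [
  { metric: "Pageviews", ga4: 78402, rybbit: 82118 },
  { metric: "Sessions (visits)", ga4: 49210, rybbit: 52084 },
  { metric: "Custom events", ga4: 2141, rybbit: 2118 },
  { metric: "Conversions (purchase)", ga4: 312, rybbit: 309 },
  { metric: "Revenue (USD)", ga4: 15288, rybbit: 15141 },
];

const report = week2.map((row) => {
  const delta = pctDelta(row.ga4, row.rybbit);
  return { ...row, deltaPct: delta.toFixed(2), status: statusFor(delta) };
});

console.table(report);
```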
Exporting GA4 history (and why Rybbit can't import it)
Plain version: Rybbit has no GA4 importer, no OAuth flow, no CSV uploader, and the maintainers have not signalled one is coming. Your live dashboard restarts at zero on cutover day. Plan around that, not against it.
Three paths for keeping the history accessible:
Path A: BigQuery archive (recommended). Before cutover, link your GA4 property to BigQuery if you have not already, run a one-shot full historical export, and keep it as cold storage. You query it with SQL when an exec asks "how did Q2 2025 compare." Storage cost on BigQuery for a year of typical-publisher data is under $5/month. The SQL pattern:
-- Archive 2024-2025 GA4 events to a long-term BQ table
CREATE TABLE `proj.archive.ga4_events_2024_2025`
PARTITION BY event_date
AS
-- The GA4 export stores event_date as a 'YYYYMMDD' string; partitioning needs a DATE
SELECT PARSE_DATE('%Y%m%d', event_date) AS event_date,
event_name, user_pseudo_id,
event_params, event_timestamp, geo, device
FROM `proj.analytics_NNNNNN.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20240101' AND '20251231';
Path B: CSV dump to ClickHouse. If you want the history visible alongside Rybbit data, export GA4 events from BigQuery to CSV, normalize the schema to Rybbit's event table shape (created_at, site_id, session_id, url_path, referrer_domain, event_name, properties), and bulk-load with clickhouse-client --query="INSERT INTO event FORMAT CSV". Budget half a day including schema mapping. This is custom work; there is no plugin.
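The normalization step is where most of that half day goes. Below is a sketch of mapping one GA4 export row to a CSV line in the column order listed above. Several things are assumptions: the rows are read as newline-delimited JSON (BigQuery cannot export the nested event_params field straight to CSV, so you either flatten in SQL first or export JSON), the session id is synthesized from user_pseudo_id plus ga_session_id, and the properties marker is hypothetical. Check the column list against your own SHOW CREATE TABLE output before loading.

```javascript
// Sketch: GA4 BigQuery export row (as JSON) -> one CSV line for the event table.
// Column order follows this article's list; verify it against your real schema.
function param(row, key) {
  const hit = (row.event_params || []).find((p) => p.key === key);
  if (!hit) return null;
  const v = hit.value || {};
  return v.string_value ?? v.int_value ?? v.double_value ?? v.float_value ?? null;
}

function urlPart(raw, part) {
  if (!raw) return "";
  try {
    return new URL(raw)[part];
  } catch {
    return ""; // malformed URL in the export
  }
}

function csvCell(value) {
  const s = value === null || value === undefined ? "" : String(value);
  return /[",\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
}

function toRybbitCsvRow(row, siteId) {
  // event_timestamp is in microseconds since epoch; Date wants milliseconds.
  const createdAt = new Date(Number(row.event_timestamp) / 1000)
    .toISOString()
    .slice(0, 19)
    .replace("T", " "); // "YYYY-MM-DD hh:mm:ss" in UTC
  const sessionId = `${row.user_pseudo_id}.${param(row, "ga_session_id")}`;
  const properties = JSON.stringify({ source: "ga4_archive" }); // hypothetical marker
  return [
    createdAt,
    siteId,
    sessionId,
    urlPart(param(row, "page_location"), "pathname"),
    urlPart(param(row, "page_referrer"), "hostname"),
    row.event_name,
    properties,
  ].map(csvCell).join(",");
}
```

Tagging imported rows with a marker property keeps them distinguishable from native Rybbit events if you ever need to delete or re-run the import.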
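Path C: leave it in GA4 read-only. Stop sending new events but do not delete the property. GA4 retains detailed (event-level) data for at most 14 months on a standard property, and only 2 months unless you raised the data-retention setting from its default. After that window you have nothing, which is why Path A is the recommended one.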
Cutover, four things that always break, and the Rybbit-shutdown contingency plan
Rybbit's container surface is small but not trivial: four Docker services and a reverse proxy. Four things have bitten me or the people I have helped migrate.
1. ClickHouse OOM on the 4 GB CX22 under traffic spikes. Default ClickHouse memory settings assume more RAM than the CX22 has. On a Tuesday-morning email-blast spike I watched the container OOM-kill itself twice. Fix: cap max_memory_usage_for_user at 2147483648 (2 GB) in the ClickHouse config and add a 2 GB swap file on the host. Permanent fix, five-minute change, do it on day one before you need it.
2. Caddy auto-TLS fails behind Cloudflare proxying. If your DNS is Cloudflare-proxied (orange cloud), Caddy's HTTP-01 challenge cannot reach the origin and TLS issuance silently fails. Either disable proxying for the analytics subdomain or switch Caddy to the DNS-01 challenge with a Cloudflare API token. The error message is unhelpful, so you mostly notice when the dashboard 502s.
3. SPA route changes are not always auto-tracked. Same problem Plausible, Umami, and Fathom have. Rybbit listens to full-page loads and the History API hooks, but if your router does pushState in a non-standard wrapper, you have to call rybbit.pageview() manually inside the router's afterEach. Two-line fix; if you skip it, every visit shows as one pageview.
4. Hostname-pin drift after a domain change. The Rybbit site-id is bound to a hostname at create-time. If you migrate from www.example.com to example.com mid-test you will lose continuity unless you update the site config first. I learned this the hard way on run-052: six hours of data went into the wrong bucket.
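For item 3, the manual hook can look like the sketch below. It assumes a router exposing afterEach with a `to.fullPath` field (the Vue Router convention) and the rybbit.pageview() call named above; installPageviewHook is a hypothetical helper. Only install it for routers the automatic History API hook misses, otherwise you risk counting each navigation twice.

```javascript
// Sketch: send a manual pageview after each client-side navigation.
// `tracker` stands in for window.rybbit as this article names it (assumption).
function installPageviewHook(router, tracker) {
  let lastPath = null;
  router.afterEach((to) => {
    if (to.fullPath === lastPath) return; // ignore repeat navigations to the same URL
    lastPath = to.fullPath;
    tracker.pageview();
  });
}
```

In an app this would be called once at startup, for example `installPageviewHook(router, window.rybbit)`.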
Rybbit-shutdown contingency. The project is under twelve months old. The maintainer team is small. AGPL-3.0 means the source survives any company-side outcome, but the convenience of "they ship updates" does not. The hedge plan I run on quarterly cron:
- Postgres metadata: pg_dump rybbit_pg | gzip > backups/pg-$(date +%F).sql.gz
- ClickHouse events: clickhouse-client --query="SELECT * FROM event FORMAT Native" > backups/ch-$(date +%F).native
- Schema export: clickhouse-client --query="SHOW CREATE TABLE event", stored in git alongside the compose file
- Warm fallback: Plausible Cloud trial account on standby, script tag staged behind a feature flag, so a swap is one deploy
If Rybbit's GitHub goes silent for six months, you fork the last good release, keep it patched, and accept that you are now on the maintenance hook. If that prospect is unacceptable, the right call is Plausible or Matomo: both have multi-year track records and active commercial entities behind them. Picking Rybbit is a calculated bet that the upside (GA4-shaped UI, ClickHouse, AGPL) is worth the maturity risk. Make the bet with eyes open.
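The "script tag staged behind a feature flag" item from the hedge list can be as small as the sketch below, run at build or render time. Everything concrete in it is an assumption for illustration: the flag values, the config field names, the /api/script.js path on the Rybbit host, and the Plausible script URL should all be replaced with the snippets your own dashboards give you.

```javascript
// Sketch: choose the analytics snippet from a single flag so a swap is one deploy.
// Script paths and config fields are placeholders, not verified vendor snippets.
const PROVIDERS = new Map([
  [
    "rybbit",
    (cfg) =>
      `<script defer src="${cfg.rybbitHost}/api/script.js" data-site-id="${cfg.rybbitSiteId}"></script>`,
  ],
  [
    "plausible",
    (cfg) =>
      `<script defer data-domain="${cfg.domain}" src="https://plausible.io/js/script.js"></script>`,
  ],
]);

function analyticsTag(flag, cfg) {
  const render = PROVIDERS.get(flag);
  if (!render) throw new Error(`Unknown analytics provider: ${flag}`);
  return render(cfg);
}
```

Throwing on an unknown flag means a typo in the config fails the build instead of shipping a site with no analytics at all.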
FAQ
Is Rybbit production-ready?
What does the self-host actually cost in real money?
Can I keep my GA4 historical data after migrating to Rybbit?
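Yes, but not inside Rybbit, which has no importer. Either archive the history in BigQuery, or normalize an export and bulk-load it into the ClickHouse event table: half a day of custom work, no plugin available.
Does Rybbit work with consent banners?
Rybbit honours a global data-do-not-track attribute the same way Plausible and Umami do: flip it from your existing CMP and the script falls silent.
How long should I run parallel before cutting over?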
What if Rybbit gets acquired or shuts down?
Keep your own exports: pg_dump plus a clickhouse-client SELECT INTO outfile, kept on a quarterly cron. AGPL-3.0 means anyone with the dump and the source tree can keep running it indefinitely as a fork. Practically, hedge by keeping a Plausible Cloud trial primed as a warm fallback so you can flip the script tag in one deploy if the project goes silent for six months. Picking Rybbit is a calculated bet: eyes open, exit primed.