Built for dbt teams

AutoDBT™

Half your dbt code is dead.

We bury it safely.

For data leaders who know something's wrong, but can't quantify it yet.

Start now · See how it works

PR #1842 · cleanup/merge-fct-orders

● Open · Merge fct_orders_enriched into fct_orders

zingle-bot opened this PR · 2 minutes ago

Why this PR

95% SQL similarity, only filter scope differs

7 downstream consumers, all preserved

Saves $3,200/mo in warehouse spend

Proof attached

Equivalence report · 100% match

Blast radius · 17 readers verified

Backfill plan · idempotent

$47k

average monthly warehouse spend recovered

per team, in first 90 days

40%

of dbt models are never read after 90 days

sitting in your repo right now, costing money

minutes

from connecting your repo to your first PR

not days. AutoDBT scans and acts immediately

Why it matters

Three problems every data team has.
One tool that fixes all of them.

The bill nobody can explain

Spend went up 12% last month. Nobody knows which model caused it. AutoDBT traces every dollar to the model that spent it.

Dead code + duplicates, both priced and ranked

The change nobody wants to make

Three models do the same thing. Nobody touches them: the last person who tried broke a CFO dashboard. AutoDBT proves the fix is safe before anyone merges.

Row-level equivalence before every merge

The mess that comes back

You cleaned it up 18 months ago. Standards drifted. It's back. AutoDBT blocks violations at PR time, before they reach main.

Layer rules enforced on every PR, no exceptions

How it stays safe

AutoDBT opens PRs. Your team merges them.

Nothing is automated end to end. Every change is proposed, proven, and reviewed before it ships. Your team stays in control.

AutoDBT scans

Reads your repo and connects to your compute engine. Finds duplicates, dead code, and drift. Ranks everything by cost.

Opens a PR with proof

For each fix, it opens a GitHub PR with an equivalence report, blast radius map, and backfill plan attached. If anything looks off, the PR is blocked automatically.

Your engineer reviews and merges

Your team reads the proof, runs your existing CI, and merges when they are confident. Nothing ships without a human approving it.

Nothing automatic

AutoDBT never merges on its own. It opens a PR, attaches the proof, and waits. Your team decides what ships and when.

Validation runs on all steps

Equivalence checks, blast radius scans, and layer rule enforcement run at every stage. If anything fails, the PR is blocked before your team sees it.

Controlled refactor

Every change is scoped, reversible, and tied to a PR. 30-day quarantine before any deletion. One command to roll back anything, anytime.
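To make "equivalence check" concrete, here is a minimal Python sketch of a row-level comparison between the output of an existing model and its proposed replacement. It illustrates the general idea only and is not AutoDBT's actual implementation; the function name `equivalence_report` and the shape of its result are assumptions made for this example.

```python
from collections import Counter


def equivalence_report(old_rows, new_rows):
    """Compare two result sets as multisets of rows, ignoring row order.

    Hypothetical helper for illustration; not part of any real AutoDBT API.
    """
    old_counts = Counter(tuple(r) for r in old_rows)
    new_counts = Counter(tuple(r) for r in new_rows)

    missing = old_counts - new_counts  # rows the replacement no longer returns
    extra = new_counts - old_counts    # rows the replacement adds

    total = sum(old_counts.values())
    matched = total - sum(missing.values())
    if total:
        match_pct = 100.0 * matched / total
    else:
        # Two empty results match; anything extra against an empty baseline does not.
        match_pct = 100.0 if not extra else 0.0

    return {
        "equivalent": not missing and not extra,
        "match_pct": match_pct,
        "missing": dict(missing),
        "extra": dict(extra),
    }
```

Comparing rows as a multiset means duplicate rows are counted, so a replacement that silently deduplicates would still show up as a mismatch.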

Find what's hiding

See the debt in your repo.

Duplicates. Drift. Dead code. Mapped, ranked, and priced across your full model graph.

Catch near-duplicates

95% the same. One different filter. That's why your numbers don't tie.

Spot stale logic

One model updates. Siblings don't. We flag every stale copy.

Clear the graveyard

Zero reads in 90 days. Unused macros. See what they cost.

Cluster analysis

Similarity Report · fct_orders cluster

fct_orders_v2.sql

SELECT order_id,

customer_id, revenue_usd

FROM stg_orders

WHERE status = 'complete'

AND region = 'US'

fct_orders_enriched.sql

SELECT order_id,

customer_id, revenue_usd

FROM stg_orders

WHERE status IN ('complete',

'shipped', 'delivered')

95% similar · 7 downstream · $3,200/mo overlap

3 clusters detected this week

Cluster	Models	Similarity	Overlap ($/mo)
fct_orders	3	95%	$3,200
dim_users	4	88%	$1,840
fct_sessions	2	82%	$720
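As a rough illustration of how a similarity score between two models could be computed, the Python sketch below compares the two queries from the report token by token using the standard library's `difflib`. This is an assumption about one possible approach, not the scoring method behind the figures above, so it will not reproduce the 95% shown; a production tool would more plausibly compare parsed query structure. The helper names `normalize` and `similarity` are hypothetical.

```python
from difflib import SequenceMatcher


def normalize(sql: str) -> list[str]:
    """Lowercase and split on whitespace so formatting differences are ignored.

    Note: this also lowercases string literals, which is acceptable for a
    rough similarity score but not for comparing query semantics.
    """
    return sql.lower().split()


def similarity(sql_a: str, sql_b: str) -> float:
    """Return a 0..1 token-level similarity ratio between two SQL strings."""
    return SequenceMatcher(None, normalize(sql_a), normalize(sql_b)).ratio()


FCT_ORDERS_V2 = """
SELECT order_id, customer_id, revenue_usd
FROM stg_orders
WHERE status = 'complete'
  AND region = 'US'
"""

FCT_ORDERS_ENRICHED = """
SELECT order_id, customer_id, revenue_usd
FROM stg_orders
WHERE status IN ('complete', 'shipped', 'delivered')
"""

score = similarity(FCT_ORDERS_V2, FCT_ORDERS_ENRICHED)
print(f"token similarity: {score:.2f}")
```

The two queries share their SELECT list and source and differ only in the WHERE clause, which is the pattern the report is flagging.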


Cluster analysis

Finds models sharing 80–99% of their SQL. Ranked by downstream impact and cost overlap.

Logic drift scan

One model updates. Siblings don't. We catch the drift before it hits the dashboard.

Dead code map

Zero reads in 90 days. We list every model, macro, and source, sorted by what it costs you.

Cost attribution

Warehouse spend, model by model. The expensive ones are usually the ones nobody owns.
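The dead code map and cost attribution ideas combine naturally: filter for anything with no reads inside the window, then sort by spend. The Python sketch below shows that logic under stated assumptions. The input shape (a list of dicts with `name`, `last_read`, and `monthly_cost_usd`) and the function name `dead_code_map` are invented for illustration; how AutoDBT actually collects read history and cost is not described here.

```python
from datetime import datetime, timedelta


def dead_code_map(models, now, window_days=90):
    """Return models with no reads in the last `window_days`, most expensive first.

    `models` is a list of dicts with hypothetical keys:
      name (str), last_read (datetime or None), monthly_cost_usd (float).
    A `last_read` of None means the model has never been read.
    """
    cutoff = now - timedelta(days=window_days)
    dead = [
        m for m in models
        if m["last_read"] is None or m["last_read"] < cutoff
    ]
    return sorted(dead, key=lambda m: m["monthly_cost_usd"], reverse=True)
```

Passing `now` explicitly rather than calling `datetime.now()` inside the function keeps the result reproducible, which matters if the same list is attached to a PR as evidence.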

Keep it clean

Enforce it as you grow.

Standards enforced at PR time. Backlog ranked by dollars. Every change reversible.

Block at the PR

No BI tools reading from stg_* models. Enforced on every PR.

Top of the queue first

Ranked by dollars saved. Always know what's next.

Undo anything

30-day quarantine. One-command rollback. No regrets.

Layer Enforcement

CI · Zingle Layer Enforcement

✗

Layer violation detected

Looker explore 'Order Analytics' reads from stg_orders. Move to fct_orders.

✓

Duplicate check

No new duplicates introduced in this PR.

✓

Dead code scan

All new models have at least one downstream consumer.

Layer rules · auto-inferred

# raw → staging → intermediate → marts → exposures
stg_* → may read raw_*
int_* → may read stg_*, int_*
fct_*, dim_* → may read int_*, stg_*
exposures → may read fct_*, dim_*
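The layer rules above can be expressed as a small lookup and checked against each reader/source pair. The following Python sketch encodes the five rules exactly as written and is only an illustration of the idea, not the CI check itself. The names `LAYER_RULES`, `EXPOSURE_MAY_READ`, and `may_read` are hypothetical, and treating readers that match no rule as allowed is an assumption made for this example.

```python
from fnmatch import fnmatchcase

# (reader name patterns, allowed source name patterns), copied from the rules above.
LAYER_RULES = [
    (("stg_*",), ("raw_*",)),
    (("int_*",), ("stg_*", "int_*")),
    (("fct_*", "dim_*"), ("int_*", "stg_*")),
]

# Exposures (dashboards, explores) are identified by kind, not by name prefix.
EXPOSURE_MAY_READ = ("fct_*", "dim_*")


def may_read(reader: str, source: str, is_exposure: bool = False) -> bool:
    """Return True if `reader` is allowed to read from `source`."""
    if is_exposure:
        return any(fnmatchcase(source, p) for p in EXPOSURE_MAY_READ)
    for reader_patterns, allowed in LAYER_RULES:
        if any(fnmatchcase(reader, p) for p in reader_patterns):
            return any(fnmatchcase(source, p) for p in allowed)
    # Assumption: a reader that matches no rule is not restricted.
    return True


# The violation shown in the CI panel: a Looker explore reading a staging model.
print(may_read("Order Analytics", "stg_orders", is_exposure=True))  # False
print(may_read("fct_orders", "stg_orders"))                         # True
```

Keeping the rules as data rather than branching logic means an inferred rule set can be reviewed and edited like any other config.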


Layer Enforcement

Five layers. Hard rules. Reads that break the layer rules get blocked before merge.

Cost-Ranked Backlog

Sorted by dollars saved. Always know what's next, what's safe, and what's worth it.

30-Day Quarantine

Soft-delete first. If anyone shouts within 30 days, restore in one click. No one shouts? It's gone.

One-Command Rollback

Every change reversible. zingle rollback <pr>. Done in 60 seconds.

Book a demo

See what's hiding in your repo.

Point us at your dbt project. We'll show you the rest.

Start now
