# Backfill historical events
Insert events for an existing agent + workspace combo, with realistic dates and computed `hours_saved` and `cost_saved`. Use this when a customer onboards on day N and wants the dashboard to also show the previous 30 days of activity, or after a webhook outage when the in-flight events were never retried.
## When to use this
- Customer asks "can you add the X blog posts I wrote last month, since I'm just integrating now?"
- A workflow that should have hit `/v1/track` but skipped the call entirely (not retryable, never reached the API)
- Internal demo-data seeding for a non-production workspace
Do not use this for events that already came in via the API — the `idempotency_key` strategy below dedupes against existing rows, but reposting through the proper API path is preferable whenever it is still possible.
## Step 1: gather the inputs
You need:
- `org_id` of the workspace receiving the events (find via `select id, name from organizations where name ilike '%customer%';`)
- `agent_id_text` (the slug, like `bilingual-blog-agent`) — created automatically if it does not exist yet
- `task_type_key` (must already exist as a builtin row (`is_builtin: true`), as an org-specific custom row, or have a baseline override on the org)
- The list of events with `occurred_at` (date) plus any per-event metadata you want to record
For the baseline math:
```sql
-- What baseline does the org's events actually use for this task type?
select
  coalesce(
    (select baseline_minutes_override from task_type_overrides
     where org_id = '$ORG' and task_type_key = '$KEY'),
    (select default_baseline_minutes from task_types
     where (org_id = '$ORG' or org_id is null)
       and key = '$KEY'
     order by org_id nulls last
     limit 1)
  ) as resolved_baseline_minutes,
  (select default_hourly_rate from organizations where id = '$ORG') as hourly_rate;
```
If the override exists, set `resolved_baseline_source = 'override'`. If it falls back to the org-specific custom row, use `'custom'`. If it falls through to the global builtin, use `'builtin'`.
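The same precedence can be mirrored in a small helper when you are resolving baselines for many task types at once (a sketch — `resolveBaseline` and its argument names are hypothetical, not part of the codebase):

```javascript
// Hypothetical helper mirroring the SQL precedence from Step 1:
// override row > org-specific custom row > global builtin row.
function resolveBaseline({ overrideMinutes, customMinutes, builtinMinutes }) {
  if (overrideMinutes != null) {
    return { minutes: overrideMinutes, source: "override" };
  }
  if (customMinutes != null) {
    return { minutes: customMinutes, source: "custom" };
  }
  return { minutes: builtinMinutes, source: "builtin" };
}

console.log(resolveBaseline({ overrideMinutes: 195, customMinutes: null, builtinMinutes: 180 }));
// → { minutes: 195, source: 'override' }
```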
## Step 2: run the insert script
Save as a one-off Node script in the repo root (or run inline via heredoc):
```js
const { createClient } = require("@supabase/supabase-js");

const c = createClient(process.env.URL, process.env.KEY);

const ORG_ID = "...";
const AGENT_ID = "...";
const articles = [
  { id: 158, date: "2026-05-07", title: "..." },
  // ...
];

(async () => {
  // Upsert agent (auto-create on first event normally happens in
  // /v1/track; we mirror that here). Assumes `articles` is sorted
  // newest-first, so the last entry is the earliest event.
  const firstSeenAt = articles[articles.length - 1].date + "T12:00:00.000Z";
  const lastEventAt = articles[0].date + "T12:00:00.000Z";
  const { data: agent, error: agentError } = await c
    .from("agents")
    .upsert(
      {
        org_id: ORG_ID,
        agent_id: AGENT_ID,
        first_seen_at: firstSeenAt,
        last_event_at: lastEventAt,
        total_events: articles.length,
      },
      { onConflict: "org_id,agent_id" },
    )
    .select("id")
    .single();
  if (agentError) throw agentError;

  // Build event rows. Generated columns (hours_saved, cost_saved,
  // net_saved) are computed by Postgres from the inputs we DO set.
  const rows = articles.map((a) => ({
    org_id: ORG_ID,
    agent_id_text: AGENT_ID,
    agent_uuid: agent.id,
    task_type_key: "blog_long_draft",
    outcome: "success",
    human_baseline_minutes: 195, // resolved per Step 1
    hourly_rate: 47.5, // resolved per Step 1
    currency: "EUR",
    resolved_baseline_source: "override",
    occurred_at: a.date + "T12:00:00.000Z",
    idempotency_key: `backfill:<workspace-slug>:${AGENT_ID}:${a.id}`,
    metadata: { article_id: a.id, title: a.title, source: "backfill_2026_05_10" },
    fingerprint: {},
  }));

  // Upsert on idempotency_key so an accidental re-run dedupes
  // instead of failing on the unique constraint.
  const { error } = await c
    .from("events")
    .upsert(rows, { onConflict: "idempotency_key", ignoreDuplicates: true });
  if (error) throw error;
})();
```
## Step 3: verify
In the dashboard:
- Switch to the workspace receiving the backfill
- Open `/agents/<agent_id>` and confirm `LIFETIME EVENTS` matches the count
- Set the `/overview` period to ALL and confirm `EVENTS` matches
- Spot-check `RECENT EVENTS` for the right dates
In SQL:
```sql
select count(*), sum(hours_saved), sum(cost_saved)
from events
where org_id = '$ORG'
  and agent_id_text = '$AGENT'
  and metadata ->> 'source' = 'backfill_2026_05_10';
```
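To sanity-check the SQL sums, compute the expected totals from the Step 2 inputs. This sketch assumes the generated-column formulas are `hours_saved = human_baseline_minutes / 60` and `cost_saved = hours_saved * hourly_rate` — confirm against the actual schema before treating a mismatch as a bug:

```javascript
// Assumed generated-column formulas (verify against the schema):
//   hours_saved = human_baseline_minutes / 60
//   cost_saved  = hours_saved * hourly_rate
const baselineMinutes = 195; // from Step 1
const hourlyRate = 47.5; // from Step 1
const eventCount = 12; // however many articles you backfilled

const expectedHours = (eventCount * baselineMinutes) / 60;
const expectedCost = expectedHours * hourlyRate;
console.log({ expectedHours, expectedCost });
// → { expectedHours: 39, expectedCost: 1852.5 }
```

Compare these against `sum(hours_saved)` and `sum(cost_saved)` from the query above.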
## Idempotency strategy
The `idempotency_key` format `backfill:<workspace-slug>:<agent_id>:<deterministic-suffix>` is deterministic, which means accidentally running the same script twice silently dedupes via the unique constraint on `idempotency_key`. Use a deterministic suffix (article id, slug, anything stable) — never `Date.now()` or a UUID, which would defeat the dedupe.
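For example, a key builder like this (a sketch — `makeKey` is a hypothetical helper and the slug/agent values are placeholders) yields the same key on every run for the same event:

```javascript
// Hypothetical helper: same inputs always yield the same key, so a
// re-run collides with the unique constraint instead of duplicating.
const makeKey = (workspaceSlug, agentId, articleId) =>
  `backfill:${workspaceSlug}:${agentId}:${articleId}`;

const a = makeKey("acme", "bilingual-blog-agent", 158);
const b = makeKey("acme", "bilingual-blog-agent", 158);
console.log(a === b, a);
// → true backfill:acme:bilingual-blog-agent:158

// By contrast, `backfill:${slug}:${agentId}:${Date.now()}` would change
// on every run and insert duplicates freely.
```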
## Pooled-quota implication for Agency
Backfill events count against the anchor's pooled quota, even when inserted into a child workspace. Tell the customer explicitly that a historical backfill moves them closer to overage in the anchor's current billing cycle. The per-event math is identical to live events.