HumanHours Enrichment Engine, v1 Design · HumanHours docs

Date: 2026-05-31
Status: Design approved by Ralf, pending written-spec review
Repo: ~/projects/agent-metrics (HumanHours, Next.js 16 + Supabase EU + Stripe monorepo)

1. Summary

Add an outside-in company-research engine to HumanHours. Given one or many company domains, the engine returns an enriched company record: company profile, estimated headcount per role, fully-loaded labour cost per role, total annual labour cost, and an AI-automation business case, each with a confidence score and cited sources.

Enriched companies live in a per-org library. Users keep, view, download (CSV/JSON), and pull them via the API to drive outreach campaigns. That outreach use case, file + API access to enriched companies, is the primary product goal. The CFO-style business case rides along on the same record.

This is a native rebuild of the 7-workflow n8n research engine built by Chris Anejo and Esther Ayodele. n8n is retired. We do not port 1:1; we keep what works (the pipeline shape, the validated ~80% logic) and fix the technical debt (see §3).

2. Scope

In scope (v1)

Single + bulk company enrichment from a domain.
Per-org company library with ownership, tags, and an external_id for outreach matching.
Download as file (CSV + JSON) and a list/pull API for outreach.
Confidence per datapoint, rolled up cost-weighted.
Lookup-based quota and pricing (new "research lookups" dimension, hard cap per plan, upgrade to raise, no overage billing).
A code-based accuracy eval gating >=80% against a versioned ground-truth set.

Out of scope (v1, explicitly deferred)

PowerPoint / branded deck export (outreach needs file + API, not a deck). Later.
Tying the estimate into the existing track/measure loop (the role -> task_type moat: "vooraf = achteraf", Dutch for "beforehand = afterwards", 87% verified). Later, designed not to be blocked by v1.
Lifecycle phases and full two-dimensional ROI (value creation). Light/later.
Licensed enrichment APIs (Apollo/Coresignal/PDL, KVK/Companies House). Adapter seam designed in now; integration later.
Seniority (junior/medior/senior) as an exposed dimension. Schema is seniority-ready (see §5) but v1 uses one blended loaded wage per role.

3. Decisions log

Approved with Ralf on 2026-05-31:

v1 = standalone enrichment product. Library + download + API for outreach; the measure-loop integration comes later.
Data source = port the n8n approach (LLM research + official labour statistics + seeded survey reference tables), with an adapter seam for licensed APIs later. Global, not country-limited: wages resolve on demand via grounded research for any country/role (Sonar fetches the prevailing wage with a source) and are cached per country/role/year in the wages table, so each wage is paid for once. Seeded reference tables are an optional fast path, never a restriction to a fixed country set.
Bulk infra = existing pg_cron + claim-based worker (SKIP LOCKED), no Inngest.
Lookup billing = consume a lookup on a new company or a user-requested refresh. Existing enriched companies are always free to view/download/API. Stale data shows an "X days old, refresh?" badge; refresh is user-initiated. No auto-refresh charging. No overage billing: each plan has a hard monthly lookup cap; when it is reached the enrich/refresh endpoints return an upgrade-required error. The user upgrades to raise the cap. Bundles: Free 10 one-time, Pro 100/mo, Agency 500/mo pooled, Enterprise custom.
Seniority = blended now, schema seniority-ready.
Research grounding = Perplexity Sonar (perplexity/sonar via OpenRouter), real web search with citations, not a model-only call. This matches the original Asana intent; the n8n node was misnamed and actually ran Claude without web grounding.

Improvements over the n8n flows (not a 1:1 port)

Deterministic core, LLM only at the edges. All math (wage lookup, loaded-cost, headcount distribution, automation savings, aggregation, confidence rollup) moves to pure, unit-tested functions in packages/core. The LLM does only two bounded jobs: grounded research/extraction, and the business-case narrative.
Citable reference data instead of hardcoded tables. n8n hid wage figures and employer factors in Code nodes with live stat-calls disabled. We make them versioned, seeded reference tables with a source per datapoint. This is the Methodology moat.
Cache-first that actually runs. n8n left cache-read nodes disabled, so every run re-ran the LLM. Cache-first is a tested first-class path here.
Confidence as a real source-tier model. Never silently floor a wage to 0 (the weakest dimension in n8n validation). Explicit tiers, cost-weighted rollup.
Code-based eval in CI against a versioned ground-truth fixture, not Google Sheets with half-disabled LLM comparisons.
One clean path, one provider. Drop duplicate/disabled nodes, HTTP-vs-subworkflow variants, and the two-credential mess (n8n used both "HumanHours" and "Zorgservice XL" OpenRouter creds). One OpenRouter config routing to perplexity/sonar + Claude.

4. Architecture

Client / API ──► /v1/companies (sync, 1)            ┐
            └──► /v1/companies/bulk (async, many) ──┤
                                                    ▼
                                          enrichment service
                                          (apps/web/lib/enrichment)
                                                    │
      ┌─────────────────────────────────────────────┼─────────────────────────────┐
      ▼                         ▼                     ▼                             ▼
 company_research        wages cache           reference data                deterministic core
 cache (global,          (global, per           (seeded, versioned:          (packages/core/enrichment)
 per domain, TTL)        country/role/year)     wages, employer factors,     pure fns: normalise roles,
      │                       │                 role distributions,          loaded-cost, aggregate,
      │                       │                 automation rates)            automation savings,
      ▼                       ▼                                              confidence rollup
  Sonar grounded         official stats
  research + Claude       (Eurostat/CBS/ECB)
  extraction (LLM edge)   + seeded fallback
                                                    │
                                                    ▼
                                          org_companies (per-org library, RLS)
                                          company_lookups (billing ledger)

Background bulk: pg_cron triggers /api/cron/enrichment-worker every ~1 min, which claims queued enrichment_job_items with SKIP LOCKED, runs the same enrichment service per domain, writes to org_companies, and updates the parent job counts. Failed items are retried with backoff (see §7).

Deterministic core vs LLM edges

LLM edge 1, grounded research: perplexity/sonar searches the web and returns company facts with citations. Claude then extracts these into the fixed JSON shape. Output: legal name, country, industry, headcount estimate, departments/roles, channels, languages, and a source URL per material fact.
Deterministic core (packages/core/enrichment/): role normalisation to a canonical set, headcount distribution from a seeded industry->role-distribution benchmark (LLM fills only gaps), wage lookup (cache-first on wages, then official stat, then seeded reference), loaded-cost via one canonical employer-factor function, annual cost aggregation, automation savings via seeded role automation-rates, net-ROI (subtract agent cost), and cost-weighted confidence rollup. All unit-tested, reproducible.
LLM edge 2, narrative: Claude writes the business-case summary, top automation opportunity, and risk factors from the deterministic numbers. Text only, never math.
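
The arithmetic in the deterministic core can be sketched as a few pure functions. This is a minimal illustration, not the packages/core API: the names RoleInput, loadedCost, costRoles and summarise, and the shape of the inputs, are assumptions made for this sketch.

```typescript
// Illustrative only: names and shapes are assumptions, not the packages/core API.
interface RoleInput {
  role: string;
  headcount: number;       // estimated FTE for this role
  grossAnnualWage: number; // blended gross wage per FTE
  automationRate: number;  // share of the role's cost assumed automatable, 0..1
}

interface RoleCost extends RoleInput {
  loadedAnnualCost: number;  // headcount * gross wage * employer factor
  automationSavings: number; // loadedAnnualCost * automationRate
}

// The single place where the employer factor is applied to a gross wage.
function loadedCost(grossAnnualWage: number, employerFactor: number): number {
  if (grossAnnualWage < 0 || employerFactor < 1) {
    throw new RangeError("wage must be >= 0 and employer factor >= 1");
  }
  return grossAnnualWage * employerFactor;
}

// Per-role loaded cost and automation savings.
function costRoles(roles: RoleInput[], employerFactor: number): RoleCost[] {
  return roles.map((r) => {
    const loadedAnnualCost = r.headcount * loadedCost(r.grossAnnualWage, employerFactor);
    return { ...r, loadedAnnualCost, automationSavings: loadedAnnualCost * r.automationRate };
  });
}

// Company-level totals; net savings subtract the annual agent cost.
function summarise(costed: RoleCost[], annualAgentCost: number) {
  const totalAnnualLabourCost = costed.reduce((sum, r) => sum + r.loadedAnnualCost, 0);
  const grossSavings = costed.reduce((sum, r) => sum + r.automationSavings, 0);
  return { totalAnnualLabourCost, grossSavings, netSavings: grossSavings - annualAgentCost };
}
```

Because every function is pure, the same inputs always give the same totals, which is what makes the numbers unit-testable and reproducible.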

5. Data model

New migrations in packages/db/migrations/, RLS on every org-scoped table, following existing conventions (uuid PKs, updated_at trigger, generated columns where it helps).

`company_research` (global cache, service-role only, not org-scoped)

One research result per domain, reused across all orgs. This is the cost-saver; it is never exposed directly and does not affect billing.

id uuid pk, domain text unique
company_data jsonb (profile + sources), roles jsonb (per-role headcount + seniority_mix?), wages jsonb, totals jsonb, business_case jsonb, confidence jsonb
model_versions jsonb (which Sonar/Claude/reference-data versions produced this)
researched_at timestamptz, ttl_days int (default: company 30)
sources jsonb (cited URLs)

roles[] and wages jsonb shapes are seniority-ready: wage_data can later carry { blended, junior, medior, senior } and a role can carry seniority_mix. v1 populates blended only.
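
One way to read "seniority-ready" is sketched below. The type names and the effectiveWage helper are hypothetical. v1 would only ever take the blended branch, and the weighting by seniority_mix is an assumption about how the later dimension could work.

```typescript
// Hypothetical types mirroring the seniority-ready jsonb shapes.
type SeniorityLevel = "junior" | "medior" | "senior";

interface WageData {
  blended: number; // the only field v1 populates
  junior?: number;
  medior?: number;
  senior?: number;
}

interface RoleEstimate {
  role: string;
  headcount: number;
  seniority_mix?: Record<SeniorityLevel, number>; // shares that sum to 1
}

// Uses the blended wage unless both a mix and all three level wages exist.
function effectiveWage(wage: WageData, role: RoleEstimate): number {
  const mix = role.seniority_mix;
  if (!mix) return wage.blended;
  const levels: SeniorityLevel[] = ["junior", "medior", "senior"];
  if (levels.some((level) => wage[level] === undefined)) return wage.blended;
  return levels.reduce((sum, level) => sum + mix[level] * (wage[level] as number), 0);
}
```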

`org_companies` (per-org library, RLS)

What the user owns, views, downloads, and pulls via API.

id uuid pk, org_id uuid fk, domain text
research_id uuid fk -> company_research
external_id text null (customer's own id for outreach matching), tags text[]
added_at timestamptz, last_refreshed_at timestamptz
unique (org_id, domain)

`wages` (global cache, service-role only)

id uuid pk, country text, role text, year int
wage_data jsonb (seniority-ready), source text, source_url text, confidence numeric
last_researched timestamptz, ttl_days int (default: wages 90)
unique (country, role, year)
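
Both caches carry a ttl_days column, so a staleness check could look like the sketch below. The helper names are hypothetical. Treating an entry as stale from the day its age reaches ttl_days, and counting whole days, are assumptions.

```typescript
// Hypothetical staleness helpers for company_research (30 days) and wages (90 days).
const MS_PER_DAY = 24 * 60 * 60 * 1000;

interface CachedEntry {
  researchedAt: Date; // researched_at or last_researched
  ttlDays: number;    // ttl_days
}

// Whole days elapsed since the entry was researched.
function ageInDays(researchedAt: Date, now: Date): number {
  return Math.floor((now.getTime() - researchedAt.getTime()) / MS_PER_DAY);
}

// Assumption: stale once the age reaches the TTL.
function isStale(entry: CachedEntry, now: Date): boolean {
  return ageInDays(entry.researchedAt, now) >= entry.ttlDays;
}

// Text for the "X days old, refresh?" badge, or null while the entry is fresh.
function stalenessBadge(entry: CachedEntry, now: Date): string | null {
  return isStale(entry, now) ? `${ageInDays(entry.researchedAt, now)} days old, refresh?` : null;
}
```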

`enrichment_jobs` (per-org, RLS)

id uuid pk, org_id uuid fk, status text (queued|running|done|failed)
input_count int, accepted int, rejected int, duplicates_removed int
created_at, started_at, finished_at timestamptz

`enrichment_job_items`

id uuid pk, job_id uuid fk, org_id uuid fk, domain text
status text (queued|running|done|failed), attempts int, retry_after timestamptz
error text, created_at, updated_at
index on (status, retry_after) for the claim query
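
A sketch of the claim query this index serves, written as a TypeScript constant so the worker can pass it to whichever Postgres client it uses. The constant and function names are hypothetical. A PostgreSQL UPDATE has no ORDER BY or LIMIT of its own, so the batch is selected and locked in a subquery.

```typescript
// Hypothetical claim query: pick a batch in a locked subquery, then mark it running.
const CLAIM_ITEMS_SQL = `
  UPDATE enrichment_job_items AS i
     SET status = 'running',
         updated_at = now()
   WHERE i.id IN (
           SELECT id
             FROM enrichment_job_items
            WHERE status = 'queued'
              AND (retry_after IS NULL OR retry_after <= now())
            ORDER BY created_at
            LIMIT $1
              FOR UPDATE SKIP LOCKED
         )
  RETURNING i.id, i.job_id, i.org_id, i.domain, i.attempts
`;

interface ClaimQuery {
  text: string;
  values: [number];
}

// Builds the parameterised query; the batch size is the only parameter.
function buildClaimQuery(batchSize: number): ClaimQuery {
  if (Math.floor(batchSize) !== batchSize || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  return { text: CLAIM_ITEMS_SQL, values: [batchSize] };
}
```

SKIP LOCKED lets two overlapping worker runs claim disjoint batches instead of blocking on each other's row locks.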

`company_lookups` (billing ledger, append-only, per-org)

Drives metering, tier-bundle counting, and audit. One row per charged lookup.

id uuid pk, org_id uuid fk, domain text, kind text (new|refresh)
billing_period text (e.g. 2026-05), created_at timestamptz
index on (org_id, billing_period)
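
A minimal sketch of how the billing_period key and the per-period count could be derived. The function names are hypothetical, and using calendar months in UTC is an assumption. The real period may follow the Stripe subscription cycle instead.

```typescript
// Hypothetical ledger helpers; rows mirror the company_lookups columns.
interface LookupRow {
  org_id: string;
  domain: string;
  kind: "new" | "refresh";
  billing_period: string; // e.g. "2026-05"
}

// Assumption: the period is the UTC calendar month of the charge.
function billingPeriod(date: Date): string {
  const month = date.getUTCMonth() + 1;
  return `${date.getUTCFullYear()}-${month < 10 ? "0" : ""}${month}`;
}

// Number of charged lookups for one org in one period.
function lookupsUsed(ledger: LookupRow[], orgId: string, period: string): number {
  return ledger.filter((row) => row.org_id === orgId && row.billing_period === period).length;
}
```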

Reference data (seeded, versioned, the Methodology)

Seeded via packages/db/scripts/ and/or migrations, each row cites a source:

employer-cost factors per country (NL 1.40, DE 1.21, UK 1.25, US 1.30, FR 1.45, BE 1.35, ...)
industry -> role-distribution benchmarks
role automation-rates (cited, e.g. McKinsey MGI 2025)
seeded national wage reference tables (fallback when live stats are unavailable)
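
The wage lookup order from §4 (cache, then official statistic, then seeded reference) can be sketched as a chain of lookups that ends in an explicit unknown state rather than a 0. All names here are hypothetical, and the lookups are synchronous only to keep the sketch short. Real ones would hit the database or a stats API.

```typescript
// Hypothetical fallback chain for resolving a gross wage.
type WageTier = "cache" | "official_statistic" | "seeded_reference";

interface WageHit {
  grossAnnualWage: number;
  sourceUrl: string;
}

type WageLookup = (country: string, role: string, year: number) => WageHit | undefined;

type WageResult =
  | ({ state: "resolved"; tier: WageTier } & WageHit)
  | { state: "unknown" };

// Tries each step in order; a miss or a non-positive wage falls through.
function resolveWage(
  country: string,
  role: string,
  year: number,
  chain: Array<{ tier: WageTier; lookup: WageLookup }>,
): WageResult {
  for (const step of chain) {
    const hit = step.lookup(country, role, year);
    if (hit !== undefined && hit.grossAnnualWage > 0) {
      return { state: "resolved", tier: step.tier, ...hit };
    }
  }
  return { state: "unknown" }; // never a silent 0
}
```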

`refresh_log`

Audit of any reference-data / cache-warming refresh runs.

Confidence model

Per datapoint a source tier (highest to lowest): fetched+cited > official statistic > seeded reference table > LLM-inferred > hard fallback. Each tier maps to a confidence band; the overall company confidence is the cost-weighted average across roles/wages. Never silently emit 0; emit an explicit low-confidence / unknown state instead.
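
A sketch of the cost-weighted rollup under stated assumptions. The tier order comes from the text above, but the numeric bands in TIER_CONFIDENCE are placeholders invented for this example, and the type and function names are hypothetical.

```typescript
// Hypothetical confidence rollup; the band values are placeholders.
type SourceTier =
  | "fetched_cited"
  | "official_statistic"
  | "seeded_reference"
  | "llm_inferred"
  | "hard_fallback";

const TIER_CONFIDENCE: Record<SourceTier, number> = {
  fetched_cited: 0.9,
  official_statistic: 0.8,
  seeded_reference: 0.65,
  llm_inferred: 0.45,
  hard_fallback: 0.2,
};

interface RoleConfidenceInput {
  role: string;
  loadedAnnualCost: number; // the weight
  tier: SourceTier;
}

type ConfidenceRollup = { state: "scored"; confidence: number } | { state: "unknown" };

// Cost-weighted average across roles; no cost basis yields "unknown", not 0.
function rollupConfidence(roles: RoleConfidenceInput[]): ConfidenceRollup {
  const totalCost = roles.reduce((sum, r) => sum + r.loadedAnnualCost, 0);
  if (totalCost <= 0) return { state: "unknown" };
  const weighted = roles.reduce(
    (sum, r) => sum + r.loadedAnnualCost * TIER_CONFIDENCE[r.tier],
    0,
  );
  return { state: "scored", confidence: weighted / totalCost };
}
```

Weighting by cost means a well-sourced large department outweighs a guessed small one, which matches how the total labour cost is actually composed.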

6. API surface

All under /v1, Node runtime, authenticated with existing api_keys (argon2id) and gated by plan-gate.ts. New scopes: companies:read, companies:write. Same machine-readable error format { error: { code, message, field?, hint? } }.
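
The shared error format could be typed as below. The apiError helper and the invalid_domain code used in the usage note are hypothetical, shown only to illustrate the envelope.

```typescript
// Hypothetical typing of the shared /v1 error envelope.
interface ApiErrorBody {
  error: {
    code: string;
    message: string;
    field?: string;
    hint?: string;
  };
}

// Builds the envelope, omitting optional keys that were not supplied.
function apiError(
  code: string,
  message: string,
  extra: { field?: string; hint?: string } = {},
): ApiErrorBody {
  const error: ApiErrorBody["error"] = { code, message };
  if (extra.field !== undefined) error.field = extra.field;
  if (extra.hint !== undefined) error.hint = extra.hint;
  return { error };
}
```

For example, apiError("invalid_domain", "Domain is not valid", { field: "domain" }) yields an error object with code, message and field, and no hint key.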

POST /v1/companies — enrich one domain. Charges a lookup if the company is new to this org (or ?refresh=true). Returns the enriched record, or 202 + job_id if the research is cold and runs async.
POST /v1/companies/bulk — body { domains: [...] } or CSV upload. Normalises + dedupes, creates a job + queued items, returns 202 { job_id, accepted, rejected, duplicates_removed }.
GET /v1/companies — the org library, paginated, filter by tag/added_at, format=json|csv|ndjson. This is the outreach pull.
GET /v1/companies/{domain} — one enriched record.
POST /v1/companies/{domain}/refresh — user-requested refresh, charges a lookup.
GET /v1/companies/export — whole (or filtered) library as a downloadable file.
GET /v1/jobs/{id} — bulk job status (% complete, counts).
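
The "normalises + dedupes" step of the bulk endpoint might look like the sketch below. The function names, the domain pattern, and the choice to strip a leading www. are assumptions. The pattern is deliberately simple and would reject some valid names, such as internationalised TLDs.

```typescript
// Hypothetical bulk intake: normalise, validate, dedupe.
interface BulkIntake {
  accepted: string[];
  rejected: string[];
  duplicates_removed: number;
}

// Assumption: dot-separated labels of letters, digits and hyphens, alphabetic TLD.
const DOMAIN_PATTERN = /^(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,}$/;

// Reduces a URL or host string to a bare lowercase domain, or null if invalid.
function normaliseDomain(raw: string): string | null {
  const host = raw
    .trim()
    .toLowerCase()
    .replace(/^[a-z][a-z0-9+.-]*:\/\//, "") // drop the scheme
    .replace(/[\/?#].*$/, "")               // drop path, query, fragment
    .replace(/:\d+$/, "")                   // drop the port
    .replace(/^www\./, "")                  // assumption: www is not significant
    .replace(/\.$/, "");                    // drop a trailing root dot
  return DOMAIN_PATTERN.test(host) ? host : null;
}

// Splits raw input into accepted domains, rejected inputs and a duplicate count.
function intakeDomains(inputs: string[]): BulkIntake {
  const seen: Record<string, true> = {};
  const result: BulkIntake = { accepted: [], rejected: [], duplicates_removed: 0 };
  for (const raw of inputs) {
    const domain = normaliseDomain(raw);
    if (domain === null) {
      result.rejected.push(raw);
    } else if (seen[domain]) {
      result.duplicates_removed += 1;
    } else {
      seen[domain] = true;
      result.accepted.push(domain);
    }
  }
  return result;
}
```

The 202 response would then report the lengths of accepted and rejected alongside duplicates_removed.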

Lookup charging rules

A lookup is consumed when a domain is added to an org's library (new) or refreshed.
Serving from the global company_research cache still consumes a lookup for the org (it is new data to them); the cache only saves us the LLM cost. That margin is intentional.
Viewing, listing, downloading, and API-pulling existing library entries is always free.
Hard cap: if the org's lookups this billing period have reached the plan bundle, the enrich/refresh endpoints return 402 with { error: { code: "lookup_quota_exceeded", message, hint: "upgrade your plan" } }, in the same error format as the rest of /v1. No overage is billed. Caps: Free 10 one-time, Pro 100/mo, Agency 500/mo pooled across the agency tree, Enterprise custom.
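
The charging rules above reduce to two small decisions, sketched here. The names are hypothetical. The caller is assumed to pass the usage count for the right window (lifetime for Free, the current billing period otherwise, summed across the agency tree for Agency). Treating Enterprise as uncapped is a placeholder for "custom".

```typescript
// Hypothetical quota gate; cap values are taken from the plan bundles above.
type Plan = "free" | "pro" | "agency" | "enterprise";

// null stands in for the custom Enterprise cap (assumption: no cap in this sketch).
const LOOKUP_CAPS: Record<Plan, number | null> = {
  free: 10,
  pro: 100,
  agency: 500,
  enterprise: null,
};

type QuotaDecision =
  | { allowed: true; remaining: number | null } // remaining before this lookup is charged
  | {
      allowed: false;
      status: 402;
      body: { error: { code: "lookup_quota_exceeded"; message: string; hint: string } };
    };

// A lookup is consumed for a domain new to the org, or for any refresh.
function consumesLookup(alreadyInLibrary: boolean, refreshRequested: boolean): boolean {
  return !alreadyInLibrary || refreshRequested;
}

// Hard cap: no overage, just an upgrade-required error once the cap is reached.
function checkLookupQuota(plan: Plan, used: number): QuotaDecision {
  const cap = LOOKUP_CAPS[plan];
  if (cap === null) return { allowed: true, remaining: null };
  if (used < cap) return { allowed: true, remaining: cap - used };
  return {
    allowed: false,
    status: 402,
    body: {
      error: {
        code: "lookup_quota_exceeded",
        message: `Lookup cap of ${cap} reached for the ${plan} plan`,
        hint: "upgrade your plan",
      },
    },
  };
}
```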

7. Bulk processing

enrichment_job_items is a queue. /api/cron/enrichment-worker (pg_cron, ~1 min) claims a batch with UPDATE ... SET status='running' ... WHERE id IN (SELECT id ... WHERE status='queued' AND (retry_after IS NULL OR retry_after <= now()) ORDER BY created_at LIMIT N FOR UPDATE SKIP LOCKED). The subquery is needed because a PostgreSQL UPDATE takes no ORDER BY, LIMIT, or locking clause of its own. The worker then runs the enrichment service per item, writes results, updates parent job counts, and sets the job done when no queued/running items remain.
Failures: exponential backoff (1/5/15 min), max 3 attempts, then failed with the error recorded. Configurable batch size and concurrency.
No automatic re-enrichment of org libraries: refresh stays user-initiated (per decision 4 in §3, lookup billing). Optional later: opportunistic global-cache warming for popular domains.
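
The failure rule above can be sketched as one function. The names are hypothetical. The sketch assumes "max 3 attempts" counts the first try, in which case only the 1 and 5 minute delays are ever reached. If it instead means three retries, pass maxAttempts = 4 and the 15 minute step applies too.

```typescript
// Hypothetical retry decision for a failed enrichment_job_items row.
const BACKOFF_MINUTES = [1, 5, 15];

type RetryDecision =
  | { status: "queued"; retryAfter: Date }
  | { status: "failed" };

// attempts = tries made so far, including the one that just failed.
function afterFailure(attempts: number, now: Date, maxAttempts = 3): RetryDecision {
  if (attempts < 1) {
    throw new RangeError("attempts counts tries made so far and starts at 1");
  }
  if (attempts >= maxAttempts) return { status: "failed" };
  // Clamp to the last step so a larger maxAttempts keeps using 15 minutes.
  const index = Math.min(attempts, BACKOFF_MINUTES.length) - 1;
  const retryAfter = new Date(now.getTime() + BACKOFF_MINUTES[index] * 60 * 1000);
  return { status: "queued", retryAfter };
}
```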

8. Pricing and quota (no overage)

New "research lookups" dimension is a hard monthly cap per plan, not metered overage. Counted from the company_lookups ledger for the current billing period.
Tier caps in plan-gate.ts: Free 10 one-time, Pro 100/mo, Agency 500/mo pooled across the agency tree, Enterprise custom.
When the cap is reached, enrich/refresh returns 402 lookup_quota_exceeded with an upgrade hint. No humanhours_lookup Stripe meter and no overage cron: the user upgrades to raise the cap. (The existing event-overage meter humanhours_event is untouched.)
Cache marketed as a feature: "rerun free, you only pay once" (per-org ownership).
Price migration Pro EUR 19 -> 49 and Agency EUR 99 -> 249 with 12-month grandfathering is its own guarded sub-step: create new Stripe price objects, keep existing customers on grandfathered prices, migrate new signups first. Treated as a careful billing-ops task, sequenced after the quota lands.

9. Frontend

New authed section app/(app)/companies/, using the existing dark cockpit design system (mint #4ade80 accent, Geist/Geist Mono).

Library (/companies): table of enriched companies with confidence badges and staleness badges, tags, bulk-upload (paste list or CSV), download buttons (CSV/JSON), per-row refresh.
Detail (/companies/{domain}): company profile, labour cost per role, the AI business case, and source attribution per datapoint.
Pricing page (app/(marketing)/pricing): add the "research lookups" and "bulk" columns and the migrated tier prices. No PowerPoint export in v1.

10. Docs, eval, quality

Docs (docs/): API docs for the new /v1/companies endpoints, the enriched-record schema, export formats, bulk job lifecycle, and lookup billing. The reference-data methodology gets its own doc (seed of the open "AI Task Baseline Vocabulary" standard).
Eval: port WF4 as a deterministic, code-based suite against a versioned ground-truth fixture (NL/UK/DE/US mix). Scores headcount/country/wage accuracy and gates overall >=80% in CI. Wage comparison strips the employer factor to compare like for like (ground truth is gross survey wage).
Env: add the OpenRouter key (and any stat-API keys) to apps/web/lib/env.ts Zod schema so missing config fails fast, consistent with existing env validation.
Tests: unit tests for every packages/core/enrichment pure function; integration tests for the API routes and the claim-worker.
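
A sketch of the eval gate under stated assumptions. The 80% gate and the employer-factor stripping come from the text above. The EvalCase shape, the 20% relative tolerance, and the equal weighting of the three dimensions are placeholders, not the WF4 scoring rules.

```typescript
// Hypothetical eval scoring; tolerance and weighting are placeholders.
interface EvalCase {
  domain: string;
  truth: { country: string; headcount: number; grossWage: number };
  predicted: { country: string; headcount: number; loadedWage: number; employerFactor: number };
}

const GATE = 0.8;               // from the spec: overall accuracy >= 80%
const RELATIVE_TOLERANCE = 0.2; // assumption

function withinTolerance(predicted: number, truth: number): boolean {
  if (truth === 0) return predicted === 0;
  return Math.abs(predicted - truth) / Math.abs(truth) <= RELATIVE_TOLERANCE;
}

// Share of the three dimensions (country, headcount, wage) judged correct.
function scoreCase(c: EvalCase): number {
  // Strip the employer factor so the wage is compared gross against gross.
  const predictedGross = c.predicted.loadedWage / c.predicted.employerFactor;
  const checks = [
    c.predicted.country === c.truth.country,
    withinTolerance(c.predicted.headcount, c.truth.headcount),
    withinTolerance(predictedGross, c.truth.grossWage),
  ];
  return checks.filter(Boolean).length / checks.length;
}

// Mean score over the fixture, compared against the gate.
function evaluate(cases: EvalCase[]): { accuracy: number; passed: boolean } {
  if (cases.length === 0) throw new Error("ground-truth fixture is empty");
  const accuracy = cases.reduce((sum, c) => sum + scoreCase(c), 0) / cases.length;
  return { accuracy, passed: accuracy >= GATE };
}
```

In CI the suite would fail the build when passed is false.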

11. Build sequence (phases)

Data model + migrations (company_research, org_companies, wages, enrichment_jobs/items, company_lookups, refresh_log) + RLS + seeded reference data.
Enrichment service: provider config (Sonar + Claude via OpenRouter), grounded research + extraction, deterministic core, cache-first, confidence rollup. Eval suite to >=80%.
Sync API: POST /v1/companies, GET list/detail, refresh, export (CSV/JSON), new api-key scopes, lookup charging + ledger.
Bulk: POST /v1/companies/bulk + pg_cron claim-worker + GET /v1/jobs/{id}.
Pricing/quota: lookup caps in plan-gate (hard cap + 402 upgrade, no overage), pricing page columns. Then the guarded EUR 49/249 migration + grandfathering.
Frontend: library + detail + upload + download + badges.
Docs + methodology doc + SDK touch-ups.

12. Open questions

Exact Sonar model tier (sonar vs sonar-pro) and per-lookup cost ceiling to stay within ~EUR 0.10-0.30 / company.
Which official statistics to call live (Eurostat lc_lci_lev, CBS, ECB FX) vs rely on seeded reference tables, per country, balancing reliability against the >=80% gate.
CSV column schema for the outreach export (which fields downstream campaigns need).
Whether bulk on Pro is capped (e.g. max 10/job) or Agency-only, an open pricing decision carried over from the vault pricing note.