For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (
- [ ]) syntax for tracking.
Goal: Make HumanHours classify every company's workforce into a controlled, international occupation vocabulary (ISCO-08) instead of 14 generic business buckets, show the company's real job titles as sublabels, and price each role worldwide from the best source in priority order: the company's own vacancy salary > an official statistic for that country+occupation (ILOSTAT / Eurostat SES / BLS) > grounded research > a seed floor. Wages are loaded with country-specific employer charges and normalized to the workspace currency.
Why (measured, arboconcern.nl, 2026-06-05): A 53-person occupational-health service stores other 22 @ EUR 29.82/h for its medical specialists, all from generic Eurostat NL, while its own vacancies state bedrijfsarts EUR 9,500-12,750/month (~EUR 55-75/h gross). Roles are wrong and the EUR 2.5M total is a serious underestimate. Root cause: extraction collapses everything onto 14 generic keys before wage lookup, so we ask "what does 'other' earn", never "what does a medical doctor earn".
Why ISCO + worldwide: ISCO-08 is the international occupation standard, so the global wage sources are already keyed on it. Choosing ISCO is what makes worldwide coverage feasible (a free-text or NL-only taxonomy would not). HumanHours must work for any country: ISCO is the universal key, and the wage waterfall degrades to grounded research + employer-stated pay anywhere a seed is missing.
Architecture:
- Occupation key: a curated ~50-entry ISCO-08 vocabulary (lean to minor-group granularity where wages diverge, e.g. doctors != nurses != pharmacists). The extraction LLM CLASSIFIES each real role into one ISCO key and keeps the verbatim title; a deterministic keyword mapper is the fallback.
- Wage waterfall (per role, picked by the continuous confidence scorer):
employer_stated(vacancy, 0.92) >official_statisticfor (country, isco) from ILOSTAT/Eurostat/BLS (0.90) > grounded research for the occupation (Sonar, 0.78) > seed floor. Thewagescache keyed on (country, isco) makes repeats instant. - Worldwide correctness: a country employer-loading-factor table (gross -> fully-loaded) and currency normalization through the existing FX util.
Tech Stack: Next.js 16, TypeScript, Supabase (Postgres) EU, OpenRouter (Sonar + Claude), Vitest. Pure logic (ISCO mapping fallback, vacancy parse, salary-to-hourly, role matching, wage-source ranking) lives in @agent-metrics/core, unit-tested without DB or network. Data ingestion lives in packages/db/scripts.
Phasing (each phase ships value):
- ISCO vocabulary + classification + display.
- Employer-stated vacancy wages (biggest accuracy win, works globally today via research fallback).
- Global official wage seed by ISCO (ILOSTAT + Eurostat + BLS).
- Country loading factors + currency normalization.
Blast radius: changes role labels and wages for ALL companies. Verify across diverse industries AND countries in Phase 1/2/4. Existing 426 records only change on re-enrich (TTL or force-refresh).
Phase 1: ISCO occupation vocabulary + classification
Task 1.1: Curated ISCO-08 occupation vocabulary
Files:
-
Create:
packages/core/src/enrichment/isco.ts -
Test:
packages/core/src/enrichment/isco.test.ts -
Modify:
packages/core/src/index.ts -
Step 1: Write the vocabulary. A controlled list of occupation groups, each
{ key, isco, label, examples }. Lean to ISCO minor groups where wages differ. Seed content below; EXTEND to ~50 covering management, professionals, technicians, clerks, service/sales, trades, operators, elementary, plus anotherfallback. Keys are stable slugs (the wage/cache key).
export interface IscoOccupation {
key: string; // stable slug, the wage + cache key
isco: string; // ISCO-08 code (major/sub-major/minor)
label: string; // human label for display
examples: string[]; // titles the LLM/keyword mapper should fold in
}
export const ISCO_OCCUPATIONS: IscoOccupation[] = [
{
key: "chief_executives",
isco: "112",
label: "Executives & senior officials",
examples: ["ceo", "managing director", "directeur", "founder"],
},
{
key: "admin_managers",
isco: "121",
label: "Business & administration managers",
examples: ["operations manager", "finance manager", "office manager"],
},
{
key: "sales_marketing_managers",
isco: "122",
label: "Sales & marketing managers",
examples: ["head of sales", "commercial manager", "marketing director"],
},
{
key: "ict_managers",
isco: "133",
label: "ICT managers",
examples: ["cto", "it manager", "head of engineering"],
},
{
key: "medical_doctors",
isco: "221",
label: "Medical doctors",
examples: [
"bedrijfsarts",
"arboarts",
"company doctor",
"physician",
"occupational physician",
"huisarts",
"specialist",
],
},
{
key: "nursing_midwifery",
isco: "222",
label: "Nursing & midwifery professionals",
examples: ["verpleegkundige", "nurse", "praktijkondersteuner"],
},
{
key: "other_health_professionals",
isco: "226",
label: "Other health professionals",
examples: [
"psycholoog",
"psychologist",
"fysiotherapeut",
"physiotherapist",
"arbeidsdeskundige",
"diëtist",
],
},
{
key: "health_associates",
isco: "32",
label: "Health associate professionals",
examples: ["case manager verzuim", "doktersassistent", "paramedic"],
},
{
key: "software_developers",
isco: "2512",
label: "Software developers",
examples: ["developer", "software engineer", "programmer", "backend", "frontend"],
},
{
key: "ict_professionals",
isco: "25",
label: "ICT professionals",
examples: ["devops", "data engineer", "system administrator", "qa engineer"],
},
{
key: "data_scientists",
isco: "212",
label: "Data & analytics professionals",
examples: ["data scientist", "analyst", "bi", "statistician"],
},
{
key: "engineering_professionals",
isco: "214",
label: "Engineering professionals",
examples: ["mechanical engineer", "electrical engineer", "civil engineer"],
},
{
key: "legal_professionals",
isco: "261",
label: "Legal professionals",
examples: ["lawyer", "advocaat", "jurist", "counsel", "compliance officer"],
},
{
key: "accounting_finance",
isco: "241",
label: "Finance professionals",
examples: ["accountant", "controller", "financial analyst", "boekhouder"],
},
{
key: "admin_professionals",
isco: "242",
label: "Administration professionals",
examples: ["consultant", "project manager", "policy advisor", "hr advisor"],
},
{
key: "teaching_professionals",
isco: "23",
label: "Teaching professionals",
examples: ["teacher", "docent", "trainer", "lecturer"],
},
{
key: "sales_marketing_professionals",
isco: "243",
label: "Sales, marketing & PR professionals",
examples: ["marketeer", "account manager", "growth", "content", "communications"],
},
{
key: "business_associates",
isco: "33",
label: "Business & admin associate professionals",
examples: ["sales rep", "bdr", "sdr", "bookkeeper", "buyer", "recruiter"],
},
{
key: "customer_service_clerks",
isco: "422",
label: "Customer service clerks",
examples: ["customer service", "support agent", "klantenservice", "helpdesk", "receptionist"],
},
{
key: "general_clerks",
isco: "41",
label: "General & keyboard clerks",
examples: ["administratief medewerker", "data entry", "secretary", "backoffice"],
},
{
key: "stock_logistics_clerks",
isco: "432",
label: "Material-recording & transport clerks",
examples: ["warehouse", "magazijn", "logistics", "supply chain", "planner"],
},
{
key: "personal_care_workers",
isco: "53",
label: "Personal care workers",
examples: ["verzorgende", "care worker", "thuiszorg"],
},
{
key: "retail_sales_workers",
isco: "52",
label: "Sales workers",
examples: ["verkoper", "shop assistant", "cashier", "winkelmedewerker"],
},
{
key: "drivers",
isco: "83",
label: "Drivers & mobile plant operators",
examples: ["chauffeur", "driver", "courier", "bezorger"],
},
{
key: "machine_operators",
isco: "81",
label: "Stationary plant & machine operators",
examples: ["operator", "productiemedewerker", "machine operator"],
},
{
key: "construction_trades",
isco: "71",
label: "Building & related trades",
examples: ["monteur", "installateur", "electrician", "plumber", "bouwvakker"],
},
{
key: "cleaners_helpers",
isco: "91",
label: "Cleaners & helpers",
examples: ["schoonmaker", "cleaner", "huishoudelijk"],
},
{
key: "labourers",
isco: "93",
label: "Labourers",
examples: ["uitzendkracht", "general worker", "helper"],
},
{ key: "other", isco: "0", label: "Other / mixed", examples: [] },
// EXTEND: add the remaining ~20 groups (hospitality 51x, protective services 54x,
// metal/electrical trades 72x/74x, science professionals 21x, etc.) to reach ~50.
];-
Step 2: Test the vocabulary integrity: keys unique, every entry has a non-empty label and isco code,
otherpresent. Run -> PASS. -
Step 3: Export
ISCO_OCCUPATIONSand the type frompackages/core/src/index.ts.tsc. Commitfeat(core): curated ISCO-08 occupation vocabulary.
Task 1.2: Deterministic ISCO keyword fallback mapper
Files:
-
Modify:
packages/core/src/enrichment/normalize-roles.ts(addmapToIsco; keepnormalizeRolefor the legacy 14-group rollup if still referenced elsewhere) -
Test:
packages/core/src/enrichment/normalize-roles.test.ts -
Step 1: Write failing tests:
mapToIsco("bedrijfsarts")->"medical_doctors",mapToIsco("arbeidsdeskundige")->"other_health_professionals",mapToIsco("software engineer")->"software_developers", an unknown string ->"other". -
Step 2: Implement
mapToIsco(raw): lowercase, then match against eachISCO_OCCUPATIONSentry'sexamples(substring/token), most-specific first; return the matchedkey, else"other". This is the safety net when the LLM classification is missing or invalid. -
Step 3: Run -> PASS.
tsc. Commitfeat(core): keyword fallback mapper to ISCO occupations.
Task 1.3: Extraction classifies into ISCO + keeps the real title
Files:
-
Modify:
apps/web/lib/enrichment/provider.ts(extractCompany) -
Modify:
packages/types/src/enrichment.ts(RoleEstimateSchema) -
Step 1: In
enrichment.ts, changeRoleEstimateSchemato{ role (the ISCO key), title (verbatim real title), headcount, confidence, confidence_score? }.rolebecomes the stable ISCO key;titleis the display sublabel. Keep.strict(). -
Step 2: In
extractCompany, give the LLM the ISCO vocabulary (build a compactkey: label (examples)list fromISCO_OCCUPATIONS) and instruct: for each real workforce role, output{ title: <verbatim role as the company names it>, role: <the single best ISCO key from the list>, headcount }. Headcounts roughly sum to total. Coerce in code: ifroleis not a valid ISCO key, fall back tomapToIsco(title). Keepconfidence: "llm_inferred". -
Step 3:
tsc. Commitfeat(enrichment): classify roles into ISCO occupations, keep real titles.
Task 1.4: Service + display use ISCO key with title sublabel
Files:
-
Modify:
apps/web/lib/enrichment/service.ts(headcountByRolekeys on the ISCOrole; carrytitleto output) -
Modify:
apps/web/app/(app)/companies/[domain]/page.tsx(label = ISCO label, showtitlesublabel) -
Step 1: In
service.ts, keyheadcountByRoleon the ISCOrole(already a stable key, nonormalizeRolecollapse), and keep atitleByRolemap for display.roles.push/wages.pushcarry bothrole(isco) andtitle. -
Step 2: In the detail page, render
iscoLabel(role.role)(a small map fromISCO_OCCUPATIONS) as the main label androle.titleas a muted sublabel.roleLabel(title-case) stays a fallback. -
Step 3:
tsc. Commitfeat(companies): show ISCO occupation group with the real job title.
Phase 2: Employer-stated vacancy wages (works globally today)
Task 2.1: VacancySalary type + pure helpers
Files:
-
Modify:
packages/types/src/enrichment.ts -
Create:
packages/core/src/enrichment/vacancy.ts+vacancy.test.ts -
Modify:
packages/core/src/index.ts -
Step 1: Add
VacancySalarySchema{ role, eur_low: number|null, eur_high: number|null, period: "month"|"year", currency: string (ISO-4217, default "EUR"), source_url? }to types. -
Step 2: Write failing tests for
parseVacancySalaries,loadedHourlyFromSalary,matchVacancyRole(see code below for behavior). -
Step 3: Implement
packages/core/src/enrichment/vacancy.ts:
export interface RawVacancySalary {
role: string;
eur_low: number | null;
eur_high: number | null;
period: "month" | "year";
currency: string;
source_url?: string;
}
export function parseVacancySalaries(rows: unknown): RawVacancySalary[] {
if (!Array.isArray(rows)) return [];
const out: RawVacancySalary[] = [];
for (const r of rows) {
const o = r as Partial<RawVacancySalary>;
if (typeof o?.role !== "string" || !o.role.trim()) continue;
const low = typeof o.eur_low === "number" && o.eur_low > 0 ? o.eur_low : null;
const high = typeof o.eur_high === "number" && o.eur_high > 0 ? o.eur_high : null;
if (low === null && high === null) continue;
out.push({
role: o.role.trim(),
eur_low: low,
eur_high: high,
period: o.period === "year" ? "year" : "month",
currency:
typeof o.currency === "string" && o.currency.length === 3
? o.currency.toUpperCase()
: "EUR",
source_url: typeof o.source_url === "string" ? o.source_url : undefined,
});
}
return out;
}
// Stated gross salary (monthly/yearly, possibly a range) in the row currency ->
// a fully-loaded hourly cost IN THE SAME CURRENCY via the range midpoint. The
// caller applies FX to the workspace currency (Phase 4). Null when unusable.
export function loadedHourlyFromSalary(opts: {
low: number | null;
high: number | null;
period: "month" | "year";
factor: number;
hoursPerYear: number;
}): number | null {
const vals = [opts.low, opts.high].filter((v): v is number => typeof v === "number" && v > 0);
if (vals.length === 0) return null;
const mid = vals.reduce((a, b) => a + b, 0) / vals.length;
const annualGross = opts.period === "month" ? mid * 12 : mid;
return Math.round((annualGross / opts.hoursPerYear) * opts.factor * 100) / 100;
}
const norm = (s: string) =>
s
.toLowerCase()
.replace(/[^a-z0-9\s]/g, " ")
.split(/\s+/)
.filter(Boolean);
// Best matching ISCO role key for a mined vacancy title, matching the vacancy
// title against each role's verbatim TITLE (passed as [key, title] pairs).
export function matchVacancyRole(
vacancyRole: string,
roles: { key: string; title: string }[],
): string | null {
const v = new Set(norm(vacancyRole));
if (v.size === 0) return null;
let best: { key: string; score: number } | null = null;
for (const { key, title } of roles) {
const r = norm(title);
if (r.length === 0) continue;
const overlap = r.filter((t) => v.has(t)).length;
if (overlap === 0) continue;
const score = overlap / Math.max(v.size, r.length);
if (!best || score > best.score) best = { key, score };
}
return best && best.score >= 0.5 ? best.key : null;
}- Step 4: Export from core index. Run tests -> PASS.
tsc. Commitfeat(core): vacancy parse, salary-to-hourly, role matching.
Task 2.2: Vacancy research provider
Files:
-
Create:
apps/web/lib/enrichment/vacancy-provider.ts -
Step 1:
researchVacancies(domain, opts)(Sonar) +extractVacancySalaries(brief, opts)(Claude -> JSON array ->parseVacancySalaries). Prompts proven inpackages/db/scripts/deep-research-spike.ts: find the company's own careers page + job boards, extract role title, salary range with currency + period, and the vacancy URL; report only stated pay. Resilient: returns[]on any failure. -
Step 2:
tsc. Commitfeat(enrichment): vacancy research provider.
Task 2.3: employer_stated tier + wage-source ranking
Files:
-
Modify:
packages/types/src/enrichment.ts(ConfidenceTierSchemaaddemployer_stated) -
Modify:
packages/core/src/enrichment/confidence.ts(TIER_PRIOR.employer_stated = 0.92) -
Modify:
packages/core/src/enrichment/confidence.test.ts -
Create:
packages/core/src/enrichment/wage-source.ts+ test -
Step 1: Add
employer_statedto the tier enum +TIER_PRIOR(0.92, aboveofficial_statistic0.9). Extend the confidence test. -
Step 2: Implement + test a pure picker that selects the best candidate wage per role by the continuous scorer:
import { datapointConfidence } from "./confidence";
import type { ConfidenceTier } from "@agent-metrics/types";
export interface WageCandidate {
loadedHourlyEur: number;
tier: ConfidenceTier;
sourceUrl: string | null;
matchQuality?: number;
staleYears?: number;
isFallback?: boolean;
}
// Pick the highest-confidence candidate. employer_stated (0.92) x exact match
// beats official_statistic (0.9) beats grounded research (0.78) beats floor.
export function pickWageCandidate(cands: WageCandidate[]): WageCandidate | null {
let best: { c: WageCandidate; score: number } | null = null;
for (const c of cands) {
const score = datapointConfidence({
tier: c.tier,
matchQuality: c.matchQuality,
staleYears: c.staleYears,
isFallback: c.isFallback,
});
if (!best || score > best.score) best = { c, score };
}
return best?.c ?? null;
}- Step 3: Run tests -> PASS.
tsc. Commitfeat(core): employer_stated tier + wage-source picker.
Task 2.4: Wire vacancies into resolution
Files:
-
Modify:
apps/web/lib/enrichment/service.ts -
Step 1: In
enrichCompany, run vacancy research (resilient, parallel where possible). Build a per-ISCO-key vacancyWageCandidateviamatchVacancyRole+loadedHourlyFromSalary(factor + hoursPerYear), tieremployer_stated, matchQuality 1, sourceUrl = vacancy URL. -
Step 2: In the per-role loop, assemble all candidates for the role (vacancy if present, plus the existing seed/reference/research/fallback result) and choose with
pickWageCandidate. Thread the chosen tier + sourceUrl + score intowagesoutput. -
Step 3:
tsc. Commitfeat(enrichment): employer-stated vacancy wages win the resolution.
Phase 3: Global official wage seed by ISCO
Task 3.1: wage_reference keyed on ISCO + sources
Files:
-
Create:
packages/db/migrations/0054_wage_reference_isco.sql -
Step 1: Add
isco_key text(and index on(country, isco_key, year)) towage_reference, plus keepsource/source_url/version. New rows key onisco_key; the legacy 14-role rows stay readable but are no longer matched once extraction emits ISCO keys. Document that the wage resolver now looks up byisco_key. -
Step 2: Apply via the session pooler (owner-approved prod migration). Commit
feat(db): wage_reference keyed on ISCO occupation.
Task 3.2: ILOSTAT global ingestion (the worldwide base)
Files:
-
Create:
packages/db/scripts/ingest-ilostat-wages.ts -
Step 1: Pull ILOSTAT mean nominal earnings by occupation (ISCO-08) and country (indicator family
EAR_*OCU*, via the ILOSTAT bulk/SDMX API at ilostat.ilo.org). Map ISCO codes onto the curatedISCO_OCCUPATIONSkeys (a code->key crosswalk in core), convert monthly/period earnings to gross hourly (using a default annual hours per country, refined in Phase 4), normalize currency to EUR via FX for the storedgross_hourly_eur(keep the source currency + original value in the row). Tier source labelILOSTAT <year>. -
Step 2: Upsert into
wage_reference(country, isco_key, year, source). Idempotent. Run for a broad country set. Commitfeat(db): ingest ILOSTAT wages by ISCO (global base).
Task 3.3: Eurostat SES (EU detail) + BLS OEWS (US)
Files:
-
Create:
packages/db/scripts/ingest-eurostat-ses.ts,packages/db/scripts/ingest-bls-oews.ts -
Step 1: Eurostat SES (
earn_ses*hourly earnings by ISCO occupation) for EU members ->wage_reference(finer than ILOSTAT for the EU). Source labelEurostat SES <year>. -
Step 2: BLS OEWS (US, by SOC) with a SOC->ISCO crosswalk ->
wage_referencefor US. Source labelBLS OEWS <year>. -
Step 3: Both idempotent, both map onto the curated keys. The wage resolver already prefers the most specific official row by
pickWageCandidate(official_statistic tier); when multiple official rows exist for a (country, isco), prefer the freshest (loweststaleYears). Commitfeat(db): ingest Eurostat SES + BLS OEWS by ISCO.
Task 3.4: Resolver reads ISCO official rows
Files:
-
Modify:
apps/web/lib/enrichment/service.ts(resolveLoadedHourlyBatch) -
Step 1: Change the reference lookups to query
wage_referencebyisco_key(the role keys are now ISCO keys), country first then a DEFAULT fallback, producingofficial_statisticcandidates withstaleYearsfrom the row year andmatchQuality1 (exact country) or 0.7 (DEFAULT). Real occupations missing from the seed fall through to grounded research (already works) then floor. -
Step 2:
tsc. Commitfeat(enrichment): resolve official wages by ISCO worldwide.
Phase 4: Country loading factors + currency
Task 4.1: Country employer-loading factors
Files:
-
Create:
packages/db/migrations/0055_employer_factors_seed.sql(extend the existingemployer_factorstable) -
Create/Modify: ingestion or seed of factors per country (OECD/ILO non-wage labour cost ratios)
-
Step 1: Seed
employer_factors(country, factor, hours_per_year) for a broad country set: gross-to-loaded multipliers (e.g. NL ~1.30, FR ~1.45, DE ~1.28, UK ~1.15, US ~1.20) plus realistic annual hours per country. Default factor stays 1.3 only for unknown countries. Source the ratios from OECD/Eurostat labour-cost data; store the source. -
Step 2: Apply (owner-approved). Verify
resolveLoadedHourlyBatchalready readsemployer_factorsby country (it does). Commitfeat(db): country employer-loading factors.
Task 4.2: Currency normalization of wages
Files:
-
Modify:
apps/web/lib/enrichment/service.ts, ingestion scripts,vacancy-provider.ts -
Step 1: Ensure every wage path normalizes the source-currency figure to a common base via the existing FX util before computing
loadedHourlyEur: vacancy salaries (use the rowcurrency), ILOSTAT/BLS rows (source currency), and grounded research (already asked in EUR). Store the original currency + value in provenance. -
Step 2: Confirm the displayed
blended_hourly_euris in the base and the UI applies the workspacefx(it already does for display).tsc. Commitfeat(enrichment): normalize worldwide wages through FX.
Task 4.3: End-to-end worldwide verification
Files: none (verification)
- Step 1: Run the full path on
arboconcern.nl: roles includemedical_doctors(title "bedrijfsarts") notother; the doctor wage resolves to ~EUR 55-86/h viaemployer_statedwith the vacancy URL; total corrects materially upward. - Step 2: Run on a diverse worldwide set: a US SaaS company (USD, BLS), a German Mittelstand (Eurostat), a non-EU/non-US company (ILOSTAT or research fallback). Confirm sane roles, correct currency, country-appropriate loading.
- Step 3: Force-refresh the real tenants (owner-approved, admin org 10k cap) so the dashboard shows corrected data.
Out of scope (note, don't do)
- All 436 ISCO unit groups. We curate ~50 wage-distinct groups; the LLM classifies into them, grounded research covers anything finer.
- Per-individual salary scraping. We use companies' own published vacancy ranges (aggregate, stated) only.
- Blending multiple wage sources per role. We PICK the single highest-confidence source (auditable provenance), never a blend.
- Real-time wage refresh. Seeds are periodic (annual stats); the
wagescache + TTL handle freshness.
Self-Review
- Fixes the measured problem and scales worldwide: ISCO classification (1.x) ends the "other 22" collapse; employer-stated vacancies (2.x) price the bedrijfsarts from the company's own posting; the ISCO-keyed global seed (3.x: ILOSTAT base + Eurostat + BLS) plus grounded-research fallback makes wages official where possible and present everywhere; loading factors + FX (4.x) make the numbers correct in any country and currency.
- Type consistency:
IscoOccupation/ISCO_OCCUPATIONS,RoleEstimate{role: iscoKey, title},VacancySalary,employer_statedtier,WageCandidateare each declared once and imported where used. The wage key is the ISCOrolethroughout (extraction, resolver, cache, seed). - No placeholders: pure helpers (ISCO mapper, vacancy parse/convert/match, wage picker) are given in full with tests; the vocabulary ships a real curated seed with an explicit extend-to-~50 instruction; ingestion tasks name the real datasets (ILOSTAT EAR_OCU, Eurostat earn_ses, BLS OEWS + SOC->ISCO crosswalk).
- Honest scope: official-stat confidence holds only where a seed row exists; elsewhere wages read as grounded research (0.78) or employer-stated (0.92), which is truthful. Phase 3 is the heaviest (data ingestion); Phases 1-2 already deliver correct roles + employer wages worldwide via the research fallback before any seed lands.