Bob Finance v2 — Master Guide

01 The bet: four moats.

Why this wins the category. Not because we're faster or prettier — because we picked four constraints the rest can't easily copy.

MOAT 01

Deterministic, not probabilistic.

The agent doesn't write numbers. It assembles typed building blocks (dimensions, inputs, assumptions, metrics, scenarios) and the engine evaluates them. The same inputs always produce the same outputs, and every output is auditable. Every assumption cites a benchmark source. Every metric carries an expression. Every number traces to a connector value or a catalog default. No hallucinations. No "the agent thinks revenue is around $4M." If the catalog doesn't have it, the plan card surfaces a PENDING confirm, not a guess.
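A minimal sketch of the "PENDING, not a guess" rule, assuming a lookup order of connector value first, catalog default second. The names `resolveNumber`, `ResolvedValue`, and the demo keys are hypothetical and not taken from the codebase.

```typescript
// Hypothetical shapes for illustration; not the actual engine types.
type ResolvedValue =
  | { status: "resolved"; value: number; source: string }
  | { status: "pending"; reason: string };

function resolveNumber(
  key: string,
  connectorValues: Map<string, number>,
  catalogDefaults: Map<string, { value: number; citation: string }>
): ResolvedValue {
  // 1. Prefer a value that came from a connector.
  const fromConnector = connectorValues.get(key);
  if (fromConnector !== undefined) {
    return { status: "resolved", value: fromConnector, source: "connector" };
  }
  // 2. Fall back to a cited catalog default.
  const fromCatalog = catalogDefaults.get(key);
  if (fromCatalog !== undefined) {
    return { status: "resolved", value: fromCatalog.value, source: fromCatalog.citation };
  }
  // 3. No traceable origin: ask for confirmation instead of guessing.
  return { status: "pending", reason: `No connector value or catalog default for "${key}"` };
}

// Demo data (made up for the example).
const connectorDemo = new Map<string, number>([["arr.total", 28800000]]);
const catalogDemo = new Map([["medicare.rate", { value: 0.0145, citation: "IRS" }]]);
```

Because the return type is a tagged union, a caller has to handle the `pending` branch explicitly before it can read a number.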

MOAT 02

Agent when you want it. Pure-play FP&A when you don't.

The chat is an accelerator, not a gate. The HC planner, IS, BS, and Workspace are full standalone surfaces. Drag a row, edit a cell, branch a scenario, run a report — without ever opening the agent. Open the agent when you want to scaffold faster, ask why a number is what it is, or model a what-if. Both modes share the same engine — agent edits and manual edits are equivalent operations on the typed entities underneath.

MOAT 03

First FP&A built native on HRIS.

People are usually 60–70% of the cost base. Every other FP&A tool gets HR data via flaky integrations, syncs, and CSV uploads. We're already inside Bob. Hiring funnel, workforce planning, comp changes, termination calendar, equity grants, country-specific statutory burden: all of it is here, native, real-time. People-cost forecasts start from the system of record because the data isn't pulled; it's the same database.

MOAT 04

Rollforward as a feature, not a ritual.

Quarter close is the most painful FP&A workflow, and most platforms push it back to spreadsheets and email threads. We build it in. A 5-step rollforward wizard with sheet-aware stale-flag detection walks the close: snapshot the baseline, pull actuals from connectors, branch the new period, generate the variance report, attach commentary. Variance and audit fall out of the typed-entity architecture for free. Close becomes an engine operation, not a two-week ritual.
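The five steps can be sketched as a fixed-order runner that refuses to skip ahead. The step names follow the list above; the `RollforwardRun` class is a hypothetical illustration, not the wizard's actual implementation, and it leaves stale-flag detection out.

```typescript
// Step order taken from the wizard description; identifiers are illustrative.
const ROLLFORWARD_STEPS = [
  "snapshotBaseline",
  "pullActuals",
  "branchPeriod",
  "generateVariance",
  "attachCommentary",
] as const;
type RollforwardStep = (typeof ROLLFORWARD_STEPS)[number];

class RollforwardRun {
  private completed: RollforwardStep[] = [];

  // Completes a step only if it is the next one in the fixed order.
  complete(step: RollforwardStep): void {
    const expected = ROLLFORWARD_STEPS[this.completed.length];
    if (expected === undefined) {
      throw new Error("Rollforward already finished");
    }
    if (step !== expected) {
      throw new Error(`Expected step "${expected}" but got "${step}"`);
    }
    this.completed.push(step);
  }

  get isDone(): boolean {
    return this.completed.length === ROLLFORWARD_STEPS.length;
  }
}
```

Making the order a property of the run, not of the UI, is one way a close could become "an engine operation": the same guard applies whoever drives it.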

02 Architecture in one breath

Stateless engine, agent loop, and persistent store — three layers, single source of truth.

UI Surface → Agent Loop → Tool Layer → Engine + HF → Catalogs + Connectors

FULL DIAGRAM

How the agent stays on rails.

A multi-agent system that proposes, builds, and explains — wrapped around a deterministic engine. Probabilistic LLM, deterministic engine, separated by design.


ENGINE

Pure data structures

Models · Components · Dimensions · Inputs · Assumptions · Metrics · Scenarios. Zero UI, zero I/O.

src/engine/engine.ts

STORE

Zustand + persist

Wraps the engine, exposes typed actions, drives React via selectors. Persists Maps to localStorage.
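Plain `JSON.stringify` turns a `Map` into `{}`, so persisting Maps needs a tagged encoding. The sketch below shows one common way to do that with a replacer/reviver pair; the `__map` tag and the function names are assumptions made for this example, not the store's actual serializer.

```typescript
// Encode Maps as { __map: [[key, value], ...] } so they survive JSON.
// The "__map" tag name is an arbitrary choice for this sketch.
function mapReplacer(_key: string, value: unknown): unknown {
  if (value instanceof Map) {
    return { __map: Array.from(value.entries()) };
  }
  return value;
}

// Rebuild Maps from the tagged form while parsing.
function mapReviver(_key: string, value: unknown): unknown {
  if (typeof value === "object" && value !== null && "__map" in value) {
    const entries = (value as { __map: [unknown, unknown][] }).__map;
    return new Map(entries);
  }
  return value;
}

function serializeState(state: unknown): string {
  return JSON.stringify(state, mapReplacer);
}

function deserializeState<T>(json: string): T {
  return JSON.parse(json, mapReviver) as T;
}
```

The resulting string is what would be written to `localStorage`. Nested Maps round-trip too, because the replacer is applied recursively and the reviver runs bottom-up.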

src/store/store.ts

AGENT

Planner / Builder / Analyst

Anthropic Messages API runner with plan-first checkpoint enforcement and multi-role handoff.

src/agent/loop.ts

EVALUATOR

HyperFormula at compute

Per-evaluation calculator inside ExpressionEvaluator. Metric DAG sits above; HF gets one scalar problem at a time.

src/engine/evaluator.ts

CATALOGS

Eight vocabularies

Templates · Components · Dimensions · Connectors · Inputs · Assumptions · Metrics · Scenarios. The vocabulary the agent draws from when scaffolding.

src/agent/catalogs/

SEED

Real-shape data

148 HiBob employees · 38 Salesforce customers · 24 months of QuickBooks GL. The credibility hook — "real data, no mockups."

src/data/seed-*.json

03 The building blocks

Five typed primitives. The agent doesn't write code — it composes these. Every model in the system is built from them. That's how we eliminate hallucination.

Dimension · DimNode[]

Coordinate axes — employee · customer · account · period · region · subsidiary · department · segment · productLine. A hierarchy tree the engine walks for projections.
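To make "a hierarchy tree the engine walks" concrete, here is a small sketch of a rollup over a dimension tree. The document names the type `DimNode` but does not show its fields, so `DimNodeSketch`, its fields, and the headcount split are assumptions for illustration only.

```typescript
// Hypothetical node shape; the real DimNode fields are not shown in this guide.
interface DimNodeSketch {
  id: string;
  children: DimNodeSketch[];
}

// Collects the ids of all leaves under a node (a childless node is its own leaf).
function leafIds(node: DimNodeSketch): string[] {
  if (node.children.length === 0) return [node.id];
  const out: string[] = [];
  for (const child of node.children) {
    for (const id of leafIds(child)) out.push(id);
  }
  return out;
}

// Rolls leaf values up to any node by summing over the leaves beneath it.
function rollup(node: DimNodeSketch, leafValues: Map<string, number>): number {
  let total = 0;
  for (const id of leafIds(node)) total += leafValues.get(id) ?? 0;
  return total;
}

// Demo tree and values (made up for the example).
const regionTree: DimNodeSketch = {
  id: "all",
  children: [
    { id: "emea", children: [{ id: "uk", children: [] }, { id: "de", children: [] }] },
    { id: "amer", children: [{ id: "us", children: [] }] },
  ],
};
const headcountByCountry = new Map([["uk", 30], ["de", 20], ["us", 98]]);
```

With this data, `rollup(regionTree, headcountByCountry)` sums all three leaves, and calling it on the `emea` child sums only `uk` and `de`.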

Input · number[] | Map

Time-series data from connectors or manual entry. Salaries, ARR, headcount actuals. Live-polling capable.

Assumption · scalar | Map

Editable parameters. Carries benchmark sources (IRS · HMRC · Pave · Bessemer · SHRM), caps/floors, scope (statutory vs company-policy).
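A sketch of what an editable, bounded assumption might look like. The field names and the clamping behavior on edit are assumptions for illustration; the guide only states that assumptions carry a source, caps/floors, and a scope.

```typescript
// Hypothetical shape; field names are illustrative.
interface AssumptionSketch {
  id: string;
  value: number;
  source: string; // benchmark citation, e.g. "IRS"
  scope: "statutory" | "company-policy";
  floor?: number;
  cap?: number;
}

// Returns a copy with the new value clamped into [floor, cap].
// Missing bounds are treated as unbounded.
function editAssumption(a: AssumptionSketch, next: number): AssumptionSketch {
  const lo = a.floor ?? -Infinity;
  const hi = a.cap ?? Infinity;
  return { ...a, value: Math.min(hi, Math.max(lo, next)) };
}

// Demo: a company-policy lever with made-up bounds.
const bonusTarget: AssumptionSketch = {
  id: "bonus.target",
  value: 0.1,
  source: "Pave",
  scope: "company-policy",
  floor: 0,
  cap: 0.3,
};
```

Returning a new object instead of mutating keeps the edit easy to diff and to record in an audit trail.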

Metric · expression

Formula-derived values. Topo-sorted by dependency; HyperFormula evaluates per period × per dim coordinate.
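The ordering step can be sketched with Kahn's algorithm. In this illustration each metric's `compute` function is a stand-in for handing one scalar expression to the formula evaluator; `MetricSketch`, `topoOrder`, and `evaluateMetrics` are hypothetical names, and the per-period, per-coordinate loop is omitted.

```typescript
interface MetricSketch {
  id: string;
  deps: string[]; // ids of metrics or inputs this metric reads
  compute: (values: Map<string, number>) => number; // stand-in for an expression evaluator
}

// Kahn's algorithm: emit metrics whose metric dependencies are already emitted.
function topoOrder(metrics: MetricSketch[]): MetricSketch[] {
  const byId = new Map<string, MetricSketch>();
  metrics.forEach((m) => byId.set(m.id, m));
  const indegree = new Map<string, number>();
  const dependents = new Map<string, string[]>();
  metrics.forEach((m) => {
    const metricDeps = m.deps.filter((d) => byId.has(d)); // inputs are not DAG nodes
    indegree.set(m.id, metricDeps.length);
    metricDeps.forEach((d) => {
      const list = dependents.get(d) ?? [];
      list.push(m.id);
      dependents.set(d, list);
    });
  });
  const ready = metrics.filter((m) => indegree.get(m.id) === 0).map((m) => m.id);
  const ordered: MetricSketch[] = [];
  while (ready.length > 0) {
    const id = ready.shift() as string;
    ordered.push(byId.get(id) as MetricSketch);
    (dependents.get(id) ?? []).forEach((next) => {
      const remaining = (indegree.get(next) ?? 0) - 1;
      indegree.set(next, remaining);
      if (remaining === 0) ready.push(next);
    });
  }
  if (ordered.length !== metrics.length) {
    throw new Error("Cycle detected in metric dependencies");
  }
  return ordered;
}

// Evaluates every metric once, in dependency order, on top of the inputs.
function evaluateMetrics(metrics: MetricSketch[], inputs: Map<string, number>): Map<string, number> {
  const values = new Map<string, number>();
  inputs.forEach((v, k) => values.set(k, v));
  topoOrder(metrics).forEach((m) => values.set(m.id, m.compute(values)));
  return values;
}
```

A cycle leaves some metrics with a non-zero in-degree, which is why the length check at the end doubles as cycle detection.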

Scenario · override Map

A delta-map of assumption overrides. Recession ≠ a copy of Baseline — it's a 1-row override.
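A sketch of the delta idea: a scenario stores only the assumptions it overrides and falls back to its parent, then to the baseline values. The parent-chain lookup and all names here are assumptions for illustration; the guide only states that a scenario is a map of overrides.

```typescript
// Hypothetical shape: a scenario is a name, an optional parent, and its overrides.
interface ScenarioSketch {
  name: string;
  parent?: ScenarioSketch;
  overrides: Map<string, number>;
}

// Branching allocates an empty override map; nothing from the parent is copied.
function branchScenario(parent: ScenarioSketch, name: string): ScenarioSketch {
  return { name, parent, overrides: new Map<string, number>() };
}

// Walks up the parent chain for an override, then falls back to the baseline value.
function resolveAssumption(
  scenario: ScenarioSketch,
  assumptionId: string,
  baselineValues: Map<string, number>
): number {
  let current: ScenarioSketch | undefined = scenario;
  while (current !== undefined) {
    const overridden = current.overrides.get(assumptionId);
    if (overridden !== undefined) return overridden;
    current = current.parent;
  }
  const base = baselineValues.get(assumptionId);
  if (base === undefined) throw new Error(`Unknown assumption "${assumptionId}"`);
  return base;
}

// Demo values (made up for the example).
const baselineValues = new Map<string, number>([["growth", 0.2], ["churn", 0.05]]);
const baselineScenario: ScenarioSketch = { name: "Baseline", overrides: new Map<string, number>() };
const recessionScenario = branchScenario(baselineScenario, "Recession");
recessionScenario.overrides.set("growth", 0.05); // the 1-row override
```

In this sketch branching is constant-time, while a lookup costs one step per level of branching, which is the trade made for never duplicating the model.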

DIMENSIONALITY

The compounding power lives in dim intersections.

Group revenue by customer × productLine. COGS by productLine only. OpEx by department × location. One config per line. The engine resolves allocation rules per dim source — HC employee leaves project burden by loaded-comp share; Workspace customer leaves project revenue by ARR share. Per-line affinity is strict in v1: unreachable dims render flat with a warning, never silently misallocated.
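Allocation by share is plain proportional arithmetic: each leaf receives the amount multiplied by its weight over the total weight. A minimal sketch, with `allocateByShare` as a hypothetical helper and made-up ARR figures:

```typescript
// Splits an amount across leaves in proportion to their weights
// (for example ARR per customer, or loaded comp per employee).
function allocateByShare(amount: number, weights: Map<string, number>): Map<string, number> {
  let total = 0;
  weights.forEach((w) => {
    total += w;
  });
  if (total <= 0) {
    // Refuse to allocate rather than misallocate silently.
    throw new Error("Weights must sum to a positive number");
  }
  const allocated = new Map<string, number>();
  weights.forEach((w, key) => allocated.set(key, (amount * w) / total));
  return allocated;
}

// Demo: 1,000 of revenue projected onto three customers by ARR share.
const arrByCustomer = new Map<string, number>([["acme", 300], ["globex", 100], ["initech", 100]]);
const revenueByCustomer = allocateByShare(1000, arrByCustomer);
```

Here `acme` holds 300 of 500 in ARR, so it receives 600 of the 1,000; the other two receive 200 each.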

05 The agent system

Three roles, one trust contract: every plan is reviewed before any mutation lands.

PLANNER

Proposes

Reads the user's intent + active model context. Drafts a PlanProposal — assumptions, metrics, user columns, dim setup. Cannot mutate. Hands off to Builder on approval.

BUILDER

Executes

Runs the approved plan via tools (createModel, addAssumption, addMetric, addUserColumn, setLineGrouping). Plan-first checkpoint enforcement at the engine layer (PR15).
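Enforcing the checkpoint at the engine layer means the mutation itself checks for an approved plan, so no prompt can bypass it. The sketch below models only the agent path; `GuardedEngine` and `ApprovedPlan` are hypothetical, while `addAssumption` borrows the tool name listed above.

```typescript
// Hypothetical plan record: an id plus the tools the approval covers.
interface ApprovedPlan {
  id: string;
  allowedTools: string[];
}

class GuardedEngine {
  private activePlan: ApprovedPlan | null = null;
  readonly assumptions = new Map<string, number>();

  // Called once the user approves the Planner's proposal.
  approve(plan: ApprovedPlan): void {
    this.activePlan = plan;
  }

  // A mutating tool: it checks the checkpoint before touching state.
  addAssumption(id: string, value: number): void {
    this.requirePlan("addAssumption");
    this.assumptions.set(id, value);
  }

  private requirePlan(tool: string): void {
    if (this.activePlan === null) {
      throw new Error(`Un-planned mutation: ${tool} called with no approved plan`);
    }
    if (this.activePlan.allowedTools.indexOf(tool) === -1) {
      throw new Error(`Plan ${this.activePlan.id} does not cover ${tool}`);
    }
  }
}
```

Because the check throws inside the engine, a Builder that skips approval fails loudly instead of leaving a partial, unreviewed change behind.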

ANALYST

Narrates

Reads the resulting state. Answers "why is this number what it is" with provenance. The "explain this number" role.
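"Why is this number what it is" can be answered by walking the typed graph from a metric back to its inputs and assumptions. A sketch under assumed shapes: `ProvenanceNode` and `explain` are hypothetical, and the graph is assumed to be acyclic.

```typescript
// Hypothetical node kinds mirroring inputs, assumptions, and metrics.
type ProvenanceNode =
  | { kind: "input"; connector: string }
  | { kind: "assumption"; source: string }
  | { kind: "metric"; expression: string; deps: string[] };

// Returns an indented trace from a value back to its origins.
function explain(id: string, graph: Map<string, ProvenanceNode>, depth = 0): string[] {
  const pad = "  ".repeat(depth);
  const node = graph.get(id);
  if (node === undefined) return [`${pad}${id}: unknown origin`];
  if (node.kind === "input") return [`${pad}${id}: input from ${node.connector}`];
  if (node.kind === "assumption") return [`${pad}${id}: assumption cited to ${node.source}`];
  const lines = [`${pad}${id} = ${node.expression}`];
  for (const dep of node.deps) {
    for (const line of explain(dep, graph, depth + 1)) lines.push(line);
  }
  return lines;
}

// Demo graph (expression and ids are made up for the example).
const provenanceGraph = new Map<string, ProvenanceNode>([
  ["LoadedComp", { kind: "metric", expression: "Salary * (1 + MedicareRate)", deps: ["Salary", "MedicareRate"] }],
  ["Salary", { kind: "input", connector: "HiBob" }],
  ["MedicareRate", { kind: "assumption", source: "IRS" }],
]);
```

Calling `explain("LoadedComp", provenanceGraph)` yields the metric's expression on the first line, followed by one indented line per dependency naming its connector or citation.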

06 End-to-end FP&A.

Forecast people, expenses, revenue, balance sheet, scenarios, close — one product, one engine, one source of truth. The agent helps when you ask. Standalone surfaces work without it.

👥

PEOPLE COSTS · YOUR BIGGEST EXPENSE

Forecast comp to the employee.

HRIS-native means we already have the roster.

Pull every employee straight from HiBob — hires, terms, promotions, comp changes, FX, country-specific tax / benefits / equity rates. Layer the catalog's statutory rates (Social Security, Medicare, FUTA, payroll tax by country) and your company-policy levers (401k match, equity, bonus targets) on top. Fully-loaded comp lands per employee, per month, in the right currency.

148 employees · 5 countries · 5 currencies
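The layering of statutory rates and company-policy levers reduces to simple arithmetic. Below is an annual, single-currency, US-only sketch. The 6.2% and 1.45% figures are the US employer Social Security and Medicare rates; the wage base and the 4% retirement match are placeholder inputs for this example, and `CompPolicy` and `loadedAnnualComp` are hypothetical names. FUTA, equity, bonus, and FX are left out.

```typescript
interface CompPolicy {
  socialSecurityRate: number; // statutory, employer share
  socialSecurityWageBase: number; // statutory annual cap on taxable wages
  medicareRate: number; // statutory, employer share
  retirementMatchRate: number; // company-policy lever
}

// Fully-loaded annual comp = salary + capped Social Security + Medicare + match.
function loadedAnnualComp(salary: number, policy: CompPolicy): number {
  const socialSecurity =
    Math.min(salary, policy.socialSecurityWageBase) * policy.socialSecurityRate;
  const medicare = salary * policy.medicareRate;
  const retirementMatch = salary * policy.retirementMatchRate;
  return salary + socialSecurity + medicare + retirementMatch;
}

// Illustrative policy; in the product these values would come from the catalog.
const usPolicy: CompPolicy = {
  socialSecurityRate: 0.062,
  socialSecurityWageBase: 168600, // placeholder; the cap changes every year
  medicareRate: 0.0145,
  retirementMatchRate: 0.04, // made up for the example
};
```

For a 100,000 salary this comes to roughly 111,650: 6,200 of Social Security, 1,450 of Medicare, and 4,000 of match on top of base pay. Above the wage base, only the Social Security term stops growing.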

📊

P&L · GL-GROUNDED

Build the income statement.

QuickBooks transactions roll up by chart of accounts.

59 GL accounts, 24 months of actuals, auto-populated. Salaries flow from the HC roster. Revenue flows from the Workspace. Group revenue by customer × productLine, COGS by productLine, OpEx by department × location, with one config per line. Subtotal formulas drive engine math: edit OperatingIncome in the Library and the cascade propagates.

59 GL accounts · 27,603 transactions · 24 mo actuals

🔀

SCENARIOS · BRANCH AND DIFF

Model what-ifs in milliseconds.

Recession isn't a copy of Baseline. It's a delta.

Branch from any scenario, override one assumption (or twenty), and watch every dependent metric ripple. Compare two scenarios cell-by-cell — variance, % change, dimension-aware. Stored as Map<assumptionId, value>; no model duplication, no drift.

O(1) branch · 0 copies
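A cell-by-cell comparison can be sketched as a join on cell keys that reports variance and percent change. The key format and the `diffScenarios` helper are assumptions for illustration; a base of zero yields `null` for percent change instead of dividing by zero.

```typescript
interface CellVariance {
  key: string;
  base: number;
  compare: number;
  variance: number; // compare minus base
  pctChange: number | null; // null when the base is zero
}

// Compares two evaluated scenarios over the cells they share.
function diffScenarios(base: Map<string, number>, compare: Map<string, number>): CellVariance[] {
  const rows: CellVariance[] = [];
  base.forEach((baseValue, key) => {
    const compareValue = compare.get(key);
    if (compareValue === undefined) return; // only cells present in both are compared
    const variance = compareValue - baseValue;
    rows.push({
      key,
      base: baseValue,
      compare: compareValue,
      variance,
      pctChange: baseValue === 0 ? null : variance / baseValue,
    });
  });
  return rows;
}

// Demo cells keyed as "metric|period" (format made up for the example).
const baselineCells = new Map<string, number>([["Revenue|2026-01", 100], ["Bonus|2026-01", 0]]);
const recessionCells = new Map<string, number>([["Revenue|2026-01", 80], ["Bonus|2026-01", 5]]);
const scenarioDiff = diffScenarios(baselineCells, recessionCells);
```

In the demo, revenue drops from 100 to 80, a variance of -20 and a change of -20%, while the bonus row reports a variance of 5 with no percentage.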

💰

REVENUE · WORKSPACE-DRIVEN

Revenue, ARR, pipeline, custom KPIs.

Salesforce data + your own modeling on top.

38 customers, 124 opportunities, 3 segments — pulled from Salesforce with their attrs (segment, productLine). Build ARR forecasts by customer or segment, capacity models, win-rate funnels, custom KPIs. Multi-tab Workspace: time / dimension / custom column axes. Cross-sheet references resolve to the canonical engine.

$28.8M ARR base · $19.6M open pipeline

📅

QUARTER CLOSE · 5 STEPS

Run the close, audit the variance.

Snapshot, pull actuals, branch, report, file.

The rollforward wizard walks you through the boring stuff that makes the numbers trustworthy. Snapshot the baseline before close, pull actuals from connectors, branch the new period, generate the variance report, attach commentary. Stale-flag detection runs per sheet: if anything got pushed to the wrong place, you know.

5 steps · auto variance · full audit trail

🤖

THE AGENT · WHEN YOU WANT IT

Three roles, one trust contract.

Planner proposes, Builder executes, Analyst narrates.

Ask anything: "forecast comp", "build a 2026 hiring plan", "why is OpEx $4M over budget". The Planner drafts a typed plan card. You approve. The Builder runs the approved plan via tools. The Analyst answers with provenance. The agent can't mutate without your approval — enforced at the engine, not the prompt.

3 roles · 0 un-planned mutations

07 v1 → v2: the leap.

v1 was built before AI agents existed and before composability was a category requirement. v2 isn't a refactor — it's a different shape entirely. Same workflow, fundamentally different architecture.

Agent

v1: None. The product predates the agent era. Every number was manually entered or directly calculated.

v2: First-class agent loop. Planner proposes typed plans, Builder executes against the engine, Analyst narrates with provenance. Plan-first is enforced at the engine: the agent cannot mutate without your approval.

Performance

v1: Slow. Recomputes were monolithic and full-model; large workbooks dragged.

v2: Topo-sorted metric DAG + HyperFormula. Only dependents recompute. Scenarios branch in O(1) via an override Map, with no model duplication. Sub-second feel even on large planners.

Worksheets

v1: No worksheets. Users had to leave the product and switch to Excel for any non-canonical model.

v2: Workspace component. Multi-tab, time / dimension / custom column axes. ARR forecasts, capacity models, and win-rate funnels are built inside the product. Blank-grid spreadsheets with Excel-syntax formulas (in flight).

Assumptions

v1: No assumptions as a first-class concept. Drivers were buried in formulas; changing one meant editing every formula it touched.

v2: Typed Assumption primitive. Catalog-cited values (IRS · HMRC · Pave · Bessemer · SHRM), statutory vs company-policy scope, dim-keyed by country / segment / band, caps and floors. Edit once and every dependent metric moves.

Child row references

v1: Couldn't reference child rows. A formula at the parent level had no way to see or react to the rows beneath it.

v2: Cross-component drivers (crossComponentRef). HC LoadedComp flows into IS Salaries; ARR per customer flows into Revenue; child rows project into parent rollups via dim-leaf attribute walks. Hire one person and every dependent number ripples.

Dimensions

v1: Weak. Limited slicing; couldn't compose intersections; rollups were single-axis at best.

v2: Dimensions are first-class primitives. Hierarchy trees, attribute projection, per-line cartesian intersections (customer × productLine, department × location). Allocation rules per dim source. One config per IS line, so Revenue and COGS can group differently.

Traceability

v1: None. Numbers floated free of their formulas; "where did this come from" was a tribal-knowledge question.

v2: Every number traces back. Metrics carry an expression. Assumptions carry a benchmark source. Inputs carry a connector source. Click any cell and provenance walks back through the typed graph to its origin.

Auditability

v1: None. No audit trail of who changed what, no enforcement of review before mutation, no provenance per number on export.

v2: Audit-ready by construction. The plan-first checkpoint, enforced at the engine, throws on un-planned mutations. Every assumption + metric + scenario carries a citation chain. Provenance per number, not per export.

Honest framing: v2 is a working prototype today, not a shipped product. The architecture is verified end-to-end through 724 passing tests, real-shape connector data, and an autonomous battle-test of the agent loop. Building it out — to multi-tenant production, real OAuth integrations, customer-grade SLAs — is the next investment. The bet here is that the architecture is right, the moats are defensible, and the leap from v1 is large enough to justify the build.

08 The stack

Modern, fast, built for AI-native composition.

FRONTEND

Vite 5 · React 18 · TS 5

Strict TypeScript, Vite-fast HMR, no SSR.

STATE

Zustand + persist

Map-aware serializers, 4-field partialize, multi-version migrations.

FORMULA

HyperFormula 3.0

~400 Excel-compatible functions. Used as compute-time evaluator only.

AGENT

Anthropic SDK

Messages API + tool-use stepping. scriptedClient stub for tests.
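Tool-use stepping with a scripted stub can be sketched as a loop that keeps executing tool calls until the model ends its turn. The types below are simplified stand-ins, not the SDK's actual types, and this `scriptedClient` is a guess at what a test stub of that name might do. The loop is kept synchronous for brevity; a real client call would be asynchronous.

```typescript
// Simplified turn shapes loosely modeled on tool-use responses; not the SDK's types.
type ModelTurn =
  | { stopReason: "tool_use"; toolName: string; toolInput: Record<string, unknown> }
  | { stopReason: "end_turn"; text: string };

interface ModelClient {
  next(history: string[]): ModelTurn;
}

// Test stub: replays a fixed list of turns, so tests need no network or model.
function scriptedClient(script: ModelTurn[]): ModelClient {
  let index = 0;
  return {
    next(): ModelTurn {
      const turn = script[index];
      if (turn === undefined) throw new Error("Script exhausted");
      index += 1;
      return turn;
    },
  };
}

type ToolFn = (input: Record<string, unknown>) => string;

// Steps the conversation: run each requested tool, record the result, stop on end_turn.
function runLoop(client: ModelClient, tools: Map<string, ToolFn>, maxSteps = 10): string {
  const history: string[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const turn = client.next(history);
    if (turn.stopReason === "end_turn") return turn.text;
    const tool = tools.get(turn.toolName);
    if (tool === undefined) throw new Error(`Unknown tool "${turn.toolName}"`);
    history.push(`${turn.toolName} -> ${tool(turn.toolInput)}`);
  }
  throw new Error("Step limit reached");
}
```

A step limit is a cheap guard against a model that never stops requesting tools, and a scripted client makes the loop's behavior fully repeatable in tests.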

TESTING

Vitest · 724 passing

62 test files across store · engine · agent · ui. Property-based + regression-freeze + topo predicates.

DEPLOY

Live deck on GH Pages

fy26-budget-deck.html. Live at josephgarafalo-byte.github.io/fy26-budget-deck.
