# Part 2 — Billing engine architecture for tree-shaped products

Second half of the session: 1:1 resource→line-item model breaking under product trees, the Charge model, four-stage rating pipeline, rate-context cascade, derived charges, Burstable Internet worked example, build-vs-buy analysis, and a one-month tactical plan.

---

## User

So another thing... until now we have had a system that looks like this: we process audit entries to create usage records which have a lifecycle (ordered, created, connected, etc...). Each usage record represents a resource type. Some attrs of that resource can be used to look up a SKU which maps to prices over time. The nice thing was we had 1:1 resource to usage record and usage record to billable thing/line item on an invoice. It was all simple. Even stuff like port groups which had children were just broken into individual ports and billed as a normal port. Links and Connections, our complex types which have A and Z sides (VXCs), even stay simple because the A and Z are references so everything just gets billed on its own.

I created an engine for generic pricing modifications on these usage records. Generic matcher and generic action applied to price. Multiple types of price modifications, promo codes, reseller discounts, etc... each pricing mod type could be dynamic or "grandfathered". If it's grandfathered, the mods active at order time never expire and price never changes (we always grandfather prices). If a price mod is dynamic (like reseller discounts, if they hit new tiers their discounts go up or down on existing stuff) I used clever algorithms to split the timeframe of the input into discrete chunks with different mods applied/prices. These then layer and the rules about layering the two mod types are handled properly so it can be done in any order with as many mod types as we want to exist (admin overrides, custom plans from resellers, etc...).
So at the end of the day, the input would be a usage record and some time frame and you'd get back a bunch of time segments with various mods applied for each segment resulting in different prices. Cool. Math works out, everything was great.

Now the 1:1 broke. We have complex products which end up being product trees. I have modeled the trees right and orchestration algos are going well. The pricing is proving tough. Now a tree with N nodes might result in >N lines (surcharges, prices not belonging to any particular thing). I have protocols for all the node types... whether it's a resource (tracked with a lifecycle from audit logs), Locatable (has a location), Priceable (has args which map to a SKU which prices can be fetched for), etc...

The question is: how do I make this work or think of it generically? Our old tabular reports won't really capture nesting, not a huge deal. What happens when mods apply to different parts of the thing? How do you split? You could have a ton of full copies of the full tree each with different prices? And some things can now come and go. Before, once something was created, it maintained the same shape. Now, nodes can come and go. Technically a port group could have ports added and removed, but that never mattered because we priced them by treating everything as its own thing. But now we can have a Link which has an Internet Z side which has one or more Network spaces. Those network spaces have various sizes and therefore are priced differently and based on how long they are alive/billable. And everything can be modified. How do I handle all of this sanely? What do people do?

## Assistant

This is the fundamental telco/cloud billing problem and the industry has converged on a single answer: **stop trying to map structure to charges. Treat them as independent layers.**

Your old 1:1 model worked because the resource graph happened to be flat. The graph went non-flat, and you tried to keep the 1:1 — that's the source of the pain.
The fix isn't to invent a clever tree-to-charges mapping. It's to admit they're decoupled.

## The reframe

You currently think:

```
Resource → Usage Record → Line Item
```

You need to think:

```
Resource Graph (structure) ──┐
                             ├─→ Charges (flat stream, attributed) → Invoice (presentation)
Pricing Rules + Mods ────────┘
```

Three layers. Each has its own time dimension. None of them know about each other's shape.

- **Resource layer**: trees, nodes, lifecycle, attributes. Source of truth for *what exists when*.
- **Rating layer**: takes (resource state at time T, pricing rules, mods) and emits a flat stream of charges. Pure function of the inputs.
- **Presentation layer**: takes the charge stream and groups it however the recipient wants — by tree, by SKU, by cost center, flat. Independent of how the charges were rated.

The tree shape becomes **metadata on each charge**, not the structure that contains charges. A port deep in a Link tree emits its own charges; those charges carry a `source_path: ["link/123", "z_side", "internet", "network_space/4"]` so you can group by tree position later, but the charges themselves live in a flat list.

This is how AWS Cost and Usage Reports work. It's how every TM Forum-aligned telco BSS works. It's how Metronome and Orb model usage-based SaaS billing. The flat charge stream with rich attribution is the shape that scales.
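To make the "tree shape is metadata" point concrete, here is a minimal sketch of rolling a flat charge list up into per-node subtotals and, separately, into per-SKU totals. The module name `ChargeStreamSketch`, the plain-map charge shape, the integer `cents` field, and the sample SKUs are all assumptions made for illustration; none of them come from the existing system.

```elixir
defmodule ChargeStreamSketch do
  # Hypothetical flat charges: amounts are integer cents, and the tree
  # position survives only as metadata in :source_path.
  def sample_charges do
    [
      %{sku: "LINK_MONTHLY", cents: 10_000, source_path: ["link/123"]},
      %{sku: "INTERNET_MONTHLY", cents: 20_000,
        source_path: ["link/123", "z_side", "internet"]},
      %{sku: "NETSPACE_SMALL", cents: 1_500,
        source_path: ["link/123", "z_side", "internet", "network_space/4"]},
      %{sku: "NETSPACE_LARGE", cents: 2_500,
        source_path: ["link/123", "z_side", "internet", "network_space/5"]}
    ]
  end

  # Tree view: each charge contributes to its own path and to every
  # ancestor prefix of that path, giving a subtotal per tree position.
  def subtotals_by_path(charges) do
    Enum.reduce(charges, %{}, fn charge, acc ->
      charge.source_path
      |> prefixes()
      |> Enum.reduce(acc, fn prefix, inner ->
        Map.update(inner, prefix, charge.cents, &(&1 + charge.cents))
      end)
    end)
  end

  # Flat view of the same data: totals per SKU, ignoring the tree entirely.
  def totals_by_sku(charges) do
    Enum.reduce(charges, %{}, fn charge, acc ->
      Map.update(acc, charge.sku, charge.cents, &(&1 + charge.cents))
    end)
  end

  # ["a", "b", "c"] -> [["a"], ["a", "b"], ["a", "b", "c"]]
  defp prefixes(path) do
    Enum.map(1..length(path)//1, &Enum.take(path, &1))
  end
end
```

Both functions read the same flat list. The nesting only appears when a view asks for it, which is the property the three-layer split is meant to buy.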
## Charges are first-class The data model that unblocks everything: ```elixir %Charge{ id: ..., amount: %Money{...}, # signed; discounts are negative period: %Interval{from: ..., to: ...}, # the time slice this charge covers sku: "PORT_1G_MONTHLY", # what it represents source: %{ # where it came from org_id: "...", tree_id: "link/123", node_path: ["z_side", "internet", "network_space/4"], node_type: :network_space, resource_id: "ns_abc", # back-reference to the lifecycle }, basis: %{ # what generated it type: :recurring | :one_time | :usage | :surcharge | :discount, base_price: %Money{...}, quantity: 1, rate_id: "rate_42", # which pricing version }, mods_applied: [ # full attribution chain %{type: :reseller_discount, id: ..., effect: -5_00}, %{type: :promo_code, id: ..., effect: -2_00}, ], reason: :recurring | :proration | :adjustment | :true_up, } ``` Key properties: - **Charges don't nest.** Even surcharges that "apply to the whole tree" are just additional charges with `source.node_path = []` (root). A discount on one node is a separate negative charge. - **`>N lines from N nodes` is no longer a paradox.** A node that creates a port + install fee + monthly recurring emits three charges. A structural-only node (a holding container) emits zero. The ratio falls out naturally. - **` boolean()`. You can also store this as a structured form for indexing — bucket mods by org/node_type so rating doesn't check every mod against every charge. At your scale (17k products × N mods) this matters. ## Tree-aware presentation, not tree-aware rating The reason you initially want to "split the tree by mod" is for the report. Don't do that. The report wants to *show* charges grouped by tree node. The rater doesn't care. Build presentation as a transformation over the charge stream: ```elixir charges |> Enum.filter(...) # period, customer, etc. 
|> Enum.group_by(&{&1.source.tree_id, &1.source.node_path}) |> render_as_tree(tree_snapshot) ``` Or for a flat invoice: ```elixir charges |> Enum.group_by(& &1.sku) |> Enum.map(&aggregate/1) ``` Or for a customer-facing simplified view: ```elixir charges |> Enum.group_by(&{&1.source.tree_id, &1.basis.type}) |> roll_up_with_subtotals() ``` Same data, three views, no re-rating. This is also what makes the "tabular report won't capture nesting" worry go away — *some* views capture nesting, others don't, and they all derive from the same flat charge data. ## The "everything can be modified" piece You already have audit-log-driven lifecycle. Extend the same pattern to: - **Tree topology**: each parent-child edge has effective_from/effective_to. Re-parenting is delete + insert with same node id but a new tree path. - **Node attributes**: same. Audit log tells you when the size of a network space changed; rating walks the timeline. - **Pricing rules**: rate cards have effective dates. Always. - **Mods**: have effective dates. Already do. Grandfathered ones snapshot the rule at order time so external rate-card changes don't affect them. Unified principle: **every input to rating is effective-dated, rating integrates over time.** No exceptions, no special cases. This is the only way to keep the math from getting hairy. ## Idempotency and re-rating Once charges are first-class records: - Rating is **deterministic** given (tree state history, rules, mods) within a period. - Re-rate is safe: regenerate charges for a period and diff against stored charges. Issue adjustment line items for any deltas. - Disputes/corrections get their own charges with `reason: :adjustment` and a back-reference to the original. Never mutate historical charges. This is the property that lets you ship without fear. Customer disputes a bill from 4 months ago? Re-run rating against the corrected inputs, generate the delta as new charges in the current period. 
Audit trail intact, no historical data mutated. ## What people actually do (industry pointers) If you want to read what others have converged on: - **TM Forum SID** (Shared Information/Data Model). Telco standard. Look specifically at the Product/Service/Resource separation, and the Customer Bill Inquiry / Charge / Adjustment entity model. Heavy, but the entity decomposition is what you want. - **Stripe Billing**: read their docs on Subscriptions + Subscription Items + Invoice Items + Credit Notes. Particularly the way an Invoice Item is just a queued line that lands on the next invoice — independent from the subscription that "owns" it. - **Metronome / Orb engineering blogs**: usage-event-based billing. Both have written publicly about how they decouple events from invoices. Closest to your model. - **AWS Cost and Usage Report (CUR) schema**: 100+ columns of attribution per line. Demonstrates how far you can push "flat charges with rich tags." - **Open source**: Killbill (Java) is the open-source reference billing engine. Worth skimming the data model — particularly how they handle subscription bundles, catalogs with effective dates, and credit/adjustment flows. Don't adopt it; just look at the schema. - **Book**: "Patterns of Enterprise Application Architecture" has the canonical Effective Date / Temporal Object patterns you're already informally using. Worth confirming the vocabulary. ## Sketch of the implementation shift You don't need to rewrite the rating engine. You shift it down a level: ```elixir defmodule Rating do @doc """ Rate one tree for a billing period. Returns a flat charge list. 
""" def rate(tree_id, %Interval{} = period, ctx) do events = Tree.events(tree_id, period) boundaries = derive_boundaries(events, ctx.rate_versions, ctx.mod_versions) boundaries |> Enum.chunk_every(2, 1, :discard) |> Enum.flat_map(fn [t_start, t_end] -> slice = %Interval{from: t_start, to: t_end} snapshot = Tree.snapshot(tree_id, t_start) snapshot |> billable_nodes() |> Enum.flat_map(&node_charges(&1, slice, ctx)) end) end defp node_charges(node, slice, ctx) do # Each Priceable node emits its base charges. base = Priceable.charges(node, slice, ctx) # Apply your existing mod-layering algorithm per charge. Enum.flat_map(base, &apply_mods(&1, slice, ctx)) end end ``` The `Priceable` protocol you already have — perfect. Each node type knows what charges it generates. Surcharges become a separate `surcharges_for(tree_snapshot, slice, ctx)` pass that emits charges attributed to the root or a parent node. Non-billable nodes simply don't implement `Priceable` (or implement it returning `[]`). Your existing layering algorithm runs in `apply_mods/3`. Same algorithm, just per-charge. ## What I'd actually do first In order: 1. **Define `%Charge{}`** with the attribution fields above. Get the team aligned that this is the unit of billing. 2. **Migrate one Priceable type** (the simplest) end-to-end: tree state → charges → invoice. Prove the model. 3. **Move your mod-layering to operate on charges.** Should be a near-mechanical port. 4. **Add tree-event boundary slicing** for membership changes. This is the new algorithmic piece. 5. **Build presentation transformations** for at least two views (tree-grouped, flat-by-SKU). Validates that the model supports both. 6. **Then** tackle complex types (Links/VXCs, Internet Z-side network spaces). They'll fall into place if the model is right. Don't try to design the whole new system before you ship a slice. The tree-with-network-spaces case is the hardest one — leave it for after you've proven the model on simpler cases. 
## The mental shift, in one line

> **Charges are events emitted by the rating function over (resource state × rules × mods × time). They live in a flat stream with rich attribution. Trees, mods, and presentations are different views over the same charge stream.**

Once that clicks, every specific worry you raised resolves to "compute it from the charge stream."

## User

Okay so we have a new product, burstable internet. So you've got a parent burstable internet node with a committed speed and burst speed attrs. Then you've got an Internet at the burst speed (with its whole subtree) and a Link or Connection at the burst speed (VXC, A and Z side references) with whatever A side and the Internet on the Z side. So right off the bat there's a surcharge of some percent of the total cost, and then there's a rule where when you look at a billing cycle, if more than 5% of the time was spent above the committed rate, it gets charged at the burst rate. So there's relationships with the pricing (it's like a function on the tree) and some stuff has to be resolved at billing time. How does that work in the above model, along with pricing mods and changing stuff?

## Assistant

This is the case that exposes whether your model is right. Burstable adds two new mechanics on top of what we discussed — and both have clean answers, but they require admitting that **rating is a multi-pass pipeline, not a single function**. Here's how it lands in the model.

## Two new mechanics

**1. Rate context cascades down the tree.** The "what speed/SKU am I priced at" question for the Internet and Link/VXC children depends on a decision made by their parent (`BurstableInternet`). That decision is a function of *usage*, not just topology. The children can't price themselves in isolation — they need context from ancestors. This is the same shape as CSS cascade, OpenTelemetry trace context, or React render context: a value computed at one level flows down and parameterizes computation at lower levels.
**2. Some charges depend on other charges.** The surcharge ("X% of total cost") is a derived charge — its amount is a function of charges already produced. You can't emit it in the same pass as base charges because total-cost isn't known yet. It needs a second pass after the leaves have been priced. These are the only two new abstractions you need. Together they make the burstable case routine. ## Rating becomes a four-stage pipeline ``` Stage 1: Slice the period (boundaries from events, mods, rate versions) Stage 2: Per slice — rate base charges (top-down context, bottom-up emission) Stage 3: Per slice — apply mods (your existing layering engine, per charge) Stage 4: Per slice — derived charges (surcharges, taxes, etc., over the result of 2-3) + apply mods to derived (mods can target the surcharge SKU too) ``` Each stage takes the output of the prior stage as input. Stages 2 and 4 are the new pieces; stage 3 is your existing engine, unchanged. ## Stage 2 in detail: rate context + bottom-up rollup The `Priceable` protocol grows one method: ```elixir defprotocol Priceable do @doc "Compute the rate context this node passes down to children." def rate_context(node, parent_ctx, period, ctx) @doc "Emit base charges for this node, given the effective context from ancestors." 
def base_charges(node, effective_ctx, period, ctx) end ``` A walk over the tree per slice: ```elixir defp rate_subtree(node, parent_ctx, period, ctx) do effective_ctx = Priceable.rate_context(node, parent_ctx, period, ctx) own_charges = Priceable.base_charges(node, effective_ctx, period, ctx) child_charges = Enum.flat_map(Tree.children(node), &rate_subtree(&1, effective_ctx, period, ctx)) own_charges ++ child_charges end ``` For a `BurstableInternet` node: ```elixir defimpl Priceable, for: BurstableInternet do def rate_context(node, parent_ctx, period, ctx) do usage = Usage.fetch(node.id, period) pct_above = Usage.fraction_above(usage, node.committed_speed) effective_speed = if pct_above > 0.05, do: node.burst_speed, else: node.committed_speed Map.merge(parent_ctx, %{ effective_speed: effective_speed, burstable_parent_id: node.id, burstable_decision: %{ committed: node.committed_speed, burst: node.burst_speed, pct_above_committed: pct_above, chose: if(pct_above > 0.05, do: :burst, else: :committed) } }) end def base_charges(_node, _effective_ctx, _period, _ctx) do # The parent itself emits no base charge. Its job is to set context. # The surcharge is a derived charge produced in stage 4. [] end end defimpl Priceable, for: Internet do def rate_context(node, parent_ctx, _period, _ctx), do: parent_ctx # Internet adds no new context, just inherits. def base_charges(node, %{effective_speed: speed}, period, ctx) do sku = SKU.lookup(:internet, %{speed: speed, location: node.location}) rate = Rate.fetch(sku, period, ctx) [Charge.new(sku: sku, period: period, amount: rate, source: source(node, ctx))] end def base_charges(node, _ctx, period, ctx) do # Fallback: standalone Internet (no burstable parent) — uses its own speed. 
sku = SKU.lookup(:internet, %{speed: node.speed, location: node.location}) rate = Rate.fetch(sku, period, ctx) [Charge.new(sku: sku, period: period, amount: rate, source: source(node, ctx))] end end ``` The `Internet` and `Link/VXC` impls become **pure functions of their effective context**. They don't need to know what made the burst decision — only what speed they're being charged at. The `burstable_decision` blob in the context is preserved for audit, so the eventual charge can carry attribution like *"this Internet was rated at burst speed because the BurstableInternet parent saw 7.3% time above committed, exceeding the 5% threshold."* The whole tree rates correctly with one top-down walk and no special cases for non-burstable Internet (which just inherits an empty/default context and uses its own attrs). ## Stage 4 in detail: derived charges Once base charges + mods are computed for a slice, you have a list of charges per subtree. Now run a second pass over the tree to emit derived charges: ```elixir defprotocol DerivedCharges do @doc """ Emit charges that depend on the charges produced for this subtree. Receives the (post-mod) charges attributed to this node and its descendants. 
""" def emit(node, subtree_charges, effective_ctx, period, ctx) end defimpl DerivedCharges, for: BurstableInternet do def emit(node, subtree_charges, ctx, period, _ctx) do base_total = subtree_charges |> Enum.map(&Money.to_decimal(&1.amount)) |> Enum.sum() surcharge_pct = SKU.lookup_surcharge(:burstable_internet, period) surcharge_amount = Money.multiply(base_total, surcharge_pct) [ Charge.new( sku: "BURST_SURCHARGE", period: period, amount: surcharge_amount, source: %{node_path: [node.id], node_type: :burstable_internet}, basis: %{ type: :surcharge, surcharge_pct: surcharge_pct, source_charge_ids: Enum.map(subtree_charges, & &1.id) } ) ] end end ``` Three things to notice: - **The surcharge holds explicit references to the source charges it derived from.** This is the audit trail — you can always answer "where did this $X surcharge come from" by following the references. Critical for disputes and customer-facing detail. - **The surcharge is itself a `Charge`**, which means stage 3 (mods) runs on it the same way as on base charges. A reseller might have a discount that excludes surcharges, or specifically targets them — `mod.scope.sku_pattern: ~r/^BURST_/`. Already supported. - **You decide pre-mod or post-mod by where you put stage 4.** As written above, stage 4 runs after mods, so the surcharge is on net. If your product semantics say "surcharge on list price," move stage 4 between stages 2 and 3 (compute surcharge from base charges before mods are applied). The model supports both — make it a per-surcharge declaration. ## How the burst decision interacts with mid-cycle changes The burst rule is **cycle-level** (5% of the *cycle* above committed) but everything else (mods, topology, rate cards) can change mid-cycle. Reconcile by: 1. **The slice boundaries are unchanged** — events, mod activations, rate version changes, topology changes still create slice boundaries. 2. 
**Within each slice, the burst decision is computed against the slice's effective committed speed**, looking at the slice's portion of usage. So if the customer upgrades their committed speed on day 10, slice [day 1–10] uses old-committed-speed for the threshold check, slice [day 10–30] uses new. 3. **The 5% rule applies per slice**, against that slice's usage. This is the only consistent interpretation when committed speed itself is time-varying. 4. **Document this loudly** because product owners often think of burst as a monthly-bill concept. Your engine treats it correctly per-slice; the docs should explain that mid-cycle changes can produce one slice billed at committed and another at burst. If product wants a different rule ("apply burst to the whole cycle if any slice triggered it"), that's a different `rate_context` implementation — compute the decision once over the whole cycle, then return that decision in every slice's context. Engine doesn't change; only the BurstableInternet impl does. This is the kind of flexibility you get from putting the decision logic inside the node type rather than baking it into the engine. ## How mods compose with derived charges The mod-layering engine already handles ordering, grandfathering, dynamic re-evaluation. It applies to a `Charge`. Surcharges are charges. Done. Two scoping cases worth being explicit about: - **A discount that should NOT apply to surcharges**: `mod.scope.basis_type != :surcharge` (or `sku_pattern` excludes burst SKUs). This is normal — promo codes typically exclude surcharges and taxes. - **A discount that targets the surcharge**: `mod.scope.sku_pattern: ~r/^BURST_/`. Less common but useful for "we're waiving the burst surcharge for Q4". Both fall out of the existing predicate-based scoping. No engine changes. The only real subtlety: **mods on derived charges are applied in stage 4's own mod pass**, not in stage 3. 
Otherwise, mods on the surcharge would have to be predicted before the surcharge exists. Two mod passes, same engine, ordering preserved by stage. ## Putting the full pipeline in one picture ``` Tree (effective at slice t1..t2) │ ▼ [Stage 2] Top-down rate context walk, bottom-up base charge emission │ ── BurstableInternet decides committed vs burst from usage │ ── Internet, Link/VXC inherit context, emit charges at chosen speed ▼ List (base, attributed to each node) │ ▼ [Stage 3] Per-charge mod layering (your existing engine) │ ── reseller discounts, promos, admin overrides ▼ List (base, post-mod) │ ▼ [Stage 4a] DerivedCharges walk (top-down or per-node) │ ── BurstableInternet emits surcharge = pct × subtree total │ ── (taxes, regulatory fees, eco contributions, etc. fit here too) ▼ List (base post-mod + derived) │ ▼ [Stage 4b] Per-derived-charge mod layering │ ── mods that target surcharges ▼ Final List for this slice │ ▼ Concatenate across slices → full charge stream for the period ``` Each stage is a pure function. The whole pipeline is a pure function of `(tree event log, usage telemetry, rate cards, mods)`. Re-running it on the same inputs gives the same charges. Audit, re-rating, and dispute resolution all work. ## What you have to add to the model you already have Concretely, the deltas: 1. **Usage telemetry as a rating input.** A way to query "what was the bandwidth profile for resource X over interval [t1,t2]" — fraction above threshold, peak, average, etc. Probably already exists for SLA/reporting; reuse it. If not, this is the biggest new system to build. 2. **`rate_context` on the `Priceable` protocol** (or a sibling protocol). Default impl just passes parent_ctx through. Most node types need no change. 3. **`DerivedCharges` protocol.** Implement for `BurstableInternet` first. Tax engines, regulatory fees, late charges all become `DerivedCharges` impls later — same shape. 4. **Stage 4 in the rating runner.** ~30 lines. 5. 
**Charge → Source-charge references.** Add `basis.source_charge_ids` to the `Charge` schema. This is the audit link. 6. **Slice boundaries gain a "usage profile changed materially" boundary** *only* if you decide some burst-style rule needs sub-slice resolution. For the cycle-level 5% rule, you don't need this — you compute the decision once per slice from the slice's full usage profile. If you ever add a rule like "every minute above 90% utilization is billed at burst", *that* would need sub-slice boundaries, and you'd derive them from usage instead of events. Same boundary-derivation pattern, new input. Item 1 is the unknown. Items 2–5 are mechanical. Item 6 is conditional and worth deferring until a product rule actually needs it. ## The principle this generalizes > **The tree is the structure. Rate context cascades down, costs roll up, derived charges reference base charges, and mods layer over both. All of it is per-slice. None of it is special-case.** Burstable, taxes, regulatory fees, minimum commitments, true-ups, volume tiers, and "package deal" pricing (buy 3 of these, pay for 2) all fit this same shape. The first one you implement (burstable) builds the framework; the rest are protocol implementations. The thing that makes this scale is the discipline: **never short-circuit a stage**. Never let a mod's logic peek at "the original price"; never let a base charge know about a surcharge; never let a derived charge skip the mod pass. Every charge goes through every applicable stage. Predictable behavior is what lets this stay sane as you add the 47th product type. ## User I need you to write a massive markdown document explaining all of this and then push it up to pb-gist. 
## Assistant **Tool call: `Write`** ```json { "file_path": "/Users/peter/billing_engine_architecture.md", "content": "# Billing Engine Architecture for Tree-Shaped Products\n\nA reference design for evolving a billing engine from flat resource-to-line-item mapping to one that handles arbitrary product trees, dynamic pricing modifiers, time-varying topology, usage-driven rate decisions, and derived charges (surcharges, taxes) — without losing the audit, grandfathering, and re-rating properties that made the original work.\n\n---\n\n## Table of Contents\n\n1. [The problem](#1-the-problem)\n2. [The architectural reframe](#2-the-architectural-reframe)\n3. [The Charge model](#3-the-charge-model)\n4. [Time as a first-class dimension](#4-time-as-a-first-class-dimension)\n5. [Modifiers](#5-modifiers)\n6. [The rating pipeline](#6-the-rating-pipeline)\n7. [Rate context cascade](#7-rate-context-cascade)\n8. [Derived charges](#8-derived-charges)\n9. [Worked example: Burstable Internet](#9-worked-example-burstable-internet)\n10. [Presentation layer](#10-presentation-layer)\n11. [Idempotency, re-rating, adjustments](#11-idempotency-re-rating-adjustments)\n12. [Implementation roadmap](#12-implementation-roadmap)\n13. [Industry references](#13-industry-references)\n14. [The principles](#14-the-principles)\n\n---\n\n## 1. The problem\n\n### 1.1 The original world: 1:1\n\nThe first generation of the system had a beautifully simple invariant:\n\n```\n1 Resource ↔ 1 Usage Record ↔ 1 Billable Line Item\n```\n\nAudit log entries (`created`, `updated`, `connected`, `deleted`) drove a Usage Record per resource. Each Usage Record had attributes that mapped to a SKU, and each SKU mapped to time-varying prices. A Usage Record for a billing period produced one line on an invoice.\n\nEven superficially complex things stayed simple under this model:\n\n- **Port groups with N children** became N independent ports, each priced as a normal port. 
The \"group\" was a UI/organizational concept, not a billing concept.\n- **VXC links with A/Z sides** stayed simple because A and Z were *references* to existing resources. Each end was billed on its own ticket; the link itself was just glue.\n\nA pricing modifier engine was built on top:\n\n- A generic predicate (matcher) selected which Usage Records a mod applied to.\n- A generic action transformed the price (percent off, fixed off, override, floor, ceiling).\n- Multiple mod types layered cleanly: promo codes, reseller discounts, admin overrides, custom plans.\n- **Grandfathered** mods were locked at order time and never re-evaluated; the snapshot of rules at that moment defined the price forever.\n- **Dynamic** mods re-evaluated on every billing run; e.g., a reseller hitting a new tier would see discounts update on existing inventory.\n- A time-slicing algorithm split a billing period at every boundary (mod activation, mod expiration, rate version change) and produced per-slice prices that, when summed, gave the period total.\n\nThe math worked. Invariants were strong. Reports were trivially flat tables.\n\n### 1.2 What broke\n\nComplex products changed the shape of reality:\n\n- **A tree with N nodes might produce N billable lines** (surcharges, taxes, regulatory fees, derived charges that aren't owned by any single node).\n- **Modifiers may apply to *part* of a tree**, not the whole thing. A reseller discount on Internet but not on the underlying Link.\n- **Topology changes mid-cycle.** Network spaces of varying sizes can be added to or removed from the Z-side of a Link during a billing period. Each is independently priced and prorated.\n- **Pricing rules can be functions on tree state**, not just attribute lookups. 
Burstable Internet bills at the burst rate only if usage exceeded the committed rate for >5% of the cycle.\n- **Some pricing decisions reference other charges.** A surcharge that's \"X% of total cost\" requires a roll-up before it can be computed.\n\nThe 1:1 model can't represent any of this without contortions. Trying to keep the 1:1 — by inventing fake \"container\" usage records, or duplicating the tree per-modifier, or pre-flattening trees into virtual line items — produces a system where every new product type breaks something else. The fix isn't a smarter mapping. The fix is to admit that **resource structure and billable charges are independent layers**.\n\n---\n\n## 2. The architectural reframe\n\n### 2.1 Three layers\n\nStop thinking:\n\n```\nResource → Usage Record → Line Item\n```\n\nStart thinking:\n\n```\n ┌─→ Resource Graph (structure, lifecycle)\n │\nSource data ────┤ + Rating Engine ─→ Charges (flat stream, attributed) ─→ Invoice (presentation)\n │\n └─→ Pricing Rules + Modifiers (effective-dated)\n```\n\nThree distinct layers, each with its own time dimension, none of which knows about the others' shape:\n\n- **Resource layer**: trees, nodes, edges, attributes, lifecycle. Source of truth for *what exists when*. Driven by audit events.\n- **Rating layer**: a pure function from `(resource state at time T, pricing rules, mods)` to a flat stream of `Charges`. Has no presentation concerns. Doesn't care what the invoice looks like.\n- **Presentation layer**: takes the charge stream and renders it however the recipient needs — tree-grouped, flat-by-SKU, customer-summary-vs-detail, by cost center, by region, however. 
Pure function of charges to output.\n\nCritically: **the tree shape becomes metadata on each Charge, not the structure that contains charges.** A port deep inside a Link's Z-side emits its own charges; those charges carry attribution like `source.node_path: [\"link/123\", \"z_side\", \"internet\", \"network_space/4\"]` so they can be grouped by tree position later — but in the rating layer they live in a flat list alongside every other charge in the system.\n\nThis decoupling is what makes everything else possible. It's how AWS Cost and Usage Reports work. It's how every TM Forum-aligned telco BSS works. It's how Metronome and Orb model usage-based SaaS billing. The flat-charge-stream-with-rich-attribution model is what scales.\n\n### 2.2 Why decoupling matters\n\nThree concrete benefits, all of which are immediately visible in your existing pain points:\n\n- **Re-rating is safe.** A pure rating function lets you re-run any historical period and either get the same charges (good) or generate a delta you can issue as adjustments (also good). You never mutate history.\n- **Presentation is cheap.** New report formats, customer-facing detail levels, partner-specific views, regulatory exports — all become transformations over the same charge stream. No new rating logic. No risk of presentation bugs producing different totals.\n- **New products don't change the engine.** Adding a new node type means writing a `Priceable` (and maybe `DerivedCharges`) implementation. The pipeline stays the same. The mod engine stays the same. The presentation layer stays the same.\n\n---\n\n## 3. 
The Charge model\n\n### 3.1 The Charge as the unit of billing\n\nThe atomic unit of the system is the `Charge`:\n\n```elixir\n%Charge{\n id: \"ch_...\",\n amount: %Money{cents: 4_500, currency: :usd}, # signed; discounts are negative\n period: %Interval{from: ~U[2026-04-01 00:00:00Z], to: ~U[2026-04-15 00:00:00Z]},\n sku: \"PORT_1G_MONTHLY\",\n source: %{\n org_id: \"org_xyz\",\n tree_id: \"link/123\",\n node_path: [\"z_side\", \"internet\", \"network_space/4\"],\n node_type: :network_space,\n resource_id: \"ns_abc\"\n },\n basis: %{\n type: :recurring, # :recurring | :one_time | :usage | :surcharge | :discount | :tax\n base_price: %Money{...},\n quantity: 1,\n rate_id: \"rate_42\", # which rate-card version\n source_charge_ids: [] # for derived charges, which charges they came from\n },\n mods_applied: [\n %{type: :reseller_discount, id: \"mod_1\", effect: %Money{cents: -500, currency: :usd}},\n %{type: :promo_code, id: \"mod_2\", effect: %Money{cents: -200, currency: :usd}}\n ],\n reason: :recurring, # :recurring | :proration | :adjustment | :true_up\n rated_at: ~U[2026-04-30 12:00:00Z]\n}\n```\n\nFive things to internalize about this shape:\n\n1. **Charges don't nest.** Even surcharges that \"apply to the whole tree\" are flat charges with `source.node_path = []` (root). A discount on one node is a separate negative-amount charge with that node's path. There is no tree of charges, ever.\n2. **Mods are line items, not transformations.** Instead of mutating a base price, the engine emits a separate negative charge with the mod's full attribution. Audit, refunds, \"why did I get this discount\" become trivially answerable. Display can collapse base+discount into a net line; the data layer keeps both.\n3. **Attribution is rich.** Every charge knows where it came from in the resource graph, which rate version produced it, which mods touched it, and (for derived charges) which other charges it was computed from. This is the audit trail.\n4. 
**`>N from N` and ` applicable_to(base)\n |> Enum.map(&mod_charge_for(&1, base))\n[base | mod_charges]\n```\n\nA 10% reseller discount on a $50 port produces:\n\n```\n%Charge{sku: \"PORT_1G_MONTHLY\", amount: $50.00, basis: %{type: :recurring}}\n%Charge{sku: \"DISCOUNT_RESELLER_T2\", amount: -$5.00, basis: %{type: :discount, source_charge_ids: [\"ch_port_1\"]}}\n```\n\nNet = $45. Display can collapse to \"Port (10% reseller discount): $45\". Detail view shows both. Disputes can refund just the base. Refunding the discount alone is also possible. Every dimension of the system gets simpler.\n\n### 3.3 Idempotency\n\nThe rating function is deterministic given its inputs:\n\n```\ncharges = Rating.rate(tree_event_log, usage_telemetry, rate_cards, mods, period)\n```\n\nSame inputs, same charges. Always. This single property unlocks:\n\n- **Safe re-rating.** Re-run for any historical period; diff against stored charges; emit deltas as adjustments.\n- **Dry runs.** Before a rule change, run rating against a future period with the new rules to see impact.\n- **Backfills.** A new product type can re-rate historical periods and emit catch-up charges with `reason: :true_up`.\n- **Testing.** Rating is a pure function; tests are obvious to write.\n\nThe cost is that every input must be effective-dated and immutable-as-of-a-time. No mutable global state in the rating function. No \"current rate card\" lookup that changes between runs. Every input has an as-of read.\n\n---\n\n## 4. 
Time as a first-class dimension\n\n### 4.1 Effective dating, everywhere\n\nEvery input to rating must answer the question \"what was your value at time T\":\n\n| Input | Effective-date carrier |\n| ------------------ | ------------------------------------------------------------------------- |\n| Resource topology | Audit log (created/updated/deleted/connected events) |\n| Tree membership | Edges with `effective_from` / `effective_to` |\n| Node attributes | Audit log entries with timestamps |\n| Rate cards | Rate versions with `valid_from` / `valid_to` |\n| Modifiers | Mod activations with effective intervals + grandfathering snapshots |\n| Usage telemetry | Time-series with timestamps |\n| Customer state | Customer attribute history (tier changes, plan changes) |\n\nThere are no exceptions. \"Current\" values are queries, not stored fields. This is non-negotiable; the moment a single input becomes \"the current value, mutated in place,\" re-rating breaks and grandfathering becomes impossible to reason about.\n\n### 4.2 Time slicing\n\nRating a period `[t_start, t_end]` for a resource works by:\n\n1. **Collecting all boundary events** in the period from every effective-dated input.\n2. **Sorting + deduplicating** boundaries into a list of slice points `[t_start, b_1, b_2, ..., t_end]`.\n3. **For each slice `[t_i, t_{i+1}]`**, computing a snapshot of every input as-of `t_i`, then running the rating pipeline against that snapshot for that interval.\n4. **Concatenating** the per-slice charges into the period's charge stream.\n\nThe slice is the unit of pure computation. Within a slice, nothing changes. 
Across slices, anything can change.\n\n### 4.3 Sources of slice boundaries\n\n```\nBoundary derivation =\n ∪ tree topology change events (node added, removed, re-parented)\n ∪ node attribute change events (size changed, location changed → new SKU)\n ∪ rate card version transitions (price changed)\n ∪ mod activation/expiration events (discount applied, promo expired)\n ∪ customer state change events (reseller tier changed)\n ∪ (optional) usage threshold crossings (only if a rule needs sub-slice resolution)\n```\n\nThe total cost is linear in the number of boundary events, *not* exponential. You're not generating Cartesian products of states; you're walking a sorted timeline.\n\n### 4.4 Membership changes are just slicing\n\nWhen network spaces are added to or removed from the Z-side of a Link during a billing period, this is structurally identical to a price change: it's an event at time `t` that changes one of rating's inputs.\n\nDay 5: network_space/4 added (size: /29).\nDay 12: network_space/4 size changed to /28.\nDay 20: network_space/4 removed.\n\nFor an April billing period:\n\n| Slice | Network spaces present | network_space/4 SKU |\n| --------------- | ---------------------- | -------------------------------------- |\n| Apr 1 – Apr 5 | (none) | — |\n| Apr 5 – Apr 12 | network_space/4 | NS_29_MONTHLY |\n| Apr 12 – Apr 20 | network_space/4 | NS_28_MONTHLY |\n| Apr 20 – Apr 30 | (none) | — |\n\nEach slice is rated independently. Charges are emitted per slice with `period` set to that slice's interval. The bill prorates correctly without any special \"membership change\" logic — it's just slicing.\n\n### 4.5 Cycle-level vs slice-level decisions\n\nSome pricing rules are evaluated per-slice (everything described so far). Some are evaluated per-cycle (the entire billing period). 
The model supports both.\n\nPer-slice example: \"if a reseller is at tier 2 during this slice, apply a 5% discount.\" Decision is local to the slice.\n\nPer-cycle example: \"Burstable Internet bills at burst rate if >5% of the *cycle* was above the committed rate.\" Decision is global to the period, then projected onto every slice within it.\n\nThe latter is implemented by computing the decision once over the full period's input data, then passing the result into every slice's rate context. The slicing engine doesn't need to know which rules are which — the node's `rate_context/4` impl decides whether to look at slice-local or period-global data.\n\n---\n\n## 5. Modifiers\n\n### 5.1 Scope predicates\n\nIn the 1:1 era, mod scope was implicit: a mod applied to a Usage Record. Now scope must be explicit because charges come from many shapes of nodes.\n\n```elixir\n%PriceMod{\n id: \"mod_42\",\n type: :reseller_discount,\n scope: %{\n org_id: \"org_x\",\n node_type: :port, # only port nodes\n sku_pattern: ~r/^PORT_\\d+G/, # only port SKUs\n tree_path_prefix: [\"link/123\"], # only under this tree (optional)\n basis_type: [:recurring, :usage] # not surcharges or taxes (optional)\n },\n effect: %{type: :percent, value: -10},\n effective: %Interval{from: ~U[...], to: ~U[...]},\n grandfathered?: true,\n ordering_priority: 100\n}\n```\n\nAt rating time, for each charge, find all applicable mods by matching the predicate. Apply via your existing layering algorithm. 
The grandfathered/dynamic distinction is unchanged: grandfathered mods have their scope and effect snapshotted at order time and ignore later rule changes; dynamic mods re-evaluate from current state every run.\n\n### 5.2 Layering rules (carried forward)\n\nYour existing layering engine handles:\n\n- **Order independence** for commutative mods (multiple flat-percent discounts in the same priority bucket).\n- **Order dependence** for non-commutative mods (override → percent off → cap).\n- **Floor/ceiling** clamping after each layer.\n- **Stop conditions** (a mod that says \"no further mods apply\").\n\nThis logic is unchanged. It just runs on `Charge`s instead of `UsageRecord`s. The interface to it is: given a charge and an ordered list of applicable mods, return the charge plus a list of mod-charge entries that, when summed, give the net amount.\n\n### 5.3 Performance and indexing\n\nAt your scale (17k products × N mods across 2k orgs), naive predicate evaluation becomes expensive: every charge × every mod = a lot of pattern matches. Bucket mods at load time:\n\n```elixir\nmod_index = %{\n by_org: %{\"org_x\" => [...]},\n by_node_type: %{:port => [...], :network_space => [...]},\n by_sku_prefix: %{\"PORT_\" => [...], \"NS_\" => [...]}\n}\n```\n\nFor each charge, look up candidate mods by intersecting buckets, then run full predicates on the candidates. Cuts mod-evaluation cost from O(M) to O(M/B) where B is the bucket count.\n\nThe DSL for predicates is worth investing in:\n\n```elixir\nmod_scope do\n org_id == \"org_x\"\n node_type in [:port, :network_space]\n sku =~ ~r/^PORT_/\n tree_path_starts_with([\"link/123\"])\nend\n```\n\nCompiles to both an evaluator function and a structured form for indexing. Pays off as scope predicates get hairier.\n\n---\n\n## 6. 
The rating pipeline\n\nThe pipeline is four stages:\n\n```\n┌──────────────────────────────────────────────────────────────────┐\n│ Stage 1: Slice the period │\n│ Inputs: tree event log, mod intervals, rate versions, period │\n│ Output: list of slices [(t_0, t_1), (t_1, t_2), ...] │\n└─────────────────────────────┬────────────────────────────────────┘\n │\n ▼ (per slice)\n┌──────────────────────────────────────────────────────────────────┐\n│ Stage 2: Base charges │\n│ Walk tree top-down: build rate context (cascade) │\n│ Walk tree bottom-up: emit base charges per node │\n│ Output: list of base Charges for the slice │\n└─────────────────────────────┬────────────────────────────────────┘\n │\n ▼\n┌──────────────────────────────────────────────────────────────────┐\n│ Stage 3: Mod layering on base charges │\n│ For each base charge, find applicable mods, apply layering │\n│ Output: base + discount/surcharge mod charges │\n└─────────────────────────────┬────────────────────────────────────┘\n │\n ▼\n┌──────────────────────────────────────────────────────────────────┐\n│ Stage 4a: Derived charges │\n│ Walk tree, emit derived charges that reference subtree totals │\n│ Examples: percent surcharge, regulatory fee, taxes │\n└─────────────────────────────┬────────────────────────────────────┘\n │\n ▼\n┌──────────────────────────────────────────────────────────────────┐\n│ Stage 4b: Mod layering on derived charges │\n│ Same engine as Stage 3, applied to derived charges │\n│ Allows mods that target surcharges (e.g., \"waive Q4 burst\") │\n└─────────────────────────────┬────────────────────────────────────┘\n │\n ▼\n Charges for slice\n │\n ▼ (concat across slices)\n Charges for period\n```\n\nEach stage is a pure function. The pipeline as a whole is deterministic. Re-running with the same inputs produces the same charges.\n\n### 6.1 Why not collapse stages\n\nIt's tempting to fold mods into base charge emission (\"just compute the final price in `base_charges/4`\"). Don't. 
Two reasons:\n\n- **Mods that target surcharges break.** The surcharge doesn't exist until stage 4. If mods are baked into stage 2, they can't see derived charges.\n- **Audit becomes ambiguous.** \"What was the list price before discounts\" requires the base charge to exist independently of the discount charge. Audit, dispute, customer-detail views all want this separation.\n\nTwo mod passes (stages 3 and 4b) using the same engine is the right answer.\n\n---\n\n## 7. Rate context cascade\n\n### 7.1 The CSS analogy\n\nA node deep in a tree often needs to know something computed by an ancestor — typically \"what speed/SKU am I priced at\" when an ancestor's rule overrides naive attribute lookup. This is the same shape as CSS cascade, OpenTelemetry trace context propagation, or React render context: a value computed at one level flows down and parameterizes computation at lower levels.\n\nThe mechanism is a per-node `rate_context/4` function that produces a context map, which is merged with parent context and passed to descendants:\n\n```elixir\ndefprotocol Priceable do\n @doc \"\"\"\n Return the rate context this node passes down to children.\n Default impl: pass parent_ctx through unchanged.\n \"\"\"\n def rate_context(node, parent_ctx, period, ctx)\n\n @doc \"\"\"\n Emit base charges for this node, given the effective context from ancestors.\n \"\"\"\n def base_charges(node, effective_ctx, period, ctx)\nend\n```\n\nA traversal:\n\n```elixir\ndefp rate_subtree(node, parent_ctx, period, ctx) do\n effective_ctx = Priceable.rate_context(node, parent_ctx, period, ctx)\n own_charges = Priceable.base_charges(node, effective_ctx, period, ctx)\n child_charges = Enum.flat_map(Tree.children(node), &rate_subtree(&1, effective_ctx, period, ctx))\n own_charges ++ child_charges\nend\n```\n\nThat's the whole mechanism. Most node types implement `rate_context/4` as `fn _, parent, _, _ -> parent end` (pass-through). 
Only nodes that *make decisions* override it.\n\n### 7.2 What goes in the context\n\nThe context map is whatever ancestors decided that descendants need. Common entries:\n\n- `effective_speed` — burstable parent decided burst-vs-committed.\n- `bundle_id` — a package node says \"you're inside Bundle X; emit no individual charges.\"\n- `parent_discount_pct` — a parent applied a wholesale discount to its subtree.\n- `currency_override` — a parent in a multi-currency arrangement forces children to a specific currency.\n- `tax_jurisdiction` — a parent location node propagates tax rules.\n\nAlways include audit information for the decisions that were made:\n\n```elixir\n%{\n effective_speed: 10_000,\n burstable_decision: %{\n committed: 1_000,\n burst: 10_000,\n pct_above_committed: 0.073,\n chose: :burst,\n threshold: 0.05\n }\n}\n```\n\nThe descendants don't need this for computation, but their charges should carry it forward into `basis` so the bill explains itself: *\"this Internet was rated at burst speed because the BurstableInternet parent saw 7.3% time above committed, exceeding the 5% threshold.\"*\n\n---\n\n## 8. Derived charges\n\n### 8.1 The protocol\n\nSome charges are functions of other charges: surcharges, taxes, regulatory fees, minimum-commitment true-ups. They can't be emitted in stage 2 because their inputs (the base charges) don't exist yet. 
Stage 4 handles them via a separate protocol:\n\n```elixir\ndefprotocol DerivedCharges do\n @doc \"\"\"\n Emit charges derived from this subtree's already-computed charges.\n Receives the post-mod charges attributed to this node and its descendants.\n \"\"\"\n def emit(node, subtree_charges, effective_ctx, period, ctx)\nend\n```\n\nA walk over the tree per slice, with each node receiving the charges produced under it.\n\n### 8.2 Source charge references\n\nEvery derived charge stores the IDs of the charges it was computed from:\n\n```elixir\n%Charge{\n sku: \"BURST_SURCHARGE\",\n amount: %Money{cents: 1_250}, # 10% of $125 of subtree base\n source: %{node_path: [\"burstable/9\"], node_type: :burstable_internet},\n basis: %{\n type: :surcharge,\n surcharge_pct: 0.10,\n source_charge_ids: [\"ch_internet_42\", \"ch_link_43\", \"ch_ns_44\", \"ch_ns_45\"]\n }\n}\n```\n\nThis is the audit trail. Any disputed surcharge can be traced back to its inputs. Refunding a base charge can trigger a downstream refund of the surcharge proportionally. Customer-facing detail can show the derivation.\n\n### 8.3 Pre-mod vs post-mod\n\nWhere in the pipeline a derived charge is computed determines what it includes:\n\n- **Stage 4 (post-mod)**: derived from base charges + their mod charges. Surcharge is a percentage of *net*. Customer pays X% of what they actually paid.\n- **Between stage 2 and stage 3 (pre-mod)**: derived from base charges only. Surcharge is a percentage of *list*. Independent of customer-specific discounts.\n\nBoth are real product semantics. 
Make it a per-derived-charge declaration:\n\n```elixir\ndefimpl DerivedCharges, for: BurstableInternet do\n def emit(node, subtree_charges, _ctx, period, ctx) do\n [\n %Charge{\n sku: \"BURST_SURCHARGE\",\n basis: %{\n type: :surcharge,\n calculation_basis: :net, # or :list\n surcharge_pct: ctx.surcharge_rate\n },\n ...\n }\n ]\n end\nend\n```\n\nThe pipeline runner chooses where to insert each derived-charge pass based on the declaration. Simple to implement, and lets product owners pick semantics per-surcharge without touching the engine.\n\n### 8.4 Common shapes that fit\n\nThe same protocol handles a lot:\n\n| Use case | Source charges | Derivation |\n| ------------------------- | ------------------------------------- | -------------------------------------------- |\n| Burstable surcharge | Subtree of the burstable node | percent × subtotal |\n| Tax | All taxable charges in jurisdiction | tax_rate(jurisdiction) × subtotal |\n| Regulatory fee | Charges of certain SKU classes | flat_per_unit × count, or percent |\n| Minimum commitment fee | All charges in commitment scope | max(0, commitment - subtotal) |\n| Volume tier rebate | Charges in tier scope | tier_table_lookup(subtotal) |\n| Bundle discount | Charges of bundled SKUs | bundle_price - sum(individual_prices) |\n| Late payment fee | Charges in unpaid invoice | percent × overdue_amount |\n\nAll implemented as `DerivedCharges` impls. Engine doesn't change. Only the protocol implementations grow.\n\n---\n\n## 9. 
Worked example: Burstable Internet\n\nA complete walkthrough of the most complex product to date.\n\n### 9.1 Product structure\n\nA `BurstableInternet` node with `committed_speed = 1G` and `burst_speed = 10G` has two children:\n\n- An `Internet` node configured at burst_speed (10G), with its own subtree of network spaces, etc.\n- A `Link` (VXC) at burst_speed with A-side reference and Z-side reference (the Internet above).\n\nTree shape:\n\n```\nBurstableInternet (committed=1G, burst=10G)\n├─ Internet (speed=10G)\n│  ├─ NetworkSpace (size=/29)\n│  └─ NetworkSpace (size=/28)\n└─ Link/VXC (speed=10G)\n   ├─ A side → (existing customer port)\n   └─ Z side → (the Internet above)\n```\n\n### 9.2 Pricing rules\n\n- The Internet and Link each have their own SKUs that depend on speed; network space SKUs depend on size.\n- The BurstableInternet parent applies a **surcharge of 10% of total subtree cost**.\n- The whole subtree is **billed at the committed speed by default**, but billed at burst speed for any cycle in which usage exceeded the committed rate for >5% of the cycle.\n- Standard mods (reseller discounts, promos) can apply to any or all of the above.\n\n### 9.3 Stage-by-stage trace\n\n**Setup**: April 2026 billing cycle. Usage telemetry says 7.3% of cycle was above 1G. Reseller discount of 10% applied to ports and links (not Internet, not surcharges). Promo \"WAIVE_BURST_Q2\" waives the burst surcharge.\n\n**Stage 1: Slice the period.** No mid-cycle topology changes, no mid-cycle rate changes, no mid-cycle mod changes. Single slice: `[Apr 1, Apr 30]`.\n\n**Stage 2: Base charges.**\n\nTop-down rate context cascade:\n\n1. Tree root processes first. 
`BurstableInternet.rate_context/4`:\n ```elixir\n def rate_context(node, parent_ctx, period, _ctx) do\n usage = Usage.fetch(node.id, period)\n pct_above = Usage.fraction_above(usage, node.committed_speed)\n threshold = 0.05\n\n # Decide once and record the actual outcome rather than hard-coding :burst.\n {chose, effective_speed} =\n if pct_above > threshold,\n do: {:burst, node.burst_speed},\n else: {:committed, node.committed_speed}\n\n Map.merge(parent_ctx, %{\n effective_speed: effective_speed,\n burstable_parent_id: node.id,\n burstable_decision: %{\n committed: node.committed_speed,\n burst: node.burst_speed,\n pct_above_committed: pct_above,\n chose: chose,\n threshold: threshold\n }\n })\n end\n ```\n Returns: `effective_speed: 10G` (burst, because 7.3% > 5%).\n\n2. `Internet`, `NetworkSpace`, and `Link` all use the default `rate_context` (pass-through). Each inherits `effective_speed: 10G`.\n\nBottom-up base charge emission (per node):\n\n- `BurstableInternet.base_charges/4` returns `[]` (parent itself has no base charge, only the surcharge in stage 4).\n- `Internet.base_charges/4`:\n ```elixir\n def base_charges(node, %{effective_speed: speed}, period, ctx) do\n sku = SKU.lookup(:internet, %{speed: speed}) # → INTERNET_10G_MONTHLY\n rate = Rate.fetch(sku, period, ctx) # → $80 (assumed here to be a %Money{} for the slice)\n [%Charge{sku: sku, amount: rate, period: period, ...}]\n end\n ```\n- `NetworkSpace` (the /29) → SKU `NS_29_MONTHLY` → $15.\n- `NetworkSpace` (the /28) → SKU `NS_28_MONTHLY` → $25.\n- `Link.base_charges/4` → SKU `LINK_10G_MONTHLY` → $40.\n\nBase charges after stage 2:\n\n| Charge | SKU | Amount |\n| --------------- | -------------------- | -------- |\n| ch_internet_1 | INTERNET_10G_MONTHLY | +$80.00 |\n| ch_ns_1 | NS_29_MONTHLY | +$15.00 |\n| ch_ns_2 | NS_28_MONTHLY | +$25.00 |\n| ch_link_1 | LINK_10G_MONTHLY | +$40.00 |\n\nSubtotal: $160.00. 
Each carries `basis.rate_id`, source attribution, and the inherited `burstable_decision` for audit.\n\n**Stage 3: Mod layering on base charges.**\n\nReseller discount mod scope: `node_type in [:port, :link]`, applies 10% off.\n\n- ch_internet_1: no applicable mod.\n- ch_ns_1, ch_ns_2: no applicable mod.\n- ch_link_1: matches. Emits a discount charge.\n\n```elixir\n%Charge{\n sku: \"DISCOUNT_RESELLER_T2\",\n amount: %Money{cents: -400}, # -$4.00\n basis: %{type: :discount, source_charge_ids: [\"ch_link_1\"]},\n ...\n}\n```\n\nAfter stage 3:\n\n| Charge | SKU | Amount |\n| ------------------------ | ---------------------- | --------- |\n| ch_internet_1 | INTERNET_10G_MONTHLY | +$80.00 |\n| ch_ns_1 | NS_29_MONTHLY | +$15.00 |\n| ch_ns_2 | NS_28_MONTHLY | +$25.00 |\n| ch_link_1 | LINK_10G_MONTHLY | +$40.00 |\n| ch_link_1_discount | DISCOUNT_RESELLER_T2 | -$4.00 |\n\nNet subtotal: $156.00.\n\n**Stage 4a: Derived charges.**\n\n`BurstableInternet.emit/5` runs against its subtree's post-mod charges:\n\n```elixir\n# Argument order follows the DerivedCharges protocol:\n# (node, subtree_charges, effective_ctx, period, ctx)\ndef emit(node, subtree_charges, _effective_ctx, period, ctx) do\n # Post-mod input, so this is the net total, not the list total.\n net_total = subtree_charges |> Enum.map(& &1.amount) |> Money.sum()\n surcharge_amount = Money.multiply(net_total, ctx.surcharge_rate) # 10%\n\n [\n %Charge{\n sku: \"BURST_SURCHARGE\",\n amount: surcharge_amount,\n period: period,\n source: %{node_path: [node.id], node_type: :burstable_internet},\n basis: %{\n type: :surcharge,\n calculation_basis: :net,\n surcharge_pct: ctx.surcharge_rate,\n source_charge_ids: Enum.map(subtree_charges, & &1.id)\n }\n }\n ]\nend\n```\n\nNet subtotal $156.00 × 10% = $15.60.\n\nAfter stage 4a:\n\n| Charge | Amount |\n| ------------------------ | --------- |\n| (the four base charges) | +$160.00 |\n| ch_link_1_discount | -$4.00 |\n| ch_burst_surcharge_1 | +$15.60 |\n\n**Stage 4b: Mod layering on derived charges.**\n\n`WAIVE_BURST_Q2` promo scope: `sku_pattern: ~r/^BURST_SURCHARGE/`, applies 100% off.\n\nMatches ch_burst_surcharge_1. 
Emits:\n\n```elixir\n%Charge{\n sku: \"DISCOUNT_PROMO_WAIVE_BURST_Q2\",\n amount: %Money{cents: -1_560}, # -$15.60\n basis: %{type: :discount, source_charge_ids: [\"ch_burst_surcharge_1\"]},\n ...\n}\n```\n\nFinal charge stream for the slice:\n\n| Charge | Amount |\n| ----------------------------------------- | --------- |\n| ch_internet_1 | +$80.00 |\n| ch_ns_1 | +$15.00 |\n| ch_ns_2 | +$25.00 |\n| ch_link_1 | +$40.00 |\n| ch_link_1_discount | -$4.00 |\n| ch_burst_surcharge_1 | +$15.60 |\n| ch_burst_surcharge_1_promo_waiver | -$15.60 |\n\nNet total: **$156.00**.\n\nEvery charge carries full attribution. The bill explains itself at any level of detail.\n\n### 9.4 Mid-cycle changes\n\nSuppose on Apr 15 the customer added a `/30` network space. Stage 1 produces two slices: `[Apr 1, Apr 15]` and `[Apr 15, Apr 30]`. Each slice runs the full pipeline independently:\n\n- Slice 1: same as above but no /30 charges.\n- Slice 2: includes a prorated /30 charge.\n\nThe burst decision: per the cycle-level rule, the engine examines the *full cycle's* usage to make the burst-vs-committed decision once, then injects that decision into both slices' contexts. (Or, if the rule were per-slice, it would be re-decided per slice. Both are supported by where you compute the decision.)\n\nThe surcharge in each slice is computed from that slice's own subtree totals and prorated naturally.\n\n### 9.5 Adding a new burstable variant\n\nSay product wants a new \"Burstable Internet Premium\" with the same shape but different rules: 8% threshold instead of 5%, and the surcharge calculation_basis is `:list` instead of `:net`. Implementation:\n\n- Define a new node type `BurstableInternetPremium`.\n- Implement `rate_context/4` with `0.08` instead of `0.05` (or parameterize on node attrs).\n- Implement `DerivedCharges.emit/5` with `calculation_basis: :list`.\n- Done. Pipeline doesn't change. Mod engine doesn't change. 
Presentation doesn't change.\n\nThis is the test of whether the architecture is right: can a new variant be added without engine changes? If yes, the abstractions are at the right level.\n\n---\n\n## 10. Presentation layer\n\nThe flat charge stream is transformed for display. Common transformations:\n\n### 10.1 Tree-grouped invoice\n\n```elixir\ncharges\n|> Enum.group_by(&{&1.source.tree_id, &1.source.node_path})\n|> render_with_subtotals(tree_snapshot)\n```\n\nRenders charges nested under their resource graph position. Useful for technical customers who want to see exactly what each piece of their infrastructure cost.\n\n### 10.2 Flat invoice by SKU\n\n```elixir\ncharges\n|> Enum.group_by(& &1.sku)\n|> Enum.map(&aggregate_with_count_and_total/1)\n```\n\nA traditional itemized invoice. Useful for finance teams reconciling against POs.\n\n### 10.3 Customer-friendly summary\n\n```elixir\ncharges\n|> Enum.group_by(&category(&1.basis.type))\n|> roll_up_with_subtotals()\n```\n\nCategories like \"Recurring services,\" \"Usage charges,\" \"Discounts,\" \"Taxes & fees.\" For non-technical readers.\n\n### 10.4 Cost-center / tag-based view\n\nIf `Charge.source` carries customer-defined tags (cost center, project, environment), grouping by those gives showback / chargeback views.\n\n### 10.5 Why presentation independence matters\n\nAll of these render from the same charge data. If totals don't reconcile across views, that's a presentation bug, not a billing bug — the source of truth is unambiguous. Adding a new presentation never requires re-running rating. Customer requests for new report formats become 50-line transformations, not engine changes.\n\nThe rule: **never let presentation logic do arithmetic that wasn't done by the rater.** Presentation can sum, average, and group. It must never compute prices, apply discounts, or invent line items. If a customer needs a new derived value, it becomes a charge in stage 4, not a calculation in the renderer.\n\n---\n\n## 11. 
Idempotency, re-rating, adjustments\n\n### 11.1 The contract\n\n```\nRating.rate(inputs, period) -> charges\n```\n\nis a pure function. Two invocations with the same inputs produce the same charges, identical in every field except generated IDs and `rated_at` timestamps. This is the contract.\n\n### 11.2 Re-rating workflow\n\nWhen an input changes (a rule was wrong, a usage feed was corrected, a topology event was backfilled), re-rating proceeds:\n\n1. **Snapshot the current charges** for the affected period as `previous_charges`.\n2. **Re-run the rating function** with corrected inputs as `current_charges`.\n3. **Diff** by `(source.resource_id, sku, period)`:\n - Charges in `current_charges` but not `previous_charges` → emit as new with `reason: :adjustment`, keeping the charge's own signed amount (a newly applicable discount stays negative).\n - Charges in `previous_charges` but not `current_charges` → emit a reversal with `reason: :adjustment` and the original amount negated.\n - Amount changes → emit a delta charge for the difference (`current - previous`).\n4. **Land the deltas in the next billing cycle** (or in a corrective invoice). Never mutate the historical period's charges.\n\nThe charges from step 3 carry references to the charges they replace/correct, so audit can always show the chain.\n\n### 11.3 Never mutate\n\nThree rules, without exception:\n\n- Never UPDATE a Charge. Always INSERT new ones with adjustment semantics.\n- Never re-emit historical invoices. Issue corrective invoices that reference the originals.\n- Never change a stored mod's effect or scope. Expire it (set `effective.to`) and create a new mod with the new values.\n\nMutation breaks every property of the system. Re-rating becomes ambiguous; audit becomes a lie; refunds become impossible to reason about. Every \"I just need to fix this one value\" is the start of a six-month untangling.\n\n### 11.4 Dispute resolution\n\nWhen a customer disputes a charge from four months ago:\n\n1. Identify the disputed charges by ID.\n2. Investigate the root cause — was the input wrong, or the rule, or the implementation?\n3. 
Correct the input (or fix the rule going forward).\n4. Re-rate the affected period.\n5. Diff produces adjustment charges, which land in the current period.\n6. Customer-facing communication references the original charge IDs and the adjustment IDs.\n\nThe audit trail makes this conversation precise: \"On April 7 you were charged $X under rule Y; we determined that should have been $Z, here's a credit of $X-Z applied to your May invoice.\"\n\n---\n\n## 12. Implementation roadmap\n\n### 12.1 What to build, in order\n\n1. **Define `%Charge{}` schema** with full attribution fields. Get the team aligned that this is the unit of billing. Persistence layer too — charges are first-class records, not derived blobs.\n2. **Implement `Priceable.base_charges/4`** for the simplest existing node type. Skip rate context; pass-through the parent context. Validate end-to-end: tree state → flat charges → stored.\n3. **Port the existing mod layering engine** to operate on charges. This is largely mechanical — the layering math doesn't change, only the input/output shape does.\n4. **Add boundary slicing for tree topology changes**. Get membership-change proration working on a non-burstable case (e.g., a network space added mid-cycle).\n5. **Build presentation transformations** for at least two views — tree-grouped and flat-by-SKU. Validate they reconcile. This forces the model to be right.\n6. **Add `Priceable.rate_context/4`** with default pass-through. No node uses it yet; just establish the cascade machinery.\n7. **Add `DerivedCharges` protocol and stage 4 of the pipeline**. Implement first for taxes (the simplest derived charge). Validate end-to-end.\n8. **Implement `BurstableInternet`** as the proof-of-concept for `rate_context` and `DerivedCharges` together. Use the worked example in section 9 as the integration test.\n9. **Add usage telemetry as a rating input** (if it isn't already). 
The burst-rate decision needs to query \"fraction of cycle above committed speed\" for a resource over a period. Reuse SLA reporting if possible.\n10. **Add adjustment / re-rate workflow**. This is operational, not architectural — but build it before you need it in anger.\n\n### 12.2 What to defer\n\n- **Sub-slice usage-driven boundaries.** No current rule needs them. Add when a product spec demands it.\n- **Predicate DSL for mod scopes.** Start with structured maps. Build the DSL when scope predicates become annoying to read.\n- **Mod indexing.** Naive iteration is fine until profiling says otherwise.\n- **New presentation views.** Build the two needed for launch; add others as customers ask.\n\n### 12.3 What not to do\n\n- **Do not invent fake \"container\" charges to preserve 1:1.** This is the trap that broke the original model.\n- **Do not duplicate trees per modifier.** Modifiers attach to charges, not to tree shapes.\n- **Do not bake mods into base charge emission.** Stage separation is what allows mods on derived charges to work.\n- **Do not allow mutable state in rating.** Every input must be effective-dated and queryable as-of-a-time.\n\n---\n\n## 13. Industry references\n\nWorth reading or skimming. None of these will give you a drop-in answer, but each demonstrates a piece of the architecture in production.\n\n- **TM Forum SID (Shared Information/Data Model).** Telco standard. Look at the Product / Service / Resource separation, and the Customer Bill Inquiry / Charge / Adjustment entity model. The decomposition is the value.\n- **Stripe Billing.** Subscriptions, Subscription Items, Invoice Items, Credit Notes. Especially the way an Invoice Item is just a queued line that lands on the next invoice — independent from the subscription that \"owns\" it. Charges-as-events made customer-friendly.\n- **Metronome** and **Orb**. Usage-event-based billing companies. Engineering blogs publicly describe how they decouple events from invoices. 
Closest in shape to your model.\n- **AWS Cost and Usage Report (CUR).** The schema has 100+ columns of attribution per line. Demonstrates how far flat-charges-with-rich-tags can be pushed.\n- **Killbill** (open source, Java). Reference billing engine. Don't adopt it, but the data model — particularly subscription bundles, catalogs with effective dates, and the credit/adjustment flows — is instructive.\n- **\"Patterns of Enterprise Application Architecture\"** (Fowler). The Effective Date and Temporal Object patterns formalize what you're already doing informally. Useful vocabulary.\n- **\"Enterprise Integration Patterns\"** (Hohpe, Woolf). The pipeline/stage architecture in section 6 is a Pipes-and-Filters pattern. Reference it when explaining the design.\n\n---\n\n## 14. The principles\n\nFive sentences that, taken together, describe the entire architecture:\n\n1. **Resource structure and billable charges are independent layers.** The tree is the substrate; charges are the output. Tree shape becomes attribution metadata on flat charges.\n2. **A Charge is the atomic unit of billing.** Mods are charges, surcharges are charges, taxes are charges, adjustments are charges. There is no other unit.\n3. **Time is a first-class dimension.** Every input is effective-dated. Rating slices the period at every boundary and treats each slice as a pure computation.\n4. **Rating is a multi-stage pipeline.** Top-down rate context cascades, bottom-up base charges roll up, mods layer onto base charges, derived charges reference base charges, and mods layer onto derived charges. Each stage is pure and the order is fixed.\n5. **Charges are immutable; presentation is a transformation.** Re-rating produces adjustments, never mutations. Presentation layers compute zero arithmetic that wasn't done by the rater.\n\nThe system that follows these five rules can absorb every product type and pricing rule the business invents, without engine changes. 
The system that violates any one of them will rot under its own complexity within six months. Every architectural decision, code review, and feature design should be checkable against these rules.\n\n> **Charges are events emitted by the rating function over (resource state × rules × mods × time). They live in a flat stream with rich attribution. Trees, mods, and presentations are different views over the same charge stream.**\n" } ``` ## User **Tool result:** ``` File created successfully at: /Users/peter/billing_engine_architecture.md ``` ## Assistant **Tool call: `Bash`** ```json { "command": "PB_GIST_AI_MODE=1 pb-gist create --title \"Billing engine architecture for tree-shaped products\" --description \"Design reference for evolving a 1:1 resource-to-line-item billing engine into a multi-stage rating pipeline that handles product trees, time-varying topology, dynamic and grandfathered modifiers, usage-driven rate decisions (e.g. Burstable Internet), and derived charges (surcharges, taxes). Covers the Charge model, rate-context cascade, derived-charge protocol, four-stage pipeline, idempotent re-rating, and a complete worked example.\" --visibility private --file billing_engine_architecture.md=/Users/peter/billing_engine_architecture.md", "description": "Push billing architecture writeup as private gist" } ``` ## User **Tool result:** ``` ✓ created https://gist.peterbarr.io/peter/ed14e92422d84128afaac5c880782677 ``` ## Assistant Up: https://gist.peterbarr.io/peter/ed14e92422d84128afaac5c880782677 ~14 sections covering the full arc — the 1:1 era and what broke, the three-layer reframe, the `Charge` model, time-slicing, mod scope predicates, the four-stage rating pipeline, rate context cascade, derived charges, a complete Burstable Internet trace through every stage, presentation-layer independence, idempotent re-rating, an implementation roadmap, industry references, and the five governing principles. ## User Is there anything off the shelf for this problem? 
## Assistant

Honest answer: **for the rating engine specifically, no. Nothing off-the-shelf models tree-shaped products with cascade context and derived charges the way you need.** But for the components *around* the rating engine, plenty exists and you should not build them.

### What exists, by category

**Carrier BSS (would technically fit, wildly oversized):**

- Oracle BRM (formerly Portal Infranet)
- Amdocs CES
- NetCracker BSS
- MATRIXX (5G-era real-time charging)
- Ericsson Charging
- CSG Singleview

These all handle hierarchical products, time-sliced rating, complex modifier algebras, grandfathering, and derived charges. They're built for tier-1 telcos. Six- to seven-figure license floors, six- to twelve-month implementations, professional-services-heavy. The data models are roughly what you're building. The reason you don't see them outside large carriers is they're priced and shaped for that market: you'd spend more integrating one than building yours, and you'd inherit a 20-year-old data model designed for monthly cellular billing.

**Modern usage-based / metering platforms (closest in philosophy, weakest on trees):**

- **Metronome**: events-in, charges-out. The charge-as-event model is exactly what you're building. Attribution via dimensions/tags. No native tree concept; you'd flatten and reconstruct in your own UI. Has good primitives for tiered pricing, commitments, and credits.
- **Orb**: similar shape. Good at usage-based and hybrid models. Same tree limitation.
- **m3ter**: UK-based, telco-leaning, sophisticated metering. Closer to your world than Metronome (it has more nuanced derived-charge support), but still not tree-native.
- **Lago** (open source): modern billing, usage events, plans. Subscription-leaning. Trees aren't a first-class concept.

These give you 60–70% of the rating engine. The missing 30–40% is exactly the tree-as-first-class part, which is your differentiator.
You'd end up bending the platform's attribution model to encode trees as tags, then doing tree-aware logic in your own service before sending events. At which point you've built half a rating engine anyway.

**Subscription billing (wrong shape entirely):**

- Stripe Billing, Recurly, Chargebee, Zuora.
- These think in subscription + plan + add-on. No usage trees, no rate context cascade, no first-class derived charges (Zuora has charge models but they're rigidly templated). Useful for the *invoicing and payment* side, useless for *rating*.

**Open source billing engines (instructive, not adoptable):**

- **Killbill** (Java): most mature OSS billing. Subscription bundles, catalog with effective dates, plugin architecture, payment routing. Worth reading the schema. Wrong shape for tree-rated network products.
- **BillingProject**, **AgileBill**, etc.: older, smaller, mostly subscription-focused.

### What you should buy, not build

These are commodity. Don't reinvent.

- **Tax calculation**: Avalara, Stripe Tax, TaxJar, Anrok. Tax law changes every quarter in every jurisdiction. Outsource it. Your `DerivedCharges` impl for `Tax` calls Avalara's API with `(jurisdiction, taxable_amount, sku_category)` and stores the response as the charge. Done.
- **Payment processing**: Stripe, Adyen. Obviously.
- **Invoice rendering / PDF**: Stripe Invoicing or any of dozens of libraries. Render from your charge stream.
- **Dunning / collections / retries**: Stripe Smart Retries, or one of the retention-focused tools (Churnkey, Stunning). Don't write retry state machines.
- **Currency math**: `ex_money` in Elixir. Don't write money arithmetic.
- **Time interval math**: there are libraries for Allen's interval algebra if your slicing logic gets hairy (it shouldn't, but if it does).

### What you have to build

The rating pipeline itself (slicing, rate context cascade, base charges per node type, mod layering, derived charges) is your business logic.
Specifically:

- The fact that products are trees is a domain choice no platform encodes.
- The fact that some pricing decisions are functions of usage profiles (Burstable) is custom enough that no platform supports it without escape hatches that defeat the platform.
- The grandfathering algorithm with multi-mod layering is sophisticated enough that you've already invested more in it than any platform offers.

### The realistic recommendation

You're already on the right path. Build the rating engine; it's your competitive moat anyway. Around it:

1. **Avalara (or equivalent) for tax.** Implement it as your first `DerivedCharges` impl. Prove the protocol on a non-business-critical case.
2. **Stripe for invoicing and payment.** Render invoices from your charge stream into Stripe Invoice Items, let Stripe handle delivery + payment + dunning.
3. **`ex_money`** if you aren't already using it. Don't write money math.
4. **Steal the data model from Killbill's catalog / subscription bundle code.** Specifically how they handle effective-dated catalog entries and bundle rules. You don't need to adopt the engine to learn from the schema.
5. **Watch what Metronome and m3ter ship.** Their public engineering content is the best ongoing source for "how do modern usage-billing systems handle X."

The "we'll just buy something" temptation gets stronger as you stare at the complexity. Resist it for rating. Embrace it for everything else. The rule of thumb: **buy the parts that are commodity (tax, payment, currency math, dunning), build the parts that encode your domain model (rating, catalog, lifecycle).**

If management pushes hard for "buy something for billing," the conversation to have is: "we can buy invoicing and payment today; for rating, the only options are tier-1 carrier BSS systems that cost more than we'd spend building, and SaaS metering platforms that don't model our products. The right buy decisions are tax + invoicing + payment, which we should make.
The rating engine is custom work either way."

## User

Do I need to do all time slicing up front?

## Assistant

No. The "compute all boundaries up front" framing was a teaching simplification. In practice you slice **per rating unit**, not globally across the whole tree, and most events touch a small subset of units.

### Per-resource slicing

The natural unit is a `(resource, ancestor lineage)` pair: each leaf node plus the chain of ancestors that contribute rate context. The boundaries that matter for *that* unit are only:

- The resource's own attribute change events
- The resource's rate-card transitions
- Mods scoped to the resource (matching the predicate)
- Ancestor decision changes that propagate down through `rate_context`
- Customer-state changes affecting the resource (tier change, etc.)

Most events affect a small number of resources. A single mod activation might create a boundary for 200 charges out of 65k. A topology change touches one subtree. There's no point unioning every event in the period for every resource: the global boundary set is the union of all per-resource boundary sets, but you don't need it materialized.

### What this looks like

```elixir
defp rate_resource(node, ancestor_ctx_timeline, period, ctx) do
  boundaries =
    [period.from, period.to]
    |> add_node_attribute_boundaries(node, period)
    |> add_rate_card_boundaries(node, period, ctx)
    |> add_mod_boundaries(node, period, ctx)
    |> add_ancestor_decision_boundaries(ancestor_ctx_timeline)
    |> Enum.uniq()
    # Assumes boundaries are DateTime structs. Plain Enum.sort/1 uses term
    # ordering on the struct fields, not chronological order.
    |> Enum.sort(DateTime)

  boundaries
  |> Enum.chunk_every(2, 1, :discard)
  |> Enum.flat_map(fn [t1, t2] ->
    slice = %Interval{from: t1, to: t2}
    effective_ctx = ancestor_ctx_at(ancestor_ctx_timeline, t1)
    Priceable.base_charges(node, effective_ctx, slice, ctx)
  end)
end
```

Each leaf rates itself with only the boundaries that touch it. A leaf with no events in the period has one slice (the whole period); a leaf with three mid-cycle changes has four slices. Cost is local to the leaf.
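To make the slice counts above concrete, here is a minimal, self-contained sketch of the boundary-to-slice step on plain `DateTime` values. `SliceSketch` and its argument names are hypothetical and stand in for the `add_*_boundaries` helpers, which are not shown in this conversation; it only assumes that boundaries outside the period should be ignored.

```elixir
defmodule SliceSketch do
  @moduledoc "Illustrative only: per-resource boundary slicing over DateTime values."

  # Returns [{from, to}, ...] covering the period, split at every event time
  # that falls strictly inside it. Events outside the period are dropped.
  def slices(period_from, period_to, event_times) do
    inside =
      Enum.filter(event_times, fn t ->
        DateTime.compare(t, period_from) == :gt and
          DateTime.compare(t, period_to) == :lt
      end)

    [period_from, period_to | inside]
    |> Enum.sort(DateTime)
    |> Enum.dedup()
    |> Enum.chunk_every(2, 1, :discard)
    |> Enum.map(fn [t1, t2] -> {t1, t2} end)
  end
end

from = ~U[2026-04-01 00:00:00Z]
to = ~U[2026-05-01 00:00:00Z]

# A leaf with no events: one slice covering the whole period.
quiet = SliceSketch.slices(from, to, [])
IO.inspect(length(quiet)) # => 1

# Three distinct mid-cycle changes (one reported twice): four slices.
changes = [
  ~U[2026-04-20 00:00:00Z],
  ~U[2026-04-05 00:00:00Z],
  ~U[2026-04-12 00:00:00Z],
  ~U[2026-04-12 00:00:00Z]
]

busy = SliceSketch.slices(from, to, changes)
IO.inspect(length(busy)) # => 4
```

Sorting before `Enum.dedup/1` keeps duplicate boundaries adjacent, so two events at the same instant never produce a zero-length slice.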
### Ancestor cascade

The cascade still works, but instead of a single context value it's a **timeline of context values**:

```elixir
%{
  effective_speed: [
    {~U[2026-04-01 00:00:00Z], 1_000},   # committed
    {~U[2026-04-12 00:00:00Z], 10_000}   # burst (decision changed mid-cycle)
  ],
  burstable_decision: [...]
}
```

The burstable parent computes its decision per its own boundaries (or per cycle, broadcast across the period), producing a timeline. Descendants receive that timeline and treat the decision-change timestamps as additional boundaries for their own slicing. A descendant unaffected by the burstable parent's lineage just gets a single-entry timeline and slices only on its own events.

This is computed lazily during the top-down walk: you don't precompute the whole tree's context timelines, just the lineage you're currently walking.

### Memoization opportunities

- **Sibling sharing.** All children of a burstable parent share the same ancestor context timeline. Compute it once per parent, hand it to each child.
- **Mod boundary sets per `(org, node_type, sku)` bucket.** If 500 ports in an org are scoped by the same mod, the mod's boundary contribution is the same for all of them; cache it.
- **Rate-card transition timestamps per SKU.** Global to the SKU, not per resource.

These caches turn re-rating a whole org from O(resources × events) into something closer to O(resources + events).

### Streaming variant

For very long periods or constrained memory, walk forward in time per resource:

```elixir
def stream_charges(node, ancestor_ctx_timeline, period, ctx) do
  Stream.resource(
    fn -> initial_state(node, ancestor_ctx_timeline, period) end,
    &emit_next_slice_charges/1,
    fn _ -> :ok end
  )
end
```

Emits charges as it crosses each boundary. Useful for backfills (re-rating a year for all customers), where holding all charges in memory is wasteful. You can pipe directly to a writer.
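The `Stream.resource/3` call above leaves `initial_state/3` and `emit_next_slice_charges/1` undefined. As a rough sketch of what the stepping function could look like, the hypothetical module below carries the remaining sorted boundaries as its state and emits one `{from, to}` slice per step. Integers stand in for timestamps to keep it short; a real version would rate each slice into charges instead of returning the pair.

```elixir
defmodule StreamSketch do
  @moduledoc "Illustrative only: lazy slice emission with Stream.resource/3."

  # State is the list of boundaries not yet crossed. Each step consumes one
  # boundary and emits the slice between it and the next one.
  def stream_slices(sorted_boundaries) do
    Stream.resource(
      fn -> sorted_boundaries end,
      fn
        [t1, t2 | rest] -> {[{t1, t2}], [t2 | rest]}
        _fewer_than_two -> {:halt, :done}
      end,
      fn _state -> :ok end
    )
  end
end

# Only the first slice is computed; the rest of the period is never visited.
StreamSketch.stream_slices([1, 5, 9, 30])
|> Enum.take(1)
|> IO.inspect() # => [{1, 5}]
```

The boundary list itself is still held in memory here; what the stream avoids is materializing every charge for the period before anything is written.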
### When eager global slicing is fine

For a single org, single billing month, ~65k nodes, eager global slicing is fine. The boundary set is bounded (a few thousand boundaries total across all events), and the computation is sub-second. The simplicity wins.

For backfills or all-tenants re-rating, switch to per-resource. For most online billing runs, per-resource is also the right shape because it parallelizes naturally: your per-org GenServer model maps cleanly onto "rate each org's resources independently, each org slices its own boundaries."

### When you must be careful

The only place per-resource slicing breaks down is **derived charges that span resources**. The Burstable surcharge depends on the subtree's totals, so it can't be computed until every resource in the subtree has been rated. That's fine: stage 4 runs after stage 3, and stage 3 has finished for the subtree by then. But it does mean the slicing for a derived charge is the *union* of slicing for all the source charges it references. You can't slice the surcharge in isolation; it inherits boundaries from everything it depends on.

In code, that's just: derived charges run on the post-mod charge list for the subtree, and the surcharge is sliced whenever any contributing charge has a slice change. Practically, sum-by-slice and emit one surcharge charge per slice. No special algorithm; just respect the dependency.

### The takeaway

**Slice per rating unit, not globally. Compute boundaries lazily per resource, share what's shareable across siblings/SKUs, and use the streaming variant when the period or resource set is large enough that materializing all charges is wasteful.** The eager-global approach in the writeup is the simplest model and right for a single billing run; per-resource is the production-scale variant.

## User

I'm like one person with some help from a coworker. Is this insane? I'm supposed to deliver in a month.

## Assistant

Yes, the full architecture in a month with 1.5 people is insane.
That document is the **destination**, not the v1.

But your real deliverable is probably "ship Burstable Internet to customers," not "rewrite the billing engine." Those are different scopes. If you scope the work as "extend the current system to handle this one new product," it's tight but doable. If you scope it as "rebuild the rating engine on the new architecture and migrate every existing product," it's a year of work for a small team.

### What you can ship in a month

Stay inside your existing model and accept the debt:

1. **Tree storage.** You said you have this already; keep it.
2. **Burst decision at billing time.** Add a single function: given a resource ID and a billing period, return the effective speed. One usage query, one threshold compare. Stamp the chosen SKU onto each child resource for the period.
3. **Surcharge as a Usage Record.** Attribute it to the parent. Reuse your existing pricing pipeline. This is uglier than the `DerivedCharges` protocol, but it works for one product.
4. **Tests for the burst decision and surcharge math.** This is the part that has to be right. Don't ship without it.
5. **Mid-cycle changes: scope down or punt.** If product can accept "burst decisions are per-cycle, mid-cycle topology changes don't trigger re-evaluation in v1," you avoid the membership-slicing work entirely. Ship that constraint as a known limitation.

That's a real one-month plan for one and a half engineers. You're not building protocols. You're not refactoring the mod engine. You're adding one product, end-to-end.

### What you should *not* try to do in this month

- Implement `Priceable` / `DerivedCharges` as protocols across all node types
- Migrate existing products to the new model
- Build the rate context cascade as general infrastructure
- Build new presentation transformations
- Generalize the four-stage pipeline

Each of those is a multi-week effort on its own.
Trying to do them in parallel with shipping is how a one-month deadline becomes three months and a half-finished refactor.

### The conversation to have with whoever set the deadline

> "I can ship Burstable Internet in a month using our existing engine, with two known constraints: (1) mid-cycle topology changes won't be supported in v1, and (2) we'll be carrying technical debt on this product specifically that we'll need a follow-up quarter to clean up. The cleanup is a refactor toward a model that handles future complex products natively. If we instead build the new model now, the timeline is 4–6 months and we don't ship Burstable in this quarter."

That's the honest framing. Most product owners pick option A (ship on the existing engine and accept the debt). Some pick option B (build the new model now). A few realize they actually need option C: ship a subset of Burstable in a month, defer the harder pieces. Either way, the conversation is yours to drive: they don't have the technical context to know what's hard.

### About the architecture doc

It's still worth having. Two reasons:

- **It anchors the cleanup work.** When the next product comes (and it will, because product owners taste blood after one complex product ships), you have a written destination instead of arguing from scratch about every decision.
- **It informs the v1 hacks.** Even when you're cutting corners for the deadline, knowing the right shape lets you cut them in places that are easy to reverse later. The surcharge-as-usage-record hack is fine if you know it'll become a `DerivedCharges` impl in six months. It's a disaster if you forget that and start building more code on top of it.

Use the doc as a guardrail, not a v1 spec. A note at the top ("this is the 12-month destination, not what we're shipping") keeps the team from panicking at the scope.

### Reality check from the cheap seats

A 1.5-person team shipping a product like Burstable in a month is normal startup velocity for a single product.
Writing the architecture doc *and* shipping the product *and* maintaining the existing system in that month is too much. Pick two of three:

- Architecture doc + shipping the product = no time to maintain existing system (acceptable for one month if your existing system is stable)
- Architecture doc + maintaining existing = no shipping (not acceptable)
- Shipping the product + maintaining existing = no architecture doc (you already wrote it)

You wrote the doc. The doc is done. Now spend the month shipping Burstable using the existing engine, with the doc as a north star for where you'll go next quarter.

That's not insane. That's a normal small-team plan.

## User

Give me a concrete plan for how to ram this thing into the existing arch.

## Assistant

Concrete one-month plan. Assumes ~1.5 engineers, existing 1:1 system stays intact, Burstable is bolted on.

## Constraints to negotiate with product *first*

Get these locked before week 1, in writing:

1. **Burstable subtrees are stable within a cycle.** Adds/removes take effect next cycle. Block in the order UI.
2. **Burst decision is per-cycle**, not per-slice. One decision per parent per billing period.
3. **Surcharge basis is fixed** (pick `:net` or `:list`, ship one). Don't make it configurable.
4. **Re-rating support is best-effort.** First customer dispute goes to manual review while you instrument.
5. **One burstable parent per Internet/Link.** No nested burstables in v1.

These five constraints turn a six-month problem into a one-month problem. Without them, the timeline is impossible. With them, the architecture below works.

## The bolt-on shape

Keep the existing model. Add three things:

1. **A `burstable_parent_id` foreign key** on resources that participate in a burstable.
2. **A `BurstDecision` table** populated per (parent, billing_cycle), with the chosen rate and audit data.
3. **A synthetic usage record path** that emits the surcharge as a normal billable line.

That's it.
No new pipeline, no new protocols, no new mod engine.

## Week 1 - Foundation: tree wiring + burst decision

### Schema changes

```sql
-- Resources that belong to a burstable subtree know their parent.
ALTER TABLE resources
  ADD COLUMN burstable_parent_id UUID REFERENCES resources(id);

CREATE INDEX ON resources (burstable_parent_id)
  WHERE burstable_parent_id IS NOT NULL;

-- One burst decision per parent per cycle.
CREATE TABLE burst_decisions (
  id UUID PRIMARY KEY,
  burstable_parent_id UUID NOT NULL REFERENCES resources(id),
  cycle_start TIMESTAMPTZ NOT NULL,
  cycle_end TIMESTAMPTZ NOT NULL,
  chose TEXT NOT NULL CHECK (chose IN ('committed', 'burst')),
  pct_above_committed NUMERIC NOT NULL,
  threshold NUMERIC NOT NULL,
  committed_speed BIGINT NOT NULL,
  burst_speed BIGINT NOT NULL,
  decided_at TIMESTAMPTZ NOT NULL,
  decided_from_usage_query JSONB NOT NULL, -- audit: what data drove this
  UNIQUE (burstable_parent_id, cycle_start, cycle_end)
);
```

The `decided_from_usage_query` JSONB is critical: it captures the inputs (sample window, query parameters, telemetry source) so a re-evaluation is reproducible and a dispute is answerable.
### Burst decision module

```elixir
defmodule Burst.Decision do
  @threshold 0.05  # 5%

  # Returns the stored decision for this parent and cycle, computing it once
  # if none exists yet.
  def for_cycle(parent, %{cycle_start: cs, cycle_end: ce} = cycle) do
    case Repo.get_by(BurstDecision,
           burstable_parent_id: parent.id,
           cycle_start: cs,
           cycle_end: ce
         ) do
      %BurstDecision{} = existing -> existing
      nil -> compute_and_store(parent, cycle)
    end
  end

  defp compute_and_store(parent, cycle) do
    {pct, query_audit} = Usage.fraction_above(parent.id, cycle, parent.committed_speed)
    chose = if pct > @threshold, do: :burst, else: :committed

    %BurstDecision{
      burstable_parent_id: parent.id,
      cycle_start: cycle.cycle_start,
      cycle_end: cycle.cycle_end,
      chose: chose,
      pct_above_committed: pct,
      threshold: @threshold,
      committed_speed: parent.committed_speed,
      burst_speed: parent.burst_speed,
      decided_at: DateTime.utc_now(),
      decided_from_usage_query: query_audit
    }
    |> Repo.insert!()
  end

  def effective_speed(%BurstDecision{chose: :burst, burst_speed: s}), do: s
  def effective_speed(%BurstDecision{chose: :committed, committed_speed: s}), do: s
end
```

### SKU lookup integration point

Find the place in your existing pipeline where a usage record's attributes get mapped to a SKU. Wedge in a single check:

```elixir
def resolve_sku(usage_record, cycle) do
  attrs =
    case usage_record.burstable_parent_id do
      nil ->
        usage_record.attrs

      parent_id ->
        parent = Resources.get!(parent_id)
        decision = Burst.Decision.for_cycle(parent, cycle)
        Map.put(usage_record.attrs, :speed, Burst.Decision.effective_speed(decision))
    end

  SKU.lookup(attrs)
end
```

This is the entire integration on the lookup side. Three lines in the conditional. Everything else in your pipeline stays unchanged.
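The decision module calls `Usage.fraction_above/3`, which depends on telemetry not shown in this conversation. As a rough sketch of the arithmetic it has to perform, the hypothetical module below assumes the telemetry adapter has already fetched a list of throughput samples taken at a fixed interval, so "fraction of the cycle above committed" reduces to "fraction of samples above committed". The return shape and the audit fields are illustrative, not the real adapter's contract.

```elixir
defmodule UsageSketch do
  @moduledoc "Illustrative only: fraction of fixed-interval samples above a committed speed."

  # samples: throughput readings (same unit as committed_speed), one per
  # fixed interval. With no samples there is nothing to decide on, so the
  # caller gets an explicit error instead of a silent 0.0.
  def fraction_above([], _committed_speed), do: {:error, :no_samples}

  def fraction_above(samples, committed_speed) do
    total = length(samples)
    above = Enum.count(samples, &(&1 > committed_speed))

    audit = %{
      sample_count: total,
      samples_above: above,
      committed_speed: committed_speed
    }

    {:ok, {above / total, audit}}
  end
end

# One of four samples exceeds the committed speed of 1_000.
{:ok, {pct, _audit}} = UsageSketch.fraction_above([400, 900, 1_000, 2_500], 1_000)
IO.inspect(pct) # => 0.25
```

Returning `{:error, :no_samples}` rather than `0.0` lets the caller distinguish "never burst" from "no telemetry", which matters if missing data should raise an alert instead of quietly billing at the committed rate.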
### Week 1 deliverables

- Migrations applied (dev + staging)
- `Burst.Decision` module with unit tests for the threshold logic
- `Usage.fraction_above/3` adapter to your telemetry system (might be the hardest part; confirm by the end of week 1 or escalate)
- SKU resolution wired through the burst check
- Integration test: a fake burstable resource with mocked usage data produces the right SKU

**Hidden risk**: usage telemetry availability. If `Usage.fraction_above/3` isn't trivially implementable from your existing time-series data, this turns into a 2-week project on its own. **Validate this on day 1.**

## Week 2 - Surcharge as a synthetic usage record

### The pattern

After child resources have been priced for a cycle, compute the burstable parent's surcharge and emit it as a usage record attributed to the parent. Your existing invoice generation picks it up like any other line.

### Schema

```sql
-- New SKU. Price is computed at emission time, not from a rate card.
INSERT INTO skus (code, description, price_resolution)
VALUES ('BURST_SURCHARGE', 'Burstable Internet surcharge', 'parametric');
```

The `price_resolution: 'parametric'` is a flag that tells your pricing pipeline "the price is on the usage record itself, don't look up a rate." If you don't have this concept, add it. It's a small generalization that unlocks a lot of one-off charges later.
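One way the `parametric` flag could be honored inside a pricing pipeline is a single dispatch on the SKU's `price_resolution`. The sketch below uses plain maps and integer cents; `PriceResolutionSketch`, the `explicit_price_cents` field, and the `rate_card` map are hypothetical stand-ins, since the existing pipeline's types are not shown in this conversation.

```elixir
defmodule PriceResolutionSketch do
  @moduledoc "Illustrative only: parametric SKUs read the price off the record."

  # Parametric SKU: the price travels on the usage record itself.
  def resolve_price(%{price_resolution: :parametric}, %{explicit_price_cents: cents}, _rate_card)
      when is_integer(cents),
      do: {:ok, cents}

  # Parametric SKU without a price on the record is a data error, not a
  # reason to fall back to a rate card.
  def resolve_price(%{price_resolution: :parametric}, _record, _rate_card),
    do: {:error, :missing_explicit_price}

  # Everything else goes through the normal rate-card lookup.
  def resolve_price(%{code: code}, _record, rate_card) do
    case Map.fetch(rate_card, code) do
      {:ok, cents} -> {:ok, cents}
      :error -> {:error, :no_rate}
    end
  end
end

rate_card = %{"INTERNET_10G_MONTHLY" => 8_000}

surcharge_sku = %{code: "BURST_SURCHARGE", price_resolution: :parametric}
internet_sku = %{code: "INTERNET_10G_MONTHLY", price_resolution: :rate_card}

IO.inspect(PriceResolutionSketch.resolve_price(surcharge_sku, %{explicit_price_cents: 1_001}, rate_card))
# => {:ok, 1001}
IO.inspect(PriceResolutionSketch.resolve_price(internet_sku, %{}, rate_card))
# => {:ok, 8000}
```

Failing loudly on a parametric SKU with no price keeps a missing surcharge amount from being silently billed at whatever a rate card happens to contain.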
### Surcharge emitter

```elixir
defmodule Burst.Surcharge do
  @rate 0.10  # 10%; make this a SKU attribute or config later

  def emit_for_cycle(parent, cycle) do
    subtree_total = Pricing.subtree_total_for_cycle(parent.id, cycle)

    # `Money.multiply/2` is the `money` package's function name; with
    # `ex_money` the equivalent call is `Money.mult!/2`.
    surcharge = Money.multiply(subtree_total, @rate)

    %UsageRecord{
      resource_id: parent.id,
      sku_override: "BURST_SURCHARGE",
      cycle_start: cycle.cycle_start,
      cycle_end: cycle.cycle_end,
      explicit_price: surcharge,
      attrs: %{
        derivation: %{
          basis: subtree_total,
          rate: @rate,
          source_resource_ids: Pricing.subtree_resource_ids(parent.id)
        }
      },
      synthetic: true
    }
    |> Repo.insert!()
  end
end
```

The `synthetic: true` flag is important: it tells downstream code (audit, customer-facing detail) that this didn't come from a real lifecycle event. It also lets you idempotently regenerate it on re-rating without conflicting with audit-log-derived records.

### Where to call it

Hook it into the per-cycle billing run, after children have been processed:

```elixir
def run_billing_cycle(org_id, cycle) do
  # ... existing per-resource processing ...

  # New: emit surcharges for any burstable parents in this org.
  Resources.burstable_parents_for_org(org_id)
  |> Enum.each(&Burst.Surcharge.emit_for_cycle(&1, cycle))

  # ... existing invoice generation continues, picks up the synthetic records ...
end
```

### `Pricing.subtree_total_for_cycle/2`

This is the new query you need. Get the post-mod total of all charges attributable to descendants of a given parent in a given cycle. If your existing pricing produces an intermediate "rated usage records for the cycle" table, it's a sum-where query. If not, run the existing pricing function to get amounts and sum. Cache the result if you call it more than once per parent per cycle.
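As a minimal sketch of the sum-where shape and the surcharge arithmetic, the hypothetical module below works on an in-memory list of already-rated records and uses integer cents with a rate in basis points, which sidesteps float rounding. The record fields (`burstable_parent_id`, `cycle_start`, `net_cents`) and the half-up rounding rule are assumptions for illustration; the real query would run against whatever table the existing pricing produces.

```elixir
defmodule SurchargeSketch do
  @moduledoc "Illustrative only: subtree total and surcharge in integer cents."

  # Sums post-mod amounts for children of one burstable parent in one cycle.
  # Synthetic surcharge records carry no burstable_parent_id, so they never
  # feed back into their own basis.
  def subtree_total_cents(rated, parent_id, cycle_start) do
    rated
    |> Enum.filter(&(&1.burstable_parent_id == parent_id and &1.cycle_start == cycle_start))
    |> Enum.map(& &1.net_cents)
    |> Enum.sum()
  end

  # rate_bps is in basis points (1_000 bps = 10%). Rounds half up to a cent.
  def surcharge_cents(total_cents, rate_bps) when total_cents >= 0 do
    div(total_cents * rate_bps + 5_000, 10_000)
  end
end

cycle = ~D[2026-04-01]

rated = [
  %{burstable_parent_id: "p1", cycle_start: cycle, net_cents: 8_000},
  %{burstable_parent_id: "p1", cycle_start: cycle, net_cents: 2_005},
  %{burstable_parent_id: "p2", cycle_start: cycle, net_cents: 9_999},
  %{burstable_parent_id: nil, cycle_start: cycle, net_cents: 500}
]

total = SurchargeSketch.subtree_total_cents(rated, "p1", cycle)
IO.inspect(total) # => 10005
IO.inspect(SurchargeSketch.surcharge_cents(total, 1_000)) # => 1001
```

Here 10% of 10,005 cents is 1,000.5 cents, which rounds up to 1,001. Whatever rounding rule is chosen, it should be written into the stored derivation so the invoice line can be reproduced later.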
### Week 2 deliverables

- `BURST_SURCHARGE` SKU + parametric pricing path
- `Burst.Surcharge.emit_for_cycle/2` with tests
- `Pricing.subtree_total_for_cycle/2` query
- Integration test: full burstable subtree with children, surcharge appears on invoice with correct math
- Re-emission idempotency: running the cycle twice produces one surcharge, not two (delete-and-reinsert or upsert by `(parent_id, cycle, synthetic: true)`)

## Week 3 - Mods, audit, edge cases

### Mod scoping

Existing mods that scope by SKU pattern (e.g., `~r/^INTERNET_/`) will already match the chosen-speed SKU correctly because resolution happens before mod application. Verify with tests for both branches:

- Customer with reseller discount + burstable subtree, burst decision is `:committed`: discount applies to committed-speed Internet SKU.
- Same customer, burst decision is `:burst`: discount applies to burst-speed Internet SKU.
- Same customer, mod that explicitly excludes burstable SKUs (you might need this for product reasons): scope predicate handles it.

### Mods on the surcharge

Decide which existing mod types should apply to the surcharge:

- Reseller discounts → probably yes (it's a service charge)
- Promo codes → probably no (don't accidentally make burst free)
- Admin overrides → yes
- Tax → yes (handle via your existing tax flow if you have one)

For each, either confirm the existing mod's scope predicate excludes/includes `BURST_SURCHARGE` correctly, or update the predicate. Document the decisions in the runbook.

### Audit / explainability

The thing customers will dispute is "why am I being billed at burst rate?" You need an answer in the invoice or in support tooling:

```elixir
def explain_charge(usage_record) do
  case usage_record.burstable_parent_id do
    nil ->
      nil

    parent_id ->
      decision = Burst.Decision.for_cycle(...)

      """
      Rated at #{decision.chose} speed. Usage was above committed
      (#{decision.committed_speed}) for
      #{Float.round(decision.pct_above_committed * 100, 2)}% of the cycle.
      Threshold for burst billing: #{decision.threshold * 100}%.
      """
  end
end
```

Wire this into your invoice detail view. For the surcharge:

```elixir
def explain_surcharge(usage_record) do
  audit = usage_record.attrs.derivation

  """
  Burstable Internet surcharge: #{audit.rate * 100}% of #{Money.to_string(audit.basis)}.
  Applied to #{length(audit.source_resource_ids)} resources in the burstable subtree.
  """
end
```

### Edge cases that must be handled

- **No usage data for the cycle.** Default to `:committed`, log a warning, alert ops. Don't bill burst on missing data; that's a guaranteed dispute.
- **Empty subtree.** Surcharge is $0. Emit it anyway with the audit, or skip emission entirely. Pick one and document.
- **Burstable parent created mid-cycle.** Prorate the burst decision window? Easier: only consider the portion of the cycle the parent existed; decision applies only to that portion. Document.
- **Burstable parent deleted mid-cycle.** Same: decision and surcharge apply only to the lived portion.
- **Mod applied/expired mid-cycle on a child.** Existing mod engine handles this within the resource's slice. Burst decision is unaffected.
- **Re-running billing on the same cycle.** Burst decision is read from the table, not recomputed (deterministic). Surcharge is regenerated idempotently.

### Week 3 deliverables

- All edge cases tested
- Mod scope verified for each existing mod type
- Audit / explain functions integrated into invoice detail view
- Operational alerts for missing-usage-data case
- Runbook for support: how to answer "why am I being billed at burst"

## Week 4 - Hardening, monitoring, ship

### Feature flag rollout

```elixir
if FeatureFlag.enabled?(:burstable_internet, org_id) do
  # new code path
end
```

Roll out:

1. Internal test org first.
2. One pilot customer who knows what they're testing.
3. Watch the first invoice end-to-end with that customer.
4. Expand.
### Monitoring

- Count of burst decisions made per cycle, broken down by `chose: :burst` vs `:committed`.
- Distribution of `pct_above_committed`. If this clusters near 0% or near 100%, your threshold or telemetry is suspect.
- Surcharge total per cycle, per org. Big swings indicate something changed.
- Missing usage data alerts (page if any burstable parent has no data at billing time).

### Re-rating safety net

Even if full re-rating is "best effort," you need *one* manual recovery path:

- Operator can mark a `BurstDecision` as superseded, providing a corrected one with new audit data.
- Operator can regenerate the surcharge for a cycle.
- Both produce an audit log entry pointing at the original.

This is 50 lines of code and a one-page runbook. Without it, the first dispute becomes an emergency.

### Final pre-ship checklist

- [ ] Pilot customer's first invoice manually reconciled by hand
- [ ] All audit fields populated and human-readable
- [ ] Re-running the cycle produces identical results
- [ ] Feature flag works in both states (enabled / disabled)
- [ ] Docs for support: how to answer customer questions about burst billing
- [ ] Monitoring dashboards live
- [ ] Manual decision-override tooling tested
- [ ] One real internal end-to-end test: place an order for a burstable, simulate usage, generate invoice, verify

## Defer list (explicit, written down)

Hand this to product so it's clear what isn't in v1:

- Mid-cycle topology changes for burstable subtrees (effective next cycle only)
- Sub-cycle burst decisions (always per-cycle in v1)
- Configurable surcharge calculation basis per customer (one global setting)
- Nested burstable products (one level only)
- Self-service modification of burstable rules
- Automated re-rating on usage corrections (manual workflow only)
- Tree-grouped invoice rendering for burstable products (flat invoice only; surcharge appears as its own line)

Each of these is a real follow-up.
The architecture doc is the destination; this list is what gets triaged into next quarter.

## Total scope check

Code: maybe 800–1500 lines including tests. The integration points are small because you're hooking into existing pipelines, not replacing them.

Effort distribution roughly:

- 25% telemetry integration (depends entirely on what exists)
- 20% surcharge math + idempotency
- 20% audit / explain plumbing
- 20% edge cases + mod scope verification
- 15% rollout, monitoring, runbook

A 1.5-person team can absolutely ship this in four weeks **if** week 1 confirms telemetry availability and product agrees to the five constraints up front. If either of those falls through, the timeline slips and that's the conversation to have immediately, not in week 3.

Most of the risk is non-technical: product saying "no, we have to support mid-cycle" or telemetry saying "we don't keep that granularity." Validate both in the first 48 hours.

## User

Man it would be nice to integrate with Lago

## Assistant

Lago is actually a strong candidate for the role you should be buying (**invoicing, payment, tax, customer portal, dunning**) while you keep building the rating engine. The integration boundary is cleaner than people expect because Lago supports a "dynamic" charge model where you push the computed amount and Lago doesn't try to price for you.

## Where Lago fits

The right integration shape:

```
Your rating engine ──→ Lago (events with dynamic prices)
                         │
                         ├─→ Invoice generation
                         ├─→ Tax (native rates or Avalara)
                         ├─→ Payment (Stripe/Adyen/etc.)
                         ├─→ Customer portal
                         └─→ Dunning + retry logic
```

You compute every `Charge` (base, mods, derived).
For each, you POST a Lago event:

```json
{
  "transaction_id": "ch_internet_42",
  "external_subscription_id": "sub_org_xyz",
  "code": "INTERNET_10G_MONTHLY",
  "timestamp": "2026-04-01T00:00:00Z",
  "properties": {
    "amount_cents": 8000,
    "tree_id": "link/123",
    "node_path": "z_side.internet",
    "burstable_decision": "burst"
  }
}
```

The event code maps to a Lago billable metric whose charge uses the `dynamic` charge model, meaning Lago uses the amount you provided rather than computing one. Field names in the payload are illustrative; confirm the exact amount field against Lago's event API. Lago groups events by subscription + billing period, builds the invoice, runs tax, sends to payment.

## What you stop building

- Invoice rendering and PDFs
- Payment provider integration and retries
- Dunning state machines
- Customer billing portal
- Tax calculation (Lago has Avalara integration; native rates for simpler jurisdictions)
- Webhook handling for payment events

That's a *lot*. Probably 6+ months of work you don't do.

## What you keep building

- The rating engine (every architectural piece in the doc)
- Your mod layering algorithm (Lago's coupons can't express what yours does)
- Tree-aware audit and explain views (Lago's portal won't show tree structure)
- Re-rating workflow (Lago can issue credits, but the *decision* of what credit to issue is yours)

## Where the model gets awkward

- **Subscriptions as the central concept.** Lago organizes everything around a Subscription. Your model doesn't really have subscriptions in the SaaS sense; customers have an inventory of resources. You'd map each customer to one or a few "subscriptions" that act as billing buckets, then attach all your dynamic-priced events to them. Workable but feels a little forced.
- **Coupons / discounts.** Don't use Lago's coupon system for your mods. Your mods are too sophisticated. Push mod outcomes as separate events with negative amounts (which is how the architecture doc has them anyway). Lago will sum them on the invoice correctly.
- **Re-rating.** Lago invoices are immutable once finalized.
To correct, you issue a credit note via Lago's API. This matches the "never mutate, always emit adjustments" principle but you have to wire your re-rate flow to call Lago's credit-note endpoint.
- **Tree-aware presentation.** Lago's customer portal shows a flat invoice. If you want customers to see charges grouped by tree position, you build that view yourself (read from Lago, render your way) or accept that the portal is flat and your support tooling is where the tree view lives.
- **Dynamic charges have weaker built-in analytics.** Lago's dashboards assume known unit prices. With dynamic, you provide amounts directly, so per-unit metrics aren't meaningful. Live with it.

## Self-host vs cloud

Lago is open source (AGPL for the core, MIT for some components; check current licensing). You can self-host on Postgres + Redis + Sidekiq workers. That's real work, but not crazy.

The **cloud** offering is significantly easier and the team behind it ships fast. For your scale (a few thousand orgs, tens of thousands of resources), cloud pricing is probably reasonable.

If you have any compliance posture that requires data residency or you've been burned by SaaS billing vendors before (Recurly, Chargebee acquisitions, etc.), self-host. Otherwise cloud.

## Honest take on timing

**Don't integrate Lago this month.** It's a parallel track to shipping Burstable, and adding Lago integration to the deadline doubles the scope. Ship Burstable into your existing invoice generator, then pick up Lago as a Q2 or Q3 project.

The order I'd recommend:

1. **Now (this month):** Ship Burstable. Existing invoice generation. Existing payment flow.
2. **Next quarter:** Pilot Lago for new customers only. Existing customers stay on the current invoice flow until you've validated the integration on real money.
3. **Quarter after:** Migrate existing customers in cohorts.
The rating engine refactor (toward the destination architecture in the doc) happens in parallel: both the new invoice system and the new rating engine consume the same `Charge` data model, so they evolve together.
4. **Decommission the old invoice generator.** Probably 2–3 quarters out.

This sequencing gives you the architectural cleanup (Charges as first-class) as a *forcing function*: the Lago integration is a clean motivation to make Charges the canonical unit, because Lago events are basically Charges-shaped already. You get the destination architecture as a side effect of the integration work, instead of as a standalone refactor that has to justify itself.

## The diligence to do before committing

If you want to seriously consider it, spend a day on:

- **Read their API docs**, especially events, subscriptions, dynamic charges, credit notes, and the addon flow. Write down which of your Charge fields don't have a clean home.
- **Spin up cloud Lago for free** and push a few synthetic events with dynamic pricing. See if the invoice that comes out is shaped like what you want.
- **Check their changelog and GitHub issues** for the last 6 months. How fast do they ship? Do customers complain about breaking changes? How active is the community?
- **Talk to someone using it in production at scale.** Their public Slack or Discord usually has people willing to share. You want to know about the failure modes, not the happy path.

A day of investigation will tell you more than weeks of speculation. If the answers are positive, build the integration spike (push 100 events end-to-end and verify the invoice) before committing to a migration timeline.

The short version: **yes, integrate Lago, but not this month.** It's a real win for the parts that aren't your differentiator, with a clean enough boundary that the rating engine stays yours.

## User

So we do reporting. Invoicing, payments, tax, etc... is all upstream of us. Not our department.
We generate reports that are basically invoices and that's the end. The rest isn't our problem. So I couldn't get much?

## Assistant

Right, that changes the picture significantly. **In your scope (rating + report generation, with everything downstream handled by other teams), Lago offers very little.** The parts where it shines (invoicing, payment, tax, dunning, customer portal) are explicitly not yours.

What you actually do is the **most custom part of the billing stack.** Off-the-shelf tools live downstream of you. The reporting/rating layer is where business logic concentrates and where every company ends up writing custom code, because it's the part that encodes "what does our pricing actually mean."

## What Lago would give you in your scope

- **Event storage**: not really useful, since you have your own data model and audit log already.
- **Aggregation primitives**: their aggregation models (sum, count, max, unique) don't capture what you do (rate context cascade, derived charges, time-sliced mod layering). You'd push dynamic-priced events that bypass aggregation entirely, which means you're using Lago as an expensive event database.
- **Subscription/billing period concept**: you'd have to map your model into theirs, which is friction without payoff if you don't use the downstream features.
- **Their OSS code as reference**: useful, but you can read it without integrating.

The leverage just isn't there.

## What's actually useful in your scope

Forget platforms. Look for **components**:

- **Time-series storage for usage telemetry** (TimescaleDB, ClickHouse, InfluxDB): if your usage data lives in raw Postgres tables today, this is a real upgrade as data grows. Burstable's `fraction_above_committed` query gets cheap.
- **Money math library**: `ex_money` (Elixir), `Money` (Ruby), `decimal.js`. Don't write your own.
- **Rules engine or predicate DSL library** for mod scoping.
`Drools`-style libraries exist in most languages; in Elixir you'd probably build it yourself with macros (small effort, big readability win).
- **Workflow / job orchestration**: Oban (Elixir) for billing run orchestration, retries, scheduling. If you're not already using something like this, the manual cron + lock approach to monthly runs gets painful.
- **Audit log infrastructure**: you already have this, since your pipeline is built on processing audit entries.
- **Schema migration discipline**: Ecto handles this; just note that as `Charge` becomes first-class, schema evolution gets more important.

None of these are billing-specific. They're general-purpose components that happen to be load-bearing for your work.

## The reference systems are still worth reading

Even if you can't *integrate* with anything, the reference value of these is real:

- **Kill Bill**'s catalog and subscription bundle code (Java, OSS): how they handle effective-dated catalogs and price plans. Read for the schema.
- **Metronome / Orb engineering blogs**: how they model usage events and aggregation pipelines. Closer to your shape than Lago.
- **AWS CUR schema**: the gold standard for "flat charges with rich attribution."
- **Stripe Billing's invoice item model**: cleanest shape for "queued line that lands on next invoice." Even if you're upstream of invoicing, the data model is instructive.

You're not going to find a platform that does what you do. Your team's job is to be the canonical place where pricing logic lives, and that's inherently bespoke.
The right play is to **build the rating engine well, expose its output as a clean data contract to downstream systems, and use general-purpose components for the non-domain pieces.**

Honestly: if I were in your role, I'd stop looking for off-the-shelf rating tools, accept that this is in-house work, and focus the "what should I buy" energy on developer velocity tooling: better observability into rating runs, better testing infrastructure, better tooling for product/finance to see how a hypothetical change would affect bills before shipping. Those investments compound and let your small team punch above its weight, which is the actual leverage problem you have.

## User

I was just looking at Metronome. Really similar but no tree concept. We've gotta be closest to Amazon: nested products, product trees. That's the main difference I'm seeing.

## Assistant

You're right that you're closer to AWS than to Metronome, but worth pushing back gently: **AWS doesn't actually do tree-aware *rating* either.** They do flat rating with rich attribution, and the tree exists in presentation/grouping. Knowing this might let off-the-shelf tools fit better than you think, or at least let you be more precise about which parts are actually unique to you.

## What AWS billing actually does

The tree exists in product configuration and customer-facing structure, not in rating. Look at how AWS bills:

- An EC2 instance bills as: instance-hours + EBS volume storage + network-out + snapshot storage. **Each line is independently emitted** by the responsible service. There's no "EC2 instance" parent line that contains its children.
- An EKS cluster bill is the sum of cluster control-plane hours + every child resource's lines. The cluster itself emits its own line; nothing in billing knows that the EC2 instances "belong to" the cluster except via tags.
- VPCs contain subnets contain ENIs in CloudFormation/Config, but in billing they're flat lines with `resource_id` and tag columns, with no parent pointer.
- Reserved Instances and Savings Plans (the closest analog to your modifiers) are *overlays*: a separate matching pass that assigns discount hours to running resources after the fact. Each instance still emits its own line; the RI emits its own credit line.

The CUR (Cost and Usage Report) has 100+ columns of attribution but **zero parent-child structure**. You reconstruct hierarchy in QuickSight or wherever else by joining on `resource_id` and tags.

## What this means for you

Your "tree" matters for four things:

1. **Customer-facing structure**: what they ordered, how it's connected
2. **Lifecycle orchestration**: your audit log → state model
3. **The burst decision**: parent has the rule
4. **The surcharge**: parent rolls up children

For rating itself, only #3 and #4 actually need tree-awareness. #3 can be implemented as a **pre-rating** pass (decision computed once and stamped onto children's billing context) and #4 as a **post-rating** pass (derived charge walks parent's children list).

In other words: the rating engine could be flat per-resource, with two small tree-aware passes around it. That's basically the AWS model.

The deeply nested case you described (BurstableInternet → Internet → NetworkSpace → ...) flattens out because:

- Internet, Link, NetworkSpace each rate themselves from their own properties + an injected `effective_speed`.
- Burstable parent is a separate flat charge for the surcharge, computed from the sum of its children's already-flat charges.
- Mods scope to flat charges by predicate (already in your design).

The tree exists in product config and audit / explain views. It doesn't have to exist in the rating loop.

## Why this matters for tool fit

If you reframe rating as flat-with-pre-and-post-passes, **Metronome (and m3ter, and Orb) get closer to fitting**.
The integration shape becomes:

```
Your rate context resolver walks the tree
  → stamps effective_speed on each resource
  → emits one event per resource
Metronome does aggregation + invoice
Your derived-charge pass emits surcharge events
```

You're using their event ingestion + aggregation + invoice pipeline; you're keeping your tree logic upstream. The tree never enters the platform.

But (and this brings you back to the same conclusion) **in your scope (rating + reporting, not invoicing), you don't need their invoice pipeline.** So you're using them for event storage and aggregation, which you can do natively. The tool fit improves, but the value-add for *your team* still doesn't.

## Where you actually are unique

If you list what you're building that nobody else has packaged:

- **Mod layering with grandfathered + dynamic semantics across multiple types**: Metronome has discounts and credits, but your layering algebra is more sophisticated (multiple priority buckets, commutative vs non-commutative, floors/ceilings, stop conditions).
- **Effective-dated rate context cascade**: you derive context from a tree at billing time. Nobody else models this; it's specific to network products with rule-holding parents.
- **Tree-aware audit and explain**: "this charge came from this node which inherited burst from this parent which decided from this usage profile." This level of attribution exists in some BSS systems but not in modern usage-billing platforms.

These three are your differentiation. Everything else (event ingestion, aggregation primitives, basic discount math, invoice rendering, payment) is commodity, but mostly in the layers downstream of you.

## Honest reframe

You're not unique because you have trees. You're unique because:

1. **Your modifier algebra is non-trivial** in ways no platform models.
2. **Your products embed pricing rules** (burstable parent's burst-decision rule), not just price points.
3.
   **Your audit requirements** demand explainable derivations across hierarchies.

The tree is a symptom of #2 and #3, not the root distinction. Reframing this might help with the internal sell: "we're not building a billing platform, we're building a rating engine that handles rules-bearing products" sounds less ambitious than "we're building tree-aware billing" but it's more accurate and easier to defend in a roadmap conversation.

It also unlocks the right tool conversation: **for the rating engine itself, in-house. For event storage and time-series queries, look at ClickHouse or Timescale. For downstream invoicing/payment/tax, that's another team's problem and they should look at Lago/Metronome/Stripe.** Three different scopes, three different decisions.

Closer to AWS than Metronome is right. Closer to AWS than to *anything* commercial is also right. AWS Billing is built by hundreds of engineers and is famously gnarly. You're a 1.5-person team trying to deliver a smaller-scope version of the same problem, which is hard but not insane, as long as you keep the scope honest about which parts are domain (rating logic) versus commodity (everything else).

## User

Why do I have the tree? For rating rules that rely on the relationships. Otherwise, it's all just flat stuff. How do I reconcile that?

## Assistant

Right reconciliation: **the tree is a queryable input to rating, not the structure of the rating engine.** The engine stays flat. Specific rules consult the tree when they need to. Most rules don't.

This is the same shape as an org chart in HR systems. The org chart is a real, important data structure. Payroll runs flat per-employee. Approval workflows query the org chart. Most other things ignore it. Nobody says "payroll is hierarchical" because payroll *output* isn't: it's flat per-employee with a `manager_id` field for any rule that cares.
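
To make the analogy concrete, here is a minimal Python sketch. Every name in it (`Employee`, `run_payroll`, `approval_chain`) is hypothetical and exists only for illustration: payroll iterates employees flat, while the approval rule walks `manager_id` links only because it needs the hierarchy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Employee:
    id: str
    annual_salary_cents: int
    manager_id: Optional[str]  # the only trace of the org chart on the flat record


def run_payroll(employees: list[Employee]) -> dict[str, int]:
    """Flat pass: one monthly amount per employee, no hierarchy consulted."""
    return {e.id: e.annual_salary_cents // 12 for e in employees}


def approval_chain(employees: list[Employee], employee_id: str) -> list[str]:
    """Tree-aware rule: walk manager_id links upward to the root.

    Assumes the org chart is acyclic and every manager_id resolves.
    """
    by_id = {e.id: e for e in employees}
    chain = []
    current = by_id[employee_id].manager_id
    while current is not None:
        chain.append(current)
        current = by_id[current].manager_id
    return chain
```

The point of the sketch is the asymmetry: `run_payroll` never reads `manager_id`, and `approval_chain` never computes pay. The hierarchy is data that one rule queries, not the shape of the payroll loop.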

Apply that lens to your system:

```
Tree                     ← domain object, lives in your data model
  ├─ Used by orchestration (lifecycle, validation, UI)
  ├─ Used by some rating rules (rate context, derived charges)
  └─ Used by presentation (group charges by node path)

Rating engine            ← flat per-resource
  ├─ Resource gets a "billing context" stamped on it pre-rating
  ├─ Resource rates itself from its own attrs + context
  └─ Output: flat Charge stream

A few special rules      ← tree-aware bookends
  ├─ Pre-rating: walk rule-bearing subtrees, stamp context on descendants
  └─ Post-rating: derived charges per rule-bearing parent (surcharges, rollups)
```

The vast majority of your rating logic doesn't touch the tree. Internet rates from `effective_speed`. NetworkSpace rates from `size`. Link rates from `speed`. None of those care about the tree; they care about their own attributes plus whatever context was stamped on them.

The tree-aware code is a thin shell around a flat core:

```elixir
def rate_org(org, period) do
  # Pre-pass: resolve context for rule-bearing subtrees.
  contexts = resolve_rate_contexts(org, period)
  # ^ walks BurstableInternet parents, computes decisions, returns
  #   %{resource_id => effective_context} for affected descendants

  # Flat rating: every resource rates itself.
  base_charges =
    org
    |> Resources.billable_for_period(period)
    |> Enum.flat_map(fn r ->
      ctx = Map.get(contexts, r.id, %{})
      Priceable.base_charges(r, ctx, period)
    end)

  # Mod layering: flat per charge.
  mod_charges = apply_mods(base_charges, org, period)

  # Post-pass: derived charges per rule-bearing parent.
  derived = emit_derived_charges(org, base_charges ++ mod_charges, period)

  # Mods on derived. Same engine.
  derived_mods = apply_mods(derived, org, period)

  base_charges ++ mod_charges ++ derived ++ derived_mods
end
```

That's the whole rating engine. The tree-aware passes (`resolve_rate_contexts` and `emit_derived_charges`) are isolated.
`Priceable.base_charges` doesn't get a `parent_ctx` parameter; it gets a flat `ctx` map that was already resolved.

This is a meaningful simplification over the protocol-cascade version in the architecture doc. The doc described it as the engine walking the tree top-down with context propagation. You can flatten that into a pre-pass that just produces a `resource_id → context` map. Same result, much simpler to implement and test, and most node types don't need to know the cascade exists.

## How this reconciles with what you said

> "I have the tree for rating rules that rely on relationships. Otherwise, it's all just flat stuff."

Exactly. The tree exists because *some* rules need to express relationships:

- "This child's effective speed depends on a parent's decision" (Burstable)
- "This parent's surcharge depends on its descendants' totals" (Burstable surcharge)
- "Discounts on this subtree are reseller-specific" (mod with `tree_path_prefix` scope)
- "This bundle's total replaces the sum of its children" (future bundle pricing)

For each of these, a small tree-aware pass implements the rule. The pass stamps something onto resources (a context, a flag) or emits something attributable to a parent (a derived charge). Everything else stays flat.

The tree is **the addressable substrate the rules query.** It's not the substrate that contains the charges.

## Implication for your roadmap

This reframing makes the v1 even more tractable:

- The "tree-aware engine" you were imagining is actually one preprocessing function (`resolve_rate_contexts`) and one postprocessing function (`emit_derived_charges`) wrapped around your existing flat pipeline.
- Each function is ~50 lines for one product type (Burstable). Adding new tree-rule-bearing products adds a new clause to each.
- Existing flat products don't change. Their `base_charges` impl is unchanged. They just receive an empty context map and emit as they always did.

You're not redesigning the engine.
You're adding two small tree-aware passes that 95% of your code can ignore.

## The principle restated

> **The tree is data. Rating is flat. A small number of rules read the tree to set up or follow up the flat rating pass. Output is always a flat charge stream.**

This is also what AWS does, what Metronome would do if it grew tree-aware features, and what every billing system that scales eventually converges on. The systems that try to make rating *itself* hierarchical (some old telco BSS designs) end up with engines that are hard to reason about and slow. The systems that keep rating flat and treat hierarchy as queryable input scale.

You have the tree because your *product domain* requires it. Your *rating engine* doesn't have to be tree-shaped to serve a tree-shaped product.

## User

So for a flat billing item, its context is the tree. Can be above or below or just from above?

## Assistant

Good question. The cleanest way to think about it:

**Context flows down (from above) at rating time. Information flows up (from below) at derived-charge time. No single charge needs both directions.**

This is a strong property. For any individual charge:

- **A child's charge** consults context that came *from above* (parent's burst decision, ancestor's bundle membership, location's tax jurisdiction). It never queries below itself.
- **A parent's derived charge** (surcharge, rollup) consults information *from below* (descendants' already-computed charges). It never queries above itself.
- **A flat resource's charge** with no relational rules consults nothing tree-shaped, just its own attrs.

These are *different charges* in different stages of the pipeline. The same charge never needs to look both up and down.

## The two-direction split, stage-by-stage

```
[Pre-rating: top-down]     Walk parents → stamp context on descendants
        │
        ▼
[Rating: flat]             Each resource rates itself from attrs + stamped context
        │                  (No tree queries here. None.)
        ▼
[Post-rating: bottom-up]   Walk parents → emit derived charges from descendant totals
```

Each stage has a single direction. Stage 1 reads down, writes context. Stage 2 reads only itself. Stage 3 reads down (a parent reading its children's outputs) and writes new charges attributed to the parent.

The reason this works: anything that *feels* like bidirectional context at rating time can be decomposed into a pre-pass (above) and a post-pass (below). The middle stays flat.

## What about lateral references? (A/Z sides on Links)

A Link's A-side and Z-side aren't ancestors; they're cross-references to other resources. Treat them the same as ancestor context: **resolved into the resource's billing context during the pre-pass.**

```elixir
def resolve_billing_context(link, _period, _ctx) do
  %{
    a_side_type: Resources.type(link.a_side_id),
    z_side_type: Resources.type(link.z_side_id),
    z_side_location: Resources.location(link.z_side_id)
    # ...whatever the rating needs to know about A and Z
  }
end
```

By the time `Link.base_charges(link, billing_context, period)` runs, the context map already contains everything the rule needs. Rating itself doesn't traverse references; the resolver did that work upstream. Same shape as ancestor context.

This collapses three different "tree shapes" (parent-child, A/Z references, and external lookups such as usage telemetry or customer tier) into one uniform mechanism: **the pre-pass produces a `resource_id → context` map, and rating reads from it.**

## Edge cases worth knowing

A few patterns that *seem* to need bidirectional flow but actually don't:

- **"Discount the parent based on number of children"** → post-pass. Parent's derived-charge impl counts children.
- **"Discount each child if there are 5+ children of a parent"** → pre-pass. Parent's rate-context resolver counts its children once and stamps a discount flag on each child's context.
- **"Bundle: if all 3 components are present, charge bundle price; else charge individually"** → pre-pass.
Bundle parent checks completeness, stamps `in_bundle: true` (or `in_bundle: false`) on each child. Children's `base_charges` checks the flag; complete-bundle children emit nothing, parent emits the bundle price as a flat charge.
- **"Cross-tree volume tier"** (total ports across all customer's services determines per-port rate) → pre-pass at the *customer* level. Customer's resolver counts all ports, stamps the tier on every port's context. This is not a tree relationship; it's a customer-level aggregate.

In every case, the pattern is: **resolve once, stamp, then rate flat.** The expensive walk happens at most twice per cycle (once down, once up), not per-charge.

## When this breaks

Rare cases where the model strains:

- **Mutually recursive pricing**: "A's price depends on B's price; B's price depends on A's price." If this ever shows up in a real product, you have a fixed-point problem and need to solve iteratively. In practice, telco/network products almost never do this; there's always a directional dependency you can name. Treat it as a product-design red flag if it comes up.
- **Truly dynamic ad-hoc cross-charge dependencies**: "this charge's rate depends on this completely unrelated other charge's outcome." Usually means the model has the wrong abstractions; revisit.

For everything realistic in your domain, **single-direction context per charge** holds. It's the property that makes the pipeline tractable to implement, parallelize, and reason about.

## Compact restatement

> Each resource is rated flat from `(its attrs, a stamped context)`. The context comes from a pre-pass that walked above and sideways. Derived charges emerge from a post-pass that walks below. No charge ever needs to peek both up and down at the same time.

## User

How can I get a complete copy of this transcript?

## Assistant

A few options, in rough order of effort:

**Easiest: Claude Code's own transcript files.** Claude Code stores every conversation as JSONL on disk.
Look in `~/.claude/projects/-Users-peter/` (or similar; the directory mirrors your working directory path). Each session is a `.jsonl` file. The most recent one is this conversation. You can read it directly:

```bash
ls -lt ~/.claude/projects/-Users-peter/*.jsonl | head
```

The format is one JSON object per line, with full message content including tool calls and results. Raw but complete.

**Medium effort: convert the JSONL to readable markdown.** A small script that reads the JSONL and emits a markdown transcript with role headers, code blocks for tool calls, etc. I can write it for you if you want; give me 5 minutes. Output goes to a file, you can gist it from there.

**In-session escape hatch.** Use `/export` if your version of Claude Code supports it (newer feature, exports the current conversation). Run it and check the output path.

**Manual scrollback.** Your terminal scrollback has it as rendered text. Less complete (no raw tool-call args, no thinking) but human-readable. `Cmd+A` in iTerm/Terminal, copy, and paste somewhere.

Want me to write the JSONL → markdown script and run it on this session? That's probably the cleanest result.

## User

Oh my ls is eza aliased

## Assistant

Right, `command ls -lt ...` or `\ls -lt ...` bypasses the alias. Or skip `ls` entirely; let me just locate the file directly.
## Assistant **Tool call: `Bash`** ```json { "command": "command ls -lt ~/.claude/projects/-Users-peter/*.jsonl 2>/dev/null | head -5", "description": "Find recent transcript JSONL files" } ``` ## User **Tool result:** ``` -rw-------@ 1 peter staff 845976 May 1 10:08 /Users/peter/.claude/projects/-Users-peter/407f5b26-b24a-4350-8b16-308c14fb3360.jsonl -rw-------@ 1 peter staff 67610 Apr 30 22:28 /Users/peter/.claude/projects/-Users-peter/a3953e32-6b2d-430e-939b-c197696fd0c5.jsonl -rw-------@ 1 peter staff 51599 Apr 30 21:11 /Users/peter/.claude/projects/-Users-peter/cbd1bf9c-6c1d-4b66-9e2d-ceb58620fba1.jsonl -rw-------@ 1 peter staff 5474637 Apr 30 20:25 /Users/peter/.claude/projects/-Users-peter/309da393-3a65-4583-a3f1-cb4fe1659dc0.jsonl -rw-------@ 1 peter staff 1181701 Apr 30 09:58 /Users/peter/.claude/projects/-Users-peter/36d8d174-dc1f-4fe4-aa06-84ad82a04277.jsonl ``` ## Assistant Top one (`407f5b26...`, ~846 KB, May 1 10:08) is this session. Want me to write a JSONL → markdown converter and run it? I'd produce a clean transcript with role headers, code blocks for tool calls/results, and skip the noise (system reminders, internal-only fields). Then you can read it locally or gist it. If yes, any preferences: - Include tool calls and results, or just the user/assistant text? - Include thinking blocks if any are present? - Output path — `~/transcript.md` fine? ## User Input and output is fine. Please convert the whole thing to markdown. Then save it somewhere and tell me how big it is. 
Then we'll probably upload to pb-gist ## Assistant **Tool call: `Bash`** ```json { "command": "head -3 /Users/peter/.claude/projects/-Users-peter/407f5b26-b24a-4350-8b16-308c14fb3360.jsonl | python3 -c \"import sys, json; [print(json.dumps(json.loads(line), indent=2)[:800]) for line in sys.stdin]\"", "description": "Peek at JSONL structure" } ``` ## User **Tool result:** ``` { "type": "permission-mode", "permissionMode": "default", "sessionId": "407f5b26-b24a-4350-8b16-308c14fb3360" } { "parentUuid": null, "isSidechain": false, "attachment": { "type": "hook_success", "hookName": "SessionStart:startup", "toolUseID": "ee450a79-790c-43e2-8f66-1444b075756c", "hookEvent": "SessionStart", "content": "", "stdout": "{\n \"hookSpecificOutput\": {\n \"hookEventName\": \"SessionStart\",\n \"additionalContext\": \"\\nYou have superpowers.\\n\\n**Below is the full content of your 'superpowers:using-superpowers' skill - your introduction to using skills. For all other skills, use the 'Skill' tool:**\\n\\n---\\nname: using-superpowers\\ndescription: Use when starting any conversation - establishes how to find and use skills, requiring Skill tool invocation before ANY response including clarifying questions\\n---\\n\\n\nYou have superpowers.\n\n**Below is the full content of your 'superpowers:using-superpowers' skill - your introduction to using skills. 
For all other skills, use the 'Skill' tool:**\n\n---\nname: using-superpowers\ndescription: Use when starting any conversation - establishes how to find and use skills, requiring Skill tool invocation before ANY response including clarifying questions\n---\n\n\nIf you were dispatched as a subagent to execute a specific task, skip this skill.\n\n\n\nIf you think there is even a 1% chance a skill might apply to what you are ``` ## Assistant **Tool call: `Bash`** ```json { "command": "python3 -c \"\nimport json\nseen_types = {}\nwith open('/Users/peter/.claude/projects/-Users-peter/407f5b26-b24a-4350-8b16-308c14fb3360.jsonl') as f:\n for line in f:\n obj = json.loads(line)\n t = obj.get('type', 'no-type')\n seen_types[t] = seen_types.get(t, 0) + 1\n if 'message' in obj:\n role = obj['message'].get('role', '?')\n seen_types[f'{t}/{role}'] = seen_types.get(f'{t}/{role}', 0) + 1\nfor k, v in sorted(seen_types.items()):\n print(f'{v:5d} {k}')\n\"", "description": "Catalog event types in JSONL" } ``` ## User **Tool result:** ``` 127 assistant 127 assistant/assistant 16 attachment 50 file-history-snapshot 19 last-prompt 20 permission-mode 27 system 82 user 82 user/user ``` ## Assistant **Tool call: `Write`** ```json { "file_path": "/Users/peter/jsonl_to_md.py", "content": "#!/usr/bin/env python3\n\"\"\"Convert a Claude Code session JSONL to a readable markdown transcript.\"\"\"\nimport json\nimport sys\nfrom pathlib import Path\n\n\ndef render_content(content):\n \"\"\"Render a message's content (str or list of blocks) as markdown.\"\"\"\n if isinstance(content, str):\n return content.strip()\n\n parts = []\n for block in content:\n btype = block.get(\"type\")\n if btype == \"text\":\n parts.append(block.get(\"text\", \"\").strip())\n elif btype == \"thinking\":\n thinking = block.get(\"thinking\", \"\").strip()\n if thinking:\n parts.append(f\"
thinking\\n\\n{thinking}\\n\\n
\")\n elif btype == \"tool_use\":\n name = block.get(\"name\", \"?\")\n inp = block.get(\"input\", {})\n inp_json = json.dumps(inp, indent=2, ensure_ascii=False)\n parts.append(f\"**Tool call: `{name}`**\\n\\n```json\\n{inp_json}\\n```\")\n elif btype == \"tool_result\":\n tc = block.get(\"content\", \"\")\n if isinstance(tc, list):\n tc = \"\\n\".join(\n b.get(\"text\", \"\") if isinstance(b, dict) else str(b)\n for b in tc\n )\n tc = str(tc).strip()\n if len(tc) > 4000:\n tc = tc[:4000] + f\"\\n\\n... [truncated, {len(tc)} chars total] ...\"\n parts.append(f\"**Tool result:**\\n\\n```\\n{tc}\\n```\")\n elif btype == \"image\":\n parts.append(\"*[image omitted]*\")\n else:\n parts.append(f\"*[unhandled block type: {btype}]*\")\n return \"\\n\\n\".join(p for p in parts if p)\n\n\ndef main(jsonl_path, out_path):\n out_lines = []\n out_lines.append(f\"# Claude Code session transcript\\n\")\n out_lines.append(f\"Source: `{jsonl_path}`\\n\\n---\\n\")\n\n with open(jsonl_path) as f:\n for line in f:\n obj = json.loads(line)\n t = obj.get(\"type\")\n if t not in (\"user\", \"assistant\"):\n continue\n msg = obj.get(\"message\", {})\n role = msg.get(\"role\", t)\n content = msg.get(\"content\", \"\")\n rendered = render_content(content)\n if not rendered:\n continue\n heading = \"## User\" if role == \"user\" else \"## Assistant\"\n out_lines.append(f\"\\n{heading}\\n\\n{rendered}\\n\")\n\n Path(out_path).write_text(\"\\n\".join(out_lines))\n size = Path(out_path).stat().st_size\n print(f\"wrote {out_path} ({size} bytes, {size/1024:.1f} KB)\")\n\n\nif __name__ == \"__main__\":\n main(sys.argv[1], sys.argv[2])\n" } ``` ## User **Tool result:** ``` File created successfully at: /Users/peter/jsonl_to_md.py ```