From System of Record to System of Context & Work
A thesis on how agent-first software replaces the incumbent SoR stack (NetSuite, Salesforce, Zendesk, Workday, Epic). Develops the five-primitive flywheel architecture — outcome, trajectory, eval, substrate, aggregation — that explains what compounds and why. Eight SoR segments mapped, four meta-archetypes derived, per-company decomposition of Campfire, Rillet, Sierra, Decagon, Clay, Mercor, EliseAI and others.
Part 1 — The Argument
1.1 The Inversion
For forty years, enterprise software was built on one architectural assumption: the human is the interface. Schemas, forms, role-based dashboards, per-seat pricing, configured rules — all of it derives from the assumption that knowledge work is humans typing structured intent into rows so other humans can query those rows later. The system of record (SoR) was, fundamentally, a database with a UI thick enough to keep the data clean.
Agent-as-interface breaks this assumption. When a model — not a person — sits between the work and the database, almost every architectural primitive of the previous era is wrongly shaped. Schemas are too rigid. Forms are unnecessary friction. Reports are on-demand reasoning. Customization is markdown, not code. Pricing is per-outcome, not per-seat. Most importantly: the job of the SoR changes. It stops being the place where truth is stored and starts being the substrate that feeds a learning loop.
This is the actual architectural inversion. Surface-level reframings — “AI-native CRM,” “AI-native ERP,” “agent-first support” — describe the marketing, not the architecture. The deeper change is that the entire data shape pivots from “validated state at point-of-entry” to “trajectory + outcome flowing through a loop.” Companies that get this right become systems of context and work. Companies that don’t ship agents on top of the same legacy substrate.
1.2 The Flywheel Architecture
| Role in the loop | Primitive | Summary |
|---|---|---|
| The substrate of learning | Outcome-Grounded Signal | Verifiable success arriving from the world without human annotation. Trial balance reconciles. Ticket resolved. Claim paid. Lease signed. Without it, no gradient. |
| The corpus of learning | Trajectory Persistence | Every step preserved as data — retrievals, tool calls, intermediate outputs, rejected paths. Schema-first SoRs persisted committed state only; agent-first systems persist the cognitive trajectory. |
| The gradient of learning | Attribution + Eval | Replays, regressions, isolated subagent traces. Sierra's "stress tests with trick questions" before launch. Cursor's SWE-bench Pro. Without it, you have data but no causal direction. |
| The parameter space | Editable Policy Substrate | Markdown skills, prompt templates, evals, routing tables, memory graphs, fine-tunes. Multi-versioned, expert-authorable. Decagon's AOPs compile natural language into agent behavior. |
| The multiplier | Cross-Customer Aggregation | Multi-tenant on the learning layer even when isolated on the facts layer. Campfire's LAM trains across customers; customer 100 inherits patterns from 1 through 99. Without it, fork economics. |
The learning loop is the actual structural unlock. Five primitives must interlock for it to spin; remove any one and the loop stalls — you have a chatbot, not a learning system.
(1) Outcome-Grounded Signal. Every agent run terminates in a verifiable outcome that arrives back from the world without human annotation. The trial balance reconciles. The ticket is resolved without escalation. The claim is paid. The lease is signed. The signal is binary or graded but mechanical, not human-judged. This is what the traditional SoR fundamentally lacked. SoRs persisted states (the row got created), not outcomes (the work that row represented succeeded against an external criterion). Domains where (1) is dense — code, accounting, customer service, leasing — adopt agents fastest. Domains where (1) is sparse or slow — strategy, creative work, multi-year legal matters — adopt slowest.
(2) Trajectory Persistence. Every step of every run preserved as data: what was retrieved, which tool was called, which model gave which intermediate output, what was rejected, what alternative was considered. The trajectory IS the training corpus. Schema-first SoRs persisted final committed state only; they had no representation of how that state was produced because the producer was a human and the cognitive trajectory lived in the human’s head. (2) only matters because of (1): without an outcome to attribute the trajectory to, the trajectory is exhaust.
(3) Attribution + Eval Architecture. Structured machinery for answering “what caused this outcome.” Regression evals on every change. Replay against new prompts and models. Isolated subagent traces. Sierra’s stress-tests with trick questions before live launch; Cursor’s SWE-bench Pro tied to Tab-model retrains; Decagon’s Trace View as an in-product debugger. Without (3) you have data but no gradient — you can’t tell which parts of the trajectory caused success or failure, so you can’t update the substrate intelligently.
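A toy illustration of the regression-gate idea: every candidate substrate change is replayed against a frozen eval suite before it ships, so a fix for one case cannot silently break another. The cases, policies, and the `agent` stand-in are all invented for the sketch — this is not any vendor's harness:

```python
# Frozen eval suite: (input, expected action) pairs harvested from past runs.
EVAL_SUITE = [
    ("refund within 30 days", "approve"),
    ("refund after 90 days", "escalate"),
    ("duplicate charge", "approve"),
]

def agent(query: str, policy: dict[str, str]) -> str:
    # Stand-in for a real agent run: route by the first matching policy key.
    for key, action in policy.items():
        if key in query:
            return action
    return "escalate"

def regression_gate(candidate_policy: dict[str, str]) -> tuple[bool, list[str]]:
    """Replay the frozen suite against a candidate substrate edit;
    any regression blocks the change from shipping."""
    failures = [q for q, want in EVAL_SUITE if agent(q, candidate_policy) != want]
    return (not failures, failures)

ok, failed = regression_gate({"refund within": "approve", "duplicate": "approve"})
# ok is True: all three frozen cases still pass, the edit ships.
bad_ok, bad_failed = regression_gate({"refund": "approve"})
# bad_ok is False: the blanket rule regresses two of the three frozen cases.
```

Without the gate, the substrate edit in the second call would have shipped anyway — which is the "data but no gradient" failure mode the paragraph describes.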
(4) Editable Policy Substrate. The writable surface that the loop updates. Markdown skills, prompt templates, eval suites, routing tables, memory graphs, fine-tunes. Multi-versioned, expert-authorable, model-agnostic. SoR customization was rules-as-code (Salesforce custom objects, NetSuite SuiteScript, Workday Studio) — every customer became a divergent fork and learning never transferred back. Substrate-based customization is convergent: the loop writes back into a shared structure that all customers benefit from. The substrate is also where domain experts now author logic directly, collapsing the talent-scarcity moat.
(5) Cross-Customer Aggregation Surface. The substrate is multi-tenant on the learning layer even when data is isolated on the facts layer. Campfire’s LAM trains across customers; customer 100 inherits the categorization patterns and reconciliation heuristics that customers 1–99 produced. Sierra’s routing optimization learns across deployments. Clay’s recipe library propagates. Without (5), each customer is a fresh start, and (4) collapses back to fork-style customization (i.e. SoR economics).
The mechanic. An agent run executes; it produces an action in the world. The world returns an outcome (1). The full trajectory is captured (2). Attribution (3) maps the outcome backward through the trajectory to identify what caused success or failure. The substrate (4) updates with the lesson. Aggregated across customers (5), the lesson flows to all future runs. The next outcome arrives, and it’s more likely to succeed.
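The cycle above can be sketched end to end. Everything here is a deliberately toy stand-in — a dict as the shared substrate, a fixed ground truth as "the world", a hardcoded repair rule as attribution — not any vendor's implementation:

```python
SUBSTRATE: dict[str, str] = {}   # (4) the shared, editable policy substrate;
                                 # (5) one copy serves every tenant's future runs

def run_agent(case: str) -> tuple[str, list[str]]:
    """Execute a run and return the action taken plus its trajectory (2)."""
    action = SUBSTRATE.get(case, "default_action")
    trajectory = [f"lookup:{case}", f"act:{action}"]
    return action, trajectory

def world_outcome(case: str, action: str) -> bool:
    """(1) Verifiable signal from the world — modeled as a fixed ground truth."""
    truth = {"invoice_mismatch": "escalate", "bank_match": "auto_reconcile"}
    return action == truth.get(case, "default_action")

def attribute_and_update(case: str, trajectory: list[str], success: bool) -> None:
    """(3) Map the outcome back through the trajectory and (4) write the lesson.
    Toy attribution: a failed run pins the blame on its lookup step."""
    if not success:
        SUBSTRATE[case] = "escalate"   # invented repair rule for the sketch

# Tenant A's first run on a novel case fails...
action_a, traj = run_agent("invoice_mismatch")
attribute_and_update("invoice_mismatch", traj, world_outcome("invoice_mismatch", action_a))

# ...so tenant B's first-ever run on the same case now succeeds (5).
action_b, _ = run_agent("invoice_mismatch")
```

The design point the sketch isolates: the lesson lands in `SUBSTRATE`, not in tenant A's private fork — which is the difference between compounding and SoR-style customization.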
The speed of one cycle of this loop is the durability metric. Cursor retrains its Tab model every 90 minutes from 400M daily accept/reject signals — the densest known cycle in production. Sierra’s constellation routing adapts in near-real-time. Campfire’s LAM ingests every customer transaction continuously. Cycle time is the moat. Companies whose architecture lets them turn the crank faster compound faster.
1.3 Why Incumbents Cannot Retrofit
| Primitive | Incumbent SoR | AI-native SoR |
|---|---|---|
| (1) Outcome-Grounded Signal | Partial — only at terminal states | Dense — per-transaction, per-conversation |
| (2) Trajectory Persistence | Near-zero — schema persists state, not reasoning | By default — runs are first-class |
| (3) Attribution + Eval | Near-zero — A/B on workflows, not reasoning | Replays + regressions + supervisor agents |
| (4) Editable Policy Substrate | As forks — every customer is divergent | Convergent — markdown skills + AOPs |
| (5) Cross-Customer Aggregation | Aggregate analytics only | Multi-tenant on learning layer |
Three structural reasons.
First, architectural absence of (1)–(3). Incumbents have at most partial outcome signal capture; near-zero trajectory persistence; near-zero attribution/eval architecture. They can ship agents on top of legacy schemas (Salesforce Agentforce, NetSuite AI Close, Workday Illuminate, Service Cloud Einstein) but cannot rebuild the substrate without a multi-year data architecture rewrite that no incumbent has yet committed to publicly.
Second, commercial misalignment with (4) and (5). Per-seat pricing penalizes substrate compounding — every successful agent deflection is a cancelled seat license. Salesforce cannot ship a substrate-aggregating agent that genuinely compresses headcount because doing so breaks its own revenue model. ServiceNow’s $2.85B acquisition of Moveworks in December 2025 was the way out: pay an external party to build the harness, then bolt it on without disturbing the seat economics in the short term.
Third, organizational dependence on the customization economy. Salesforce has $30B+/year of SI partner revenue (Accenture, Deloitte, Capgemini, Slalom) that is structurally threatened by markdown-as-substrate. The same is true of NetSuite (BDO, Armanino, RSM) and Workday (Mercer, Deloitte, KPMG). The political coalition inside the incumbent that resists the substrate move is bigger than the engineering team that would have to build it.
Combined effect: incumbents move faster than expected in shipping agent surfaces (Agentforce, Joule, Illuminate, Now Assist) and slower than required in rebuilding the substrate. The gap widens, not closes, between 2026 and 2028.
1.4 The Org-Level Consequence
The line worker disappears. The “Salesforce admin,” the “NetSuite implementer,” the “Zendesk agent,” the “billing analyst,” the “claims processor” — roles that existed because someone had to sit inside the SoR moving rows around — collapse into two new shapes:
- The agent operator configures, monitors, and improves the agent: writes and edits skills, designs evals, tunes routing, manages model upgrades. Sierra’s Workspaces productize this role explicitly — parallel-development branches, reviewable changes, controlled releases. The Campfire customer running accounting at $250M ARR with a single controller is operating the agent, not doing the books.
- The exception handler solves cases the agent escalates. Each handled exception becomes training data; the gate widens; the role shrinks in volume but remains for the highest-judgment edge cases.
Tribal knowledge — “what to do when X happens” that used to live in a senior person’s head, a Slack thread, a half-written runbook — is forced into one of three structured forms: a markdown skill, an eval, or a curated trajectory. Knowledge accrues to the company, not the individual — but only at companies whose architecture supports the artifact form.
The buying center shifts from CIO IT spend to COO/CFO labor or operations budgets. The labor budget is 35× the size of the software budget ($11T vs $315B per NEA / FRED data); the buyer has a faster decision cycle and a higher tolerance for outcome-based contracts. This is why AI-native P&Ls can sustain at 50–70% gross margins (Redpoint 2026) — they’re priced into a much larger pool against a much higher willingness-to-pay.
Part 2 — The Field
Five segments: finance/ERP, CRM, customer experience (CX), ITSM, and the regulated verticals (healthcare, legal, HRIS, real estate / construction / field service). For each of the first four: the incumbent landscape and exposure, one exemplar in depth, a comparison table for the rest, and a one-paragraph teaching at the end.
Finance / ERP
Incumbents. SAP, Oracle, Microsoft Dynamics 365 F&O at the enterprise level; NetSuite, Sage Intacct, Workday Financials at mid-market; QuickBooks and Xero at SMB. Adjacent stacks: BlackLine and FloQast (close), Bill.com / Tipalti / AvidXchange (AP/AR), Anaplan / Pigment (FP&A), Coupa / Ariba (procurement). Per Redpoint’s 2026 CIO survey, 50% of CIOs are open to replacing their ERP with an AI-centric vendor — tied with procurement and behind only CRM (83%) and CX (56%).
Why finance is the densest signal surface. Three properties make ERP the most architecturally suitable SoR for the flywheel: outputs are numerically verifiable against GAAP (the trial balance reconciles or it doesn’t); the workflows (categorize → match → reconcile → close) repeat millions of times per customer per year, generating dense trajectory data; outcome attribution is mechanical — every journal entry is right or wrong against external bank/vendor truth. (1), (2), (3) are dense by default. (4) and (5) are the engineering problem.
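The "mechanical attribution" claim is literal: in accounting the outcome check is double-entry arithmetic, which is why the signal needs no human annotation. A minimal, illustrative sketch:

```python
from decimal import Decimal

def trial_balance_reconciles(lines: list[tuple[str, Decimal]]) -> bool:
    """Primitive (1) for accounting: signed journal lines must sum to zero.
    The signal is mechanical — the ledger ties out or it doesn't."""
    return sum(amount for _, amount in lines) == Decimal("0")

entries = [
    ("cash",            Decimal("-120.00")),  # credit
    ("office_supplies", Decimal("120.00")),   # debit
]
# trial_balance_reconciles(entries) is True; add one unmatched cent and it is False.
```

Every posted transaction yields one such binary bit, millions of times per customer per year — the density that makes the segment the canonical proof-of-thesis.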
Campfire — system of action, not record
| Metric | Value | Context |
|---|---|---|
| Funding (12 weeks, Series A + B) | $100M | Accel · Ribbit · Foundation |
| NetSuite migrations | 100+ | JPMorgan, Jan 2026 |
| LAM accuracy | 95%+ | Reconciliation, variance analysis |
| Flex close cycle | 10d → 3d | QuickBooks → Campfire |
| Flex headcount needs | −67% | 60K txns auto-mapped/mo |
| Fooji close cycle | 15d+ → 3d | NetSuite → Campfire |
John Glasgow founded Campfire in 2023 to “upend 1990s-era ERP like NetSuite” with an LLM-powered alternative. $100M raised in 12 weeks across Series A and B (Foundation + Ribbit + Accel), $375M post-money. 10× revenue increase YTD. 100+ companies migrated from NetSuite/QuickBooks. CareRev calls it “Ramp for accounting.”
The architecture maps cleanly to the five primitives. (1) Outcome: every reconciliation either ties to bank/vendor truth or doesn’t; every journal entry passes the trial-balance check or doesn’t — dense per-transaction signal. (2) Trajectory: Campfire persists every reconciliation attempt, every category assignment, every allocation decision with the LLM’s reasoning — not just the final journal entry. This is what makes the LAM (Large Accounting Model) trainable. (3) Attribution: LAM is benchmarked at 95%+ accuracy on reconciliation and variance analysis. (4) Substrate: LAM weights themselves, plus Ember — the conversational AI assistant where finance teams query data and automate workflows in natural language. (5) Aggregation: LAM trains across all 100+ customers; a new customer’s books inherit categorization patterns and reconciliation heuristics learned from prior customers’ books.
What Campfire does that NetSuite couldn’t. NetSuite persists committed state — GL accounts, journals, customer/vendor masters. It has no schema for “the alternative allocation the agent considered and rejected because of a vendor pattern.” It has no LAM because it has no trajectory store to train one on. NetSuite’s customization is SuiteScript — fork-per-customer, unaggregable. Its pricing is per-seat plus implementation, which actively discourages reducing the number of accountants required.
Customer evidence (JPMorgan, January 2026). Flex (~80 employees, fintech): close cycle 10 days → 3 days, 60K transactions auto-mapped per month, 67% lower headcount needs. Fooji (experiential marketing): close 15+ days → 3 days, eliminated NetSuite consulting spend, finance “shifted from book-keeping to strategic business partnership.” CFO testimonial: “Switching to Campfire from NetSuite has been an absolute game-changer. As a CFO running a multi-currency, multi-legal entity operation, I’ve dealt with clunky, frustrating accounting software before. Campfire? A totally different story.”
The rest of the segment
| Company | Funding · Valuation | Traction | What they uniquely do |
|---|---|---|---|
| Rillet | $25M Series A + $70M Series B; $500M val | 200+ customers, ARR doubled in 12 weeks | “Built by accountants” — SaaS-vertical-specialized GL with 99.7% auto-bookings. Allovue migrated from Sage Intacct in 1 week; Windsurf hit $100M ARR with a 2-person finance team on Rillet. |
| DualEntry | $90M Series A; $415M val | $100B+ journal entries processed; 13,000+ integrations | NextDay Migration (24-hour cutover vs typical 6-month NetSuite migration). Slash neobank runs $100M ARR ops with 1 controller. The migration engine itself is the flywheel output. |
| Digits | $97.5M total; $565M val | 93% auto-book accuracy | Autonomous General Ledger trained on $825B+ transaction data. Wispr: financial-question latency 3 hr → 10 min. |
| Light | $30M Series A | 30× growth in 12 mo | EU/global multi-jurisdiction GL — cross-jurisdiction tax/reporting normalization is its specific aggregation moat. |
| Doss | $55M Series B | inventory-on-ERP middleware | Explicitly not a GL replacement — bets on (5) at the inventory-to-GL reconciliation layer. |
| Numeric | $51M Series B | “hundreds” of customers | Close / reconciliation Workflow Wedge displacing BlackLine. Expanding to “compound startup” — cash management next. |
| Basis | $100M Series B; $1.15B val | 30% of top-25 accounting firms | AI agents for accounting firms (tax/audit/advisory). Different buyer, firm-level substrate aggregation. |
What this segment teaches. Finance is the canonical proof-of-thesis segment. The five primitives are dense by default; the engineering problem is shipping (4) and (5) faster than incumbents can retrofit (1)–(3). Campfire / Rillet / DualEntry are building the GL Replacer archetype simultaneously and not yet competing for the same customers. By 2027 we expect 1–2 GL Replacers above $100M ARR; by 2028 the question is whether NetSuite acquires one or rebuilds.
CRM / Go-to-Market
Incumbents. Salesforce Sales/Service Cloud, HubSpot, Microsoft Dynamics 365 CE, Zoho, Pipedrive. Sales engagement: Outreach, Salesloft (merged with Clari Dec 2025, ~$450M combined ARR), Apollo, Gong. 83% CIO replacement openness — the highest of any category.
Why CRM is the most exposed SoR. All of Salesforce’s lock-in was UI/process, not architecture or data. Three Salesforce moats were UI-based: admin labor encoded as custom objects/validation rules/page layouts; rep muscle memory across pipeline views and dashboards; the SI implementation economy. When the agent IS the interface, all three vaporize. The data is portable (the customer’s, not Salesforce’s), there’s no regulatory lock-in (CRM is unregulated), and the transactions don’t pass through Salesforce. The three-question moat test (proprietary data / regulatory / transaction embedding) yields zero structural moats for Salesforce.
Clay — system-of-action for GTM
| Metric | Value | Context |
|---|---|---|
| ARR (CEO to NYT) | $100M | Tripling YoY 2025 |
| Series C / valuation | $3.1B | CapitalG, Jun 2025 |
| Lifetime AI agent tasks | 1.5B | Across customers |
| OpenAI enrichment coverage | 2× | Coverage uplift |
| Vanta enrichment coverage | fragmented → 80%+ | 1,000+ contacts/mo |
Kareem Amin (ex-WSJ VP Product) and Nicolae Rusan, founded 2017. $204M total; $100M Series C at $3.1B (CapitalG, June 2025) following a $1.5B Sequoia tender and a $1.3B Series B extension. $100M ARR (CEO confirmed to NYT, tripling YoY). Customers: OpenAI, Anthropic, Canva, Intercom, Rippling. 1.5B lifetime AI agent tasks. Credit-based pricing.
(1) Outcome: enriched record matched correctly, signal that triggered a converted outbound, response-rate uplift. (2) Trajectory: every workflow’s full execution trace preserved — which data provider was tried first, which was tried second, what was rejected, which signal triggered which action. (3) Attribution: customers A/B test workflows; recipes can be benchmarked. (4) Substrate: spreadsheet-based programmable workflows + Claygent (research agent). The substrate is the workflow recipe — a versionable, shareable, copy-pasteable unit of GTM logic. (5) Aggregation: this is Clay’s real moat. 60+ Clay Clubs worldwide, 400+ GTM engineer roles posted in a single spring 2025 hiring cycle, customer-led GTM agencies scaled to $1M+ ARR within a year. The recipe library aggregates across all customers.
What Clay does that Salesforce couldn’t. Salesforce processes Activity rows. Clay processes signals (job changes, news, intent, technographics) and executes on them via a programmable workflow that is itself a learned artifact. The “GTM engineer” role didn’t exist before Clay because the substrate didn’t exist before Clay. Salesforce’s response (Data Cloud) tries to add signal but cannot match the recipe-library compounding because it’s not a cross-customer workflow surface.
Customer evidence (JPMorgan). OpenAI: 2× enrichment coverage, 100% research automated, 8,500+ enrichment runs by team members. Vanta: 80%+ enrichment coverage, 1,000+ contacts/month added. CapitalG partner Jane Alexander: “Clay is the first and only company to take an engineering approach to go-to-market.”
The rest of the segment
| Company | Funding · Valuation | Traction | What they uniquely do |
|---|---|---|---|
| Day.ai | $24M total ($20M Series A, Sequoia) | undisclosed | “Cursor of CRM” — auto-captured emails + meetings as the activity layer. Per-assistant pricing. Founder ex-HubSpot CPO. |
| Attio | $124M total; ~$700M val | 5,000 customers, 4× ARR trajectory | Custom-object-first; App SDK lets customers build apps inside the CRM. Substrate authorability as the moat. |
| Reevo | $80M seed (Khosla + KP) | undisclosed | Generates first-party activity data with no integrations needed. Founders from Affirm, Airbnb, Box, HubSpot, Salesforce, Rippling, Uber. |
| Clarify | $22.5M Series A | early | “Autonomous CRM,” CDP-inspired event model, pay-per-action pricing. |
| Monaco | $35M (Founders Fund) | stealth-launch Feb 2026 | AI-native CRM + ZoomInfo-like prospecting. Ex-Founders Fund VC + ex-CPO Apollo/Qualtrics. |
| Rox | $50M+; $1.2B val (Sequoia) | undisclosed | “Agent swarm per seller” — research, prep, follow-up. Chris Ré (Stanford) on team. Most credible Role-Replacer architecture. |
| 11x | $74M; ~$350M val | distressed | Cautionary tale. TechCrunch exposed inflated ARR (~$3M of claimed $14M survived pilots), 70–80% logo churn. Industry-wide AI SDR cancellation wave saw 50–70% churn before first renewal. |
| Common Room | (Greylock/Index) | mid-8-figure est. | Community/PLG signals → Roomie AI activation. Multi-channel signal graph; strong for dev-tools GTM. |
| Unify | $58M+; $260M val | 8× revenue YoY | Warm outbound from intent signals. Customers: Cursor, Perplexity, Decagon. |
| Apollo | $150M ARR; $1.6B val | 500K customers | Incumbent sales engagement going AI-first. AI Research Agent claims +46% meetings booked. |
What this segment teaches. Three archetypes coexist: CRM Replacer (Day.ai, Attio, Clarify, Reevo, Monaco) — UX race with the weakest moat; Role Replacer (11x, Rox) — outcome-priced labor replacement, severe trust risk per 11x; Signal Graph (Clay, Common Room, Apollo, Unify) — sit alongside the SoR and own the signal layer. The Signal Graph archetype is the most defensible because (5) — recipe library, contact graph, signal feed — is genuinely cross-customer. Greenfield CRM rebuilds without (5) compete on UX and lose to whoever ships fastest.
Customer Experience (CX / Support)
Incumbents. Zendesk, Salesforce Service Cloud, ServiceNow CSM, Intercom, Freshworks, Kustomer (Meta-owned), Front. CCaaS: Five9, NICE, Genesys, Talkdesk. CIO replacement openness: 56% — second-highest after CRM.
Why CX has the densest flywheel signal of any segment. Resolution is per-conversation, per-minute verifiable. The unit of work is bounded (one ticket). Outcome surface is huge — Sierra alone handles hundreds of millions of interactions per year. Conversation latency is seconds, not days. (1) and (2) are dense by default. (3), (4), (5) are where the moats live.
Sierra — the constellation harness
| Metric | Value | Context |
|---|---|---|
| Time to $100M ARR | 7 quarters | Bret Taylor, Nov 2025 |
| Valuation | $10B | Greenoaks, Sep 2025 |
| Constellation models | 15+ | Per-task adaptive routing |
| ADT monthly interactions | 2M+ | Handled autonomously |
| WeightWatchers CSAT | 4.6 / 5 | Post-deployment |
| Containment (week one) | <50% → 70% | WeightWatchers |
Bret Taylor + Clay Bavor, founded 2023. $635M total, $10B valuation (Greenoaks, September 2025). $100M ARR in 7 quarters since launch (TechCrunch, November 2025), growing from ~$20M a year prior. 50%+ of customers have $1B+ revenue; 20%+ have $10B+. Reach: 95% of US Black Friday shoppers, 50% of US healthcare families, >90% of US media ecosystem, >70% of US fintech. Outcome-based pricing — “if the AI agent has to transfer to a real person, it’s free” (Bret Taylor).
(1) Outcome: resolution-without-escalation per conversation; the pricing model ensures Sierra has commercial incentive to maximize signal density. (2) Trajectory: every conversation, with all branching and tool-call decisions, preserved — 2M+ conversations per month feeding the loop. (3) Attribution: stress-tests with trick questions before launch, built-in guardrails and audit systems, Workspaces model for parallel testing, supervisor models acting as “Jiminy Cricket” on factuality and policy. (4) Substrate: the constellation of 15+ models is the substrate — adaptive routing per task (low-latency for tool calls, high-precision classifiers for fraud, long-context reasoners for knowledge); AIMD admission control adapted from TCP congestion control; planner-executor-validator pattern. Workspaces productize the substrate edit cycle: branches, reviews, controlled releases. (5) Aggregation: routing optimization compounds across deployments within brand-isolation constraints.
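The AIMD mechanic is standard TCP congestion control transplanted to model-call admission. Sierra's actual parameters aren't public, so the constants below are illustrative — the shape of the algorithm is the point:

```python
def aimd_step(window: float, success: bool,
              increase: float = 1.0, decrease: float = 0.5,
              floor: float = 1.0) -> float:
    """Additive-increase / multiplicative-decrease, as in TCP congestion control.
    Here `window` is the number of concurrent model calls currently admitted."""
    if success:
        return window + increase          # probe for capacity, linearly
    return max(floor, window * decrease)  # back off hard on failure or overload

w = 8.0
w = aimd_step(w, success=True)    # 9.0: call succeeded, admit one more
w = aimd_step(w, success=False)   # 4.5: upstream pushed back, halve the window
```

The asymmetry (gentle probing up, sharp cuts down) is what keeps a fleet of agents from saturating a rate-limited model provider the same way it keeps TCP flows from saturating a link.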
What Sierra does that Zendesk couldn’t. Zendesk’s data model is ticket → conversation → customer → agent assignment. Sierra’s primitive is the agent run — a graph of (intent → plan → tool calls → KB references → policy checks → resolution outcome → memory update). Zendesk’s macros + KB articles are static substrate; Sierra’s constellation is dynamic substrate that learns. Zendesk’s per-seat economics ($55–$169/agent/month + $50/agent AI add-on + $1.50/automated resolution) are in commercial conflict with deflection — Sierra is paid more only when a human is bypassed.
Customer evidence (JPMorgan). ADT: 2 million+ monthly interactions handled autonomously, 70% containment, “warm, conversational, empathetic” tone. WeightWatchers: 70% containment in week one, CSAT 4.6/5. “I knew the AI agent would answer questions quickly, but I didn’t expect the responses to be so genuine and empathetic. I was reading chat transcripts with members exchanging heart emojis with the AI agent, or seeing AI wish people good luck.” — Maureen Martin, VP Customer Care, WeightWatchers.
The rest of the segment
| Company | Funding · Valuation | Traction | What they uniquely do |
|---|---|---|---|
| Decagon | $231M; $4.5B val (Bloomberg, Jan 2026) | tens of millions of customers helped | Agent Operating Procedures (AOPs) — natural-language instructions that compile into agent behavior code. Avoids the heavy professional-services model. Rippling deflection 38% → 50%+; NG.CASH 13% → 70%, avoided 35+ hires; Chime 1M+ voice calls/month automated. |
| Parloa | $120M Series C; $1B val | 3M HSE calls/yr automated | Voice-first contact center. Pre-launch simulation testing as the eval architecture. ATU: 1 in 3 appointments booked by AI, staff phone time down 60%. |
| Maven AGI | $78M total | $7M ARR in 5 months | Co-pilot + autonomous hybrid. HubSpot/Stripe/OpenAI exec backers. |
| Lorikeet | $49M (QED) | regulated B2B SaaS | “Universal Concierge” for fintech/healthcare. Customers: Airwallex, Taptap Send, Eucalyptus. |
| Crescendo | n/a | hybrid AI + 3,000 human agents | Labor Bypass archetype in CX — per-resolution + BPO billing. |
| Parahelp | YC + Paul Graham | Perplexity, Framer, Replit, HeyGen | Software-company support; end-to-end ticket resolution for technical products. |
| Ada | incumbent pivot | doubled YoY Mar 2026 | “ACX” category framing. |
What this segment teaches. CX is the most-validated agent-native SoR replacement because (1)+(2) are densest, (3) is most measurable, and customer pain is acute. Sierra dominates F1000 enterprise; Decagon dominates high-growth tech enterprise; Parloa dominates voice-first enterprise. By 2027, Service Cloud’s per-seat revenue will be in visible decline as deflection ramps eat the seat count. Zendesk’s hybrid pricing is a transitional artifact that will collapse to per-resolution.
ITSM / Internal Service Desk
Incumbents. ServiceNow, Atlassian Jira Service Management, BMC Helix, Ivanti.
Serval — the access-management wedge
| Metric | Value | Context |
|---|---|---|
| Series B | $75M | Sequoia, Dec 2025 |
| Valuation | $1B | Total raised: $127M |
| ARR | ~$50M | 500% growth since Series A |
| Help-desk volume | 30–50% | Access requests as the wedge |
$127M total, $1B valuation (Sequoia, December 2025). ~$50M ARR; revenue grew 500% since Series A in August 2025. Wedge: provisioning access requests (SaaS apps, permissions, on/offboarding) — 30–50% of help desk volume.
(1) Outcome: did the access actually get granted to the right person on the right system — binary, verifiable, fast feedback (minutes). (2) Trajectory: the chain of identity lookups, group memberships checked, and approval routing decisions. (3) Attribution: evals on access-grant accuracy across hundreds of SaaS integrations. (4) Substrate: identity policies + per-app workflow templates + supervisor agents. (5) Aggregation: cross-customer access patterns — Salesforce + Slack + Notion + Datadog + GitHub + AWS = the same canonical onboarding shape across thousands of orgs.
What Serval does that ServiceNow couldn’t. ServiceNow’s app marketplace was always thin — every customer had to build integration scripts. Serval ingests hundreds of SaaS integrations natively. Access-grant is one of the few IT processes where (1) is binary AND (5) is highly transferable. ServiceNow’s $2.85B Moveworks acquisition in December 2025 was the cleanest possible admission: rebuilding the front-end harness from scratch was slower than buying it.
The rest: Moveworks (now ServiceNow) — F500 employee front-door; HR + IT + Finance intent surface. Atomicwork ($40.3M, $25M Series A from Khosla/Okta Ventures) — agentic ITSM + employee experience. Aisera (acquired by Automation Anywhere) — pre-LLM AI; couldn’t compete with Serval’s substrate quality.
What this segment teaches. Two archetypes — horizontal employee front-door (Moveworks-style; high TAM but commoditized post-acquisition) and vertical access wedge (Serval-style; deeper compounding because the workflow is more verifiable). ServiceNow’s installed base will absorb Moveworks but will not be able to ship a cross-app substrate as deep as Serval’s by 2027.
The Regulated Verticals
Healthcare, legal, HRIS, real-estate / construction / field-service. The pattern across all four: regulation depth is inversely correlated with SoR replaceability. Healthcare EHR (Epic), HRIS (Workday), BigLaw matter management (iManage) are not replaceable; the AI-native value capture is forced into adjacent layers — scribe, RCM, recruiting, vertical practice rebuilds, labor delivery. Lightly-regulated SoRs (mid-market real estate operations, SMB field service) are full-replacement targets.
Healthcare (EHR-adjacent)
Epic is not replaceable. HIPAA, HL7/FHIR pipes, claims adjudication, hospital deployment cycles measured in years. CIO openness to replacing the EHR ≈ 0%. AI-native value capture is forced into adjacent layers.
| Play | Pattern | Examples |
|---|---|---|
| Epic-tax via Workshop | Pay revenue-share to Epic for plugin slot | Abridge ($5.3B val, $100M+ ARR, 60K+ clinicians), Ambience ($1.25B val) — ambient scribing |
| Direct-to-clinician bypass | Skip the EHR entirely; ad-supported | OpenEvidence ($12B val, 40% of US physicians) — clinical reasoning with cited literature |
| Labor bypass for nursing | Sell the work, not the software | Hippocratic ($3.5B val, 115M+ patient interactions) — discharge calls, follow-ups |
| RCM / autonomous coding | Verifiable transaction loop | Rapid Claims, Codametrix, Augmedix — claim paid is dense (1) |
What this teaches. Epic absorbs ambient scribing via Workshop with revenue-share extraction — avoid pure scribe. Best plays: RCM/coding (verifiable transaction loop generates true training signal), direct-to-clinician reasoning that bypasses both EHR and HIPAA-aggregation constraint, and patient-facing labor displacement.
Legal (Practice & Matter Management)
Incumbents. Clio (200K+ lawyers; $5B val post-vLex acquisition), MyCase, NetDocuments, iManage (BigLaw DMS), LexisNexis/Westlaw/Bloomberg Law.
| Play | Pattern | Examples |
|---|---|---|
| Workflow layer | Stays adjacent to the SoR | Harvey ($8–11B val, $195M ARR, BigLaw partnerships, BigLaw Bench as the eval substrate) |
| Vertical SoR rebuild | Practice-area specific full-stack | Eve Legal ($1B val) — plaintiff PI; 450 firms, 200K cases/yr, $3.5B settlements influenced. Closest legal company to “Fortress.” |
| Drafting / redlining wedge | Workflow Wedge | Spellbook, Definely, Lexion |
| Regulatory compliance agent | Narrow regulated wedge | Norm AI |
What this teaches. Avoid horizontal practice-management replacements (Clio absorbed vLex; iManage too entrenched). Best path: vertical practice-area rebuilds (Eve in plaintiff PI; immigration, IP, M&A as next targets). Harvey is a workflow layer with a defended-but-not-fortress moat — long-term durability requires moving up to outcome ownership.
HRIS / Recruiting
Workday, Rippling, ADP, and BambooHR are structurally protected: FLSA/EEOC/ACA/multi-state-payroll/SOX plumbing as deep as Epic's. No AI-native HRIS unicorn exists. Workday's "platform of agents" repositioning (April 2026) and its Sana acquisition signal the absorption pattern. CIO openness to replacing the HRIS: ~25–35%, the lowest of any segment in this study.
The unlock is in the labor flow, not the SoR.
| Play | Pattern | Examples |
|---|---|---|
| Labor Bypass marketplace | Absorb the contractor relationship, sidestep HRIS | Mercor — $10B val, $1B ARR. Charges enterprises for placement; pays contractors 60–70%. Agency model, not software. |
| AI-native ATS | Recruiting workflow ≠ HRIS | Ashby (Series C); Juicebox / PeopleGPT ($36M, 2,500 customers, $10M ARR with 4 people) |
| Talent intelligence layer | ML on top of HRIS | Eightfold AI ($410M raised, $2.1B val), Gloat (~$1B val). Thin moat as Workday's native AI catches up. |
| Vendor absorption | Acquired by HRIS | HiredScore → Workday (March 2024) |
What this teaches. Bypass the SoR via labor delivery. Best plays: Mercor-style labor-as-a-service, vertical recruiting agents (clinical, sales, blue-collar) where outcome is measurable, compliance-automation wedges (immigration, multi-state payroll). Avoid horizontal HRIS replacement — Workday will eat them.
Real Estate, Construction, Field Service
EliseAI — the operational SoR by interaction-layer expansion
$2.2B+ valuation (Series E). Reaches 1 in 12 US apartments and 70% of the NMHC Top 50. 90% automation of resident-facing interactions. Cross-customer aggregation at the operational-SoR layer.
- (1) Outcome: lease conversion (binary, high-frequency).
- (2) Trajectory: leasing-conversation graph at apartment density.
- (4) Substrate: Fair-Housing-compliant policy substrate.
- (5) Aggregation: strongest cross-customer aggregation in any operational vertical — conversation patterns aggregate across one in twelve US apartments.
What this teaches. EliseAI is the cleanest “AI-layer becomes operational SoR” arc in the study. The path: own the resident interaction → own delinquency recovery → own maintenance triage → eventually displace the GL. Yardi increasingly just becomes the GL while EliseAI owns the operational layer. Now expanding to healthcare with the same playbook.
Construction & field service — Procore is becoming an agentic platform itself; ServiceTitan is mostly safe at enterprise but vulnerable at SMB (~60% replacement openness). AI-native value: Document Crunch (acquired by Trimble Q2 2026, modest exit — confirming the "AI-layer thesis," no path to standalone SoR), Buildots (computer-vision construction progress as a data moat), Trunk Tools (Procore-side, Jasper-risk if Procore's native AI catches up), Fieldproxy / Fixlify / Quantra (AI-native ServiceTitan alternatives at SMB).
Part 3 — The Implications
3.1 The Four Meta-Archetypes
Across all five segments, AI-native SoR-replacement plays cluster into four archetypes. Each archetype is defined by which subset of the five primitives dominates its compounding curve, and each has a different moat profile.
GL Replacer (Greenfield SoR Rebuild). Examples: Campfire, Rillet, DualEntry, Digits (ERP); Day.ai, Attio, Reevo (CRM at the substrate-replacer end); EliseAI on its long-term arc. Primitives that dominate: (4) editable substrate + (5) cross-customer aggregation. The pitch is “rebuild the SoR from scratch with the substrate as the primary architectural commitment.” Moat profile: strong on all three test dimensions for finance/ERP plays (proprietary GL data + GAAP/SOX regulatory + transaction embedding); weak on all three for greenfield CRM rebuilds. Compounding rate: linear-to-superlinear in customer count for finance (LAM trains across all customers); linear at best for CRM (each customer’s data graph is private). When it wins: lightly-customizable, high-frequency-outcome SoR domains where the incumbent’s customization economy is shallow and the data is portable.
Workflow Wedge. Examples: Numeric (close), Basis (audit firms), Spellbook (legal drafting), Maven AGI (CX co-pilot), Lorikeet (regulated B2B support), Document Crunch (construction contracts), Codametrix (medical coding), Norm AI (regulatory compliance). Primitives that dominate: (1) outcome signal + (3) attribution/eval. Wedges live on the densest single workflow inside a vertical. Their architectural commitment is narrow but deep eval — they own the benchmark for “good close” or “good audit workpaper” or “good redline.” Moat profile: medium. Survival path: expand from wedge to system of work to SoR by absorbing adjacent workflows (Numeric’s “compound startup”). Either you make the leap to GL Replacer or you sell to the SoR (Document Crunch → Trimble is the canonical Wedge exit).
Concierge / Brand Layer. Examples: Sierra, Decagon, Parloa (CX), EliseAI (real estate operational layer), Hippocratic (nursing), Abridge (clinician-side scribe). Primitives that dominate: (2) trajectory persistence + (4) editable substrate, with (5) constrained by brand isolation. Moat profile: strongest on data — trajectory volume and brand-customized substrate are both proprietary by construction. Transaction embedding strong — they ARE the resolution channel / interaction surface. Best three-question-moat profile of any archetype. Compounding rate: quadratic with conversation density per customer × number of customers. Sierra at hundreds-of-millions of interactions/year is the proof. Strategic move up the stack: the operational-layer-becomes-SoR arc (EliseAI) — the Concierge captures the customer interaction → captures the resolution → captures the next-best-action → eventually captures the SoR's job. The most powerful long-term path.
Labor Bypass. Examples: Mercor (placement), Crescendo (CX BPO hybrid), Eve Legal (plaintiff PI vertical labor), Hippocratic (nursing tasks). Primitives that dominate: (1) outcome (the placement, the resolved case, the closed settlement) + (5) aggregation (placement graphs, settlement comps). Moat profile: outcome-ownership is the strongest moat. Transaction embedding strong — they are the labor delivery channel. Compounding rate: sublinear-to-linear with placement count, but the unit economics are agency-style (60–70% pass-through), not software-style. Higher revenue, lower margin. When it wins: in regulated verticals where the incumbent SoR is structurally protected (HRIS, EHR, BigLaw matter mgmt) so the AI-native must absorb the labor flow rather than replace the SoR. Caution: 11x demonstrated that AI-Employee role-replacement without measurable outcome accountability is a trust-collapse archetype masquerading as Labor Bypass. Real Labor Bypass requires real outcome (Mercor’s placement-and-retention guarantee; Crescendo’s resolution SLA).
3.2 The Compounding Dynamic
Each archetype runs the flywheel at a different cycle time and accumulates a different aggregation surface. The product of the two determines the compounding rate. The compounding rate determines who survives.
The Two-Curve Crossover
Incumbent SoR revenue declines on a slope (multi-year contracts decay 5–10%/yr post-2027). AI-native revenue compounds on a curve. Crossover happens segment-by-segment between 2027 and 2029.
| Archetype | Cycle time per outcome | What aggregates | Compounding rate | Best example |
|---|---|---|---|---|
| GL Replacer | Per transaction (sub-second to minutes) | Categorization + reconciliation + RevRec patterns; LAM weights | Linear-to-superlinear in customer count | Campfire LAM across 100+ customers |
| Workflow Wedge | Per workflow run (minutes to hours) | Workflow-specific patterns within vertical | Narrow but fast within wedge | Numeric flux/recon patterns |
| Concierge / Brand Layer | Per conversation (seconds to minutes) | Routing + skill optimization (cross-tenant); brand voice (intra-tenant) | Quadratic in (conversation density × customers) | Sierra constellation, 2M+ convos/month |
| Labor Bypass | Per placement / case / shift (hours to weeks) | Placement graph; matter-outcome graph | Linear with placement count | Mercor placement graph |
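The "cycle time × aggregation surface" product in the table can be made concrete with a toy model. Everything below — outcome volumes, aggregation exponents — is an illustrative assumption of mine, not data from the study; the point is only the shape of each curve (linear, quadratic, sublinear):

```python
# Toy model: learning signal per archetype as a function of customer count.
# Signal ~ (outcomes per customer per year) x (customers ^ aggregation exponent).
# All parameters are illustrative assumptions, not empirical figures.

ARCHETYPES = {
    # name: (outcomes per customer per year, aggregation exponent)
    # exponent 1.0 -> linear in customers; 2.0 -> quadratic (density x customers)
    "GL Replacer":     (50_000, 1.3),   # per-transaction, cross-customer LAM
    "Workflow Wedge":  (1_000, 1.0),    # per workflow run, within-vertical only
    "Concierge/Brand": (500_000, 2.0),  # per conversation, density x customers
    "Labor Bypass":    (200, 0.8),      # per placement, sublinear aggregation
}

def capability(archetype: str, customers: int) -> float:
    """Aggregate learning signal accumulated in one year, arbitrary units."""
    outcomes_per_customer, exponent = ARCHETYPES[archetype]
    return outcomes_per_customer * (customers ** exponent)

for name in ARCHETYPES:
    growth = capability(name, 200) / capability(name, 100)
    print(f"{name:16s} 100 -> 200 customers multiplies signal by {growth:.2f}x")
```

Doubling the customer base doubles the Wedge's signal but quadruples the Concierge's — which is why the same model quality gap between archetypes widens rather than closes as both grow.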
The flywheel is visible in financial trajectory. Sierra hit $100M ARR in 7 quarters. Campfire reported 10× revenue YTD. Decagon went 0 → 8-figure ARR in under a year. Mercor reached $1B ARR in two years. These aren’t standard SaaS growth curves — they’re compounding-curve growth, where each customer makes the product more capable for all future customers, which justifies higher per-outcome pricing, which funds faster substrate iteration, which closes the loop.
Incumbents cannot match this rate. Three reasons, restated: architectural absence of (1)–(3); commercial misalignment with (4)–(5) under per-seat pricing; and the customization-economy political coalition. The combined effect is that incumbents move fast in shipping agent surfaces and slow in rebuilding the substrate — the gap widens, not closes, between 2026 and 2028.
Five-year picture. Incumbents on a slope (multi-year contracts decay 5–10% per year post-2027 as renewals shift to AI-native vendors); AI-natives on a curve (compounding with each model release). The crossover happens segment-by-segment:
- 2027 — CRM (Salesforce vs Day.ai / Clay / Attio / Rox aggregate)
- 2027–28 — CX (Service Cloud + Zendesk vs Sierra / Decagon / Parloa aggregate)
- 2028 — mid-market ERP (NetSuite / Intacct vs Campfire / Rillet / DualEntry aggregate)
- 2028–29 — ITSM (ServiceNow, with its absorbed Moveworks, vs Serval + Atomicwork)
- 2029–30 — enterprise ERP (SAP / Oracle vs the GL Replacers that have moved up-market)
- Probably never as a standalone replacement — EHR, HRIS, BigLaw matter management. Incumbents absorb the AI layer and retain the SoR.
M&A pattern. Incumbents buy the flywheel pieces they cannot rebuild. Already visible: ServiceNow → Moveworks ($2.85B), Trimble → Document Crunch (Q2 2026), Salesforce → Informatica ($8B), Salesforce → Convergence + Bluebirds, Clio → vLex ($1B), Capital One → Brex ($5.15B), Workday → Sana + HiredScore, Zendesk → Ultimate.ai + Forethought, Automation Anywhere → Aisera. Predicted next moves (2026–27): Salesforce buys a Day.ai / Attio / Reevo at $500M–$1B to plug the auto-capture gap; Microsoft buys a Sierra / Decagon competitor (probably Maven AGI or Lorikeet) for M365 Service; NetSuite or Oracle buys one of Campfire / Rillet / DualEntry at $1B–$3B to plug the GL substrate; Workday acquires a vertical recruiting agent (Mercor too expensive at $10B; Ashby-tier ATS more plausible); Epic buys an RCM AI-native (Rapid Claims, Codametrix) once the Workshop revenue-share model has demonstrated ARR.
Pricing end-state. Per-seat dies in CRM, CX, ITSM, and mid-market ERP — segments where (1) is dense and outcome-pricing is enforceable. Per-seat survives in HRIS, EHR, BigLaw, and FP&A — segments where outcome attribution is hard or politically untenable. Per-resolution / per-outcome stabilizes at 50–70% gross margins (Redpoint AI-native P&L data) — structurally lower than 75–85% SaaS, but priced into a 35× larger labor TAM.
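The margin trade in the pricing end-state can be sanity-checked with arithmetic. The margin bands and the 35× TAM multiple come from the text above; the capture-rate figures are illustrative assumptions of mine:

```python
# Gross-profit comparison: per-seat SaaS vs per-outcome pricing.
# Margin bands and the 35x TAM multiple are from the text; the
# capture rates below are illustrative assumptions only.

saas_tam, saas_margin = 1.0, 0.80          # normalized software TAM, 75-85% band
outcome_tam, outcome_margin = 35.0, 0.60   # 35x labor TAM, 50-70% band

def gross_profit(tam: float, capture: float, margin: float) -> float:
    """Gross profit = market size x share captured x gross margin."""
    return tam * capture * margin

# Even capturing 5x less of the larger market, per-outcome pricing yields
# a larger absolute gross-profit pool despite the lower margin:
seat = gross_profit(saas_tam, 0.10, saas_margin)           # 0.08
outcome = gross_profit(outcome_tam, 0.02, outcome_margin)  # 0.42
print(f"{outcome / seat:.2f}x the gross-profit pool")
```

This is the structural reason "lower margin" is not the objection it sounds like: the per-outcome vendor is pricing against labor spend, not software spend.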
3.3 What Would Break the Thesis
Every thesis is testable by what would falsify it. Four conditions would invalidate the compounding-architecture argument.
Foundation models absorb the harness layer. If Anthropic’s Cowork / Skills + MCP commoditizes the harness fast enough that vertical AI-natives can’t keep their architectural lead — i.e. if (3) and (4) become standardized model features rather than platform investments — the moat erodes. Counter-evidence so far: Cursor’s 90-min retrain loop and Sierra’s 15-model constellation have both deepened with each model release rather than dissolved. The harness half-life pattern shows wrappers dying in 4–6 months while learning-loop architectures improve with model upgrades. Holds if AI-natives keep harness investment outpacing model commoditization.
Privacy regulation forces (5) portability. GDPR Article 20 (data portability) extended to learned substrate — i.e. requiring that customer-specific learnings flow with the customer when they leave — would compress the cross-customer aggregation moat. EU healthcare data residency rules already constrain (5) for EU healthcare AI plays. Holds if substrate aggregation is structured as patterns-not-records, which most current AI-natives are doing by default.
Incumbent rebuilds succeed at the substrate level. Workday’s “platform of agents” repositioning is the live test case. If Workday ships a substrate that genuinely compounds across customers within 24 months — not just better agents on top of the existing schema, but a substrate-level rebuild — the thesis is wrong about retrofit infeasibility. Watch: does Workday Illuminate evolve into a Sana-LMS-style learning substrate, or does it stay an agent overlay? Monitor by 2027.
Outcome attribution becomes politically untenable. If buyers refuse to commit to outcome-based contracts (because finance teams cannot defend variable spend, or because procurement loathes the audit complexity), the per-outcome pricing model collapses and AI-natives fall back to per-seat — which destroys the unit economics that fund substrate iteration. Counter-evidence so far: 46% of CIOs are moving to usage/outcome pricing (Redpoint 2026). Holds if outcome-pricing adoption continues at current pace.
If any one of these breaks, the compounding argument has to be re-evaluated for that segment. If all four hold, the architecture is irreversible and the AI-native SoR replacement wave is structural.
Open Questions
Things I genuinely don’t know, in order of analytical importance.
Where does the substrate / model boundary settle? Today, (4) lives partly in markdown skills (editable, substrate-side) and partly in fine-tunes (model-side). Anthropic’s Cowork plugins suggest skills stay editable. Cursor’s Tab-model retrain suggests fine-tuning compounds in model weights. The boundary determines whether AI-natives’ substrate work is durable infrastructure or dissolves into the model. The deepest open question.
Do slow-binding verticals match fast-binding ones’ compounding velocity? Legal (matter outcomes years-long), healthcare (treatment outcomes months-long), strategy (decisions decades-long) cannot run the loop at coding/CX/accounting velocity. Do dense proxy signals (BigLaw Bench, clinical guideline adherence) substitute well enough? Or do these verticals stay structurally less competitive and AI value gets captured by Concierge/Labor-Bypass plays sitting adjacent rather than GL Replacers?
Does cross-customer aggregation survive privacy regulation? GDPR, HIPAA, EU AI Act, and emerging regional data-residency laws all push back on (5). The companies that win their segment will be the ones who design (5) to be regulator-compatible — typically by aggregating patterns not records. The legal status of “learned patterns derived from customer data” is unsettled. EU healthcare AI is the live frontier.
Does the operator role become high-value-add or get compressed by meta-harness automation? Sierra’s Workspaces show the operator role being productized. But meta-harness research suggests harness self-improvement may absorb the operator’s job over time. If the operator gets compressed, the org-level thesis weakens — there isn’t a new role replacing the line worker, the line worker just disappears.
What is the actual M&A price for a Workflow Wedge? Document Crunch → Trimble (modest); Moveworks → ServiceNow ($2.85B); Campfire → ? (untested but likely $1B–$3B if NetSuite buys). The price multiple a Workflow Wedge gets at acquisition tells us how durable the standalone path is.
Does Salesforce’s installed base buy enough time? ~150K customers × multi-year contracts × political resistance to swap = real defensive capacity. Replacement-openness percentages don’t translate immediately into replacements. The 2026–27 ELA renewal cycle is the real test of the 83% openness number.
Sources
Primary. JPMorgan Research, First Principles - AI Agents 2.0: The rise of AI-native new entrants, 8 January 2026 (91 pp) — sections on Rillet, Campfire, DualEntry, Digits, Sierra, Decagon, Parloa, Clay, with G2 customer reviews and case studies; the C.H. Robinson AI-agent ROI case; software revenue model evolution. Redpoint Ventures, 2026 Market Update — CIO replacement openness data, AI-native P&L economics, two-playbook framework.
Frameworks referenced. The Harness Thesis (Verifiability × Regulation; Delegatability × Last-Mile; harness half-life). Vertical AI Platforms three-question moat test (proprietary data + regulatory + transaction embedding). The Memory IS Learning thesis. The Fintool 10-moats analysis (“agent IS the bundle”; “software is becoming headless”). Context-engineering landscape ($10B+ historical context systems). Sierra constellation architecture analysis.
Web evidence (selection). Sequoia partnership posts: Day.ai, Rillet, Rox. TechCrunch: Campfire $35M Series A, Sierra $10B raise, Sierra $100M ARR, Decagon Series D / tender, Doss $55M. BusinessWire: Basis $100M / $1.15B val. Bloomberg: Decagon $4.5B Series D. CNBC: OpenEvidence $12B val. STAT News: Epic AI Charting threatens scribe. Josh Bersin: Workday “Platform of Agents” reinvention. ABA Journal: Clio acquires vLex $1B. Crunchbase: Clay $100M Series C. BlackLine Verity launch. NetSuite 2026.1. SAP at Hannover Messe 2026.
Living document. The five-primitive flywheel framework is the central new contribution; the segment evidence is the proof; the compounding-rate argument is the load-bearing claim. Falsification conditions are listed in §3.3. Updates on receipt of new evidence.