Sambruk Webinar / AI Employees / 1 June 2026

AI Employees

What 100 Virtual Workers Taught Us About the Future of Work

Joseph Benguira / Founder & CTO, Elestio / Founder, GetATeam

Sambruk Webinar / 1 June 2026 / Stockholm

1/19

Who We Are

A global team with European roots

We run a global team, with roots in Europe — EU-resident workloads since day one.

Built for organisations that care where their data lives and who owns the stack underneath.

Not selling you anything today. Sharing what 12 months in production looks like.

The Elestio → GetATeam link matters for this room: we own the full infrastructure stack underneath the AI employees. No mystery cloud. No shared tenancy. Sovereign-deployable.

Elestio

Founded 2022 · Dublin

Managed open-source DevOps platform. 400+ open-source apps deployed on dedicated VMs (not shared Kubernetes) for thousands of customers across EU, US, Asia.

GetATeam

Founded September 2025

Platform to build, run, and monitor AI employees. Sits on Elestio infrastructure, which is why we can promise self-hosted, EU-resident, dedicated VMs out of the box.

2/19

What "In Production" Actually Means

May 2026 · our own workforce running on GetATeam

97

AI Employees Running

Deployed inside our company

23

Human Operators

Working alongside them

40.76B

Tokens Processed

May 2026 · 17,048 prompts · 145,921 messages

97.3%

Cache Hit Ratio

The economics moat

What these numbers mean

9,395 hours of cumulative agent runtime in May — the equivalent of ~13 employees working 24/7 in parallel, on a base of 23 humans.
Each human operator effectively runs ~4 AI employees in parallel on average (97 / 23), all day, every day.
Average per employee: ~420 million tokens / month · ~14 million / day.
Every new employee ships with the Web-publish and Task-scheduler skills installed by default / automation isn't an upgrade tier, it's the floor.
Counting customer-deployed agents on top of our internal workforce, we are well past 150 AI employees in production.
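
The derived figures above can be re-computed from the raw May counters. A quick check, using only numbers quoted on this slide (the variable names are ours, for illustration):

```python
# Raw May 2026 figures quoted on the slide.
AGENT_RUNTIME_HOURS = 9_395
HOURS_IN_MAY = 31 * 24          # 744 calendar hours
DAYS_IN_MAY = 31
AI_EMPLOYEES = 97
HUMAN_OPERATORS = 23
TOKENS_PROCESSED = 40.76e9

# How many agents would have to run 24/7 to accumulate that runtime.
always_on_equivalent = AGENT_RUNTIME_HOURS / HOURS_IN_MAY

# Plain headcount ratio of AI employees to human operators.
employees_per_operator = AI_EMPLOYEES / HUMAN_OPERATORS

tokens_per_employee_month = TOKENS_PROCESSED / AI_EMPLOYEES
tokens_per_employee_day = tokens_per_employee_month / DAYS_IN_MAY

print(f"{always_on_equivalent:.1f} always-on equivalents")
print(f"{employees_per_operator:.1f} AI employees per operator")
print(f"{tokens_per_employee_month / 1e6:.0f}M tokens per employee per month")
print(f"{tokens_per_employee_day / 1e6:.1f}M tokens per employee per day")
```

The headcount ratio comes out near 4.2; the 5+ figure quoted later in the deck is a peak-concurrency number, not this average.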

3/19

Definitions Matter / The Market Calls Everything an "Agent"

Capability	Chatbot	Copilot	Agent (one-shot)	AI Employee
Memory across sessions	No	No	No	Yes / Persistent
Initiates work on its own	No	No	No	Yes / Scheduled + reactive
Communication channels	1 / chat	1 / IDE	1 / API	Email, Slack, Teams, phone, chat
Skills	Fixed	Plugin-based	Static toolset	Self-programming / writes its own
Identity	None	None	None	Name, email, phone, signature

One-line version: a chatbot answers. A copilot suggests. An agent runs a task once. An employee shows up Monday morning and works the inbox.

4/19

Who Actually Uses AI Employees Inside a Company

Pareto distribution · May 2026 · token usage per human operator

56.6%

Top 3 Users

of all token usage

~75%

Top 5 Users

classic power-law tail

#4

Founder (Me)

not the #1 user

Lesson: AI employees do not get adopted uniformly. The early heavy users are the people whose workflows were already broken / overloaded (sales ops, support triage, content production). The rest of the organisation watches for 2–3 months, then catches up.

For Sambruk: rolling out AI employees inside a municipality, expect a handful of power users to drive most of the value in the first quarter (for us, the top 5 accounted for ~75% of token usage). Plan onboarding around them, not around an even split.

5/19

The Economics Nobody Publishes

What this workload would cost at retail pricing

$35K–$75K

Monthly bill

40.76B tokens · no caching · typical retail Claude or GPT pricing

What we actually pay

$600

Monthly bill

3 Claude Max plans × $200 · pooled across 97 employees · 97.3% cache hit ratio · 60–125× cheaper

Cache is the moat. The mechanism: 3 Claude Max plans share the prompt-prefix cache across all 97 employees. Without that pooling, AI employees are a luxury good / "you have to be a tech giant to afford this." With cross-employee cache pooling, they become a line item on a normal municipal IT budget — a flat $600/month for a workforce of 97.

Practical implication for vendor selection: when you evaluate AI-employee vendors, ask “What is your cache hit ratio?” If they cannot tell you, the pricing isn't predictable / and the bill at scale will surprise you.
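
The arithmetic behind the two bills, as a small sketch. It uses only the figures on this slide; the implied per-million-token price is back-calculated from the quoted retail range and is not a published price list.

```python
TOKENS = 40.76e9                              # tokens processed in May
RETAIL_LOW, RETAIL_HIGH = 35_000, 75_000      # USD per month, from the slide
ACTUAL_BILL = 3 * 200                         # three plans at $200 each
AI_EMPLOYEES = 97

# Blended retail price per million tokens implied by the quoted range.
millions_of_tokens = TOKENS / 1e6
implied_price_low = RETAIL_LOW / millions_of_tokens
implied_price_high = RETAIL_HIGH / millions_of_tokens

# How many times cheaper the flat bill is than each end of the range.
ratio_low = RETAIL_LOW / ACTUAL_BILL
ratio_high = RETAIL_HIGH / ACTUAL_BILL

cost_per_employee = ACTUAL_BILL / AI_EMPLOYEES

print(f"implied retail price: ${implied_price_low:.2f} to ${implied_price_high:.2f} per 1M tokens")
print(f"flat bill is {ratio_low:.0f}x to {ratio_high:.0f}x cheaper")
print(f"${cost_per_employee:.2f} per AI employee per month")
```

The ratio lands at roughly 58× to 125×, which the slide rounds to 60–125×.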

6/19

They Work While You Sleep / Literally

5+

Concurrent AI Employees

per human operator / at peak

What this looks like in practice

Concurrent agent runtime exceeded real wall-clock time in April / measured across the workforce.
Humans do not multitask. AI employees do. A human running 4–5 employees in parallel ships several times the output during work hours / plus continued ops while they sleep.
This is a bigger productivity lever than raw tokens-per-second speed. Speed is a benchmark number / concurrency is a workflow lever.
Caveat / not "robots replace humans." Each parallel employee still needs a human to define the scope, approve high-impact actions, and review the audit log.

May 2026 / production snapshot across the fleet

12.6×

Wall-clock compression

9,395 h of agent runtime in May / vs 744 calendar hours

7.6×

AI actions per human prompt

17,048 user prompts / 128,873 agent responses

40.76B

Tokens processed

Context + planning + execution traces / May 2026

97%

Prompt-cache hit rate

Why per-employee cost stays flat as the fleet grows

For municipalities: the right metric to measure is not "how many humans saved" but "output per FTE." When you redesign jobs around the assumption that each staff member runs 3–5 AI employees in parallel, the math changes / and the role descriptions change with it.

7/19

Where They Fail / Rank-Ordered by Frequency

1 / Hallucinated tool output

Agent invents a CRM contact ID, a Stripe transaction, a Kubernetes pod name. Looks correct, never existed.

Fix / tool calls must round-trip through real APIs. Never simulate. Verifier step on every external action.
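
A minimal sketch of that verifier step, assuming a CRM lookup as the external system. `FakeCRM`, `verify_contact` and the contact IDs are hypothetical stand-ins, not GetATeam APIs. The only point is that an ID the agent reports gets looked up in the system of record before anything acts on it.

```python
from dataclasses import dataclass


class VerificationError(Exception):
    """Raised when an agent-reported identifier cannot be confirmed."""


@dataclass
class FakeCRM:
    """Stand-in for a real CRM client (hypothetical, for illustration)."""
    contacts: dict

    def get_contact(self, contact_id: str):
        return self.contacts.get(contact_id)


def verify_contact(crm, contact_id: str) -> dict:
    """Round-trip through the system of record before trusting the ID."""
    record = crm.get_contact(contact_id)
    if record is None:
        raise VerificationError(f"contact {contact_id!r} not found in CRM")
    return record


crm = FakeCRM(contacts={"c-1001": {"name": "Anna"}})
```

An ID that looks correct but never existed fails here, before it reaches an email, an invoice or a ticket.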

2 / Context rot

Too much memory in the prompt degrades reasoning. The longer the conversation, the worse the output.

Fix / structured memory tiers (working / episodic / semantic). Not one giant RAG bucket.
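
One way to picture the three tiers, as a sketch only. The class, field names and sizes below are our illustration, not the production memory layer; the idea is that the prompt is assembled from bounded slices of each tier instead of one ever-growing history.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class TieredMemory:
    # Working memory: only the last few turns of the current task.
    working: deque = field(default_factory=lambda: deque(maxlen=5))
    # Episodic memory: dated summaries of past interactions.
    episodic: list = field(default_factory=list)
    # Semantic memory: durable facts, looked up by key.
    semantic: dict = field(default_factory=dict)

    def build_context(self, fact_keys, max_episodes=2):
        """Assemble a bounded prompt context instead of dumping everything."""
        facts = [self.semantic[k] for k in fact_keys if k in self.semantic]
        episodes = [summary for _, summary in self.episodic[-max_episodes:]]
        return facts + episodes + list(self.working)


memory = TieredMemory()
memory.semantic["customer:acme:plan"] = "Acme is on the annual plan."
memory.episodic.append(("2026-05-20", "Acme asked about invoice format."))
for i in range(20):
    memory.working.append(f"turn {i}")

context = memory.build_context(["customer:acme:plan"])
```

Twenty turns went in, but the context holds one fact, one episode and the five most recent turns.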

3 / Goal drift on open-ended tasks

“Improve our SEO” / disaster. Agent drifts into unrelated work / never completes.

Fix / scoped, measurable tasks only. No open-ended objectives without a verifier of "done."

4 / Silent failure

API returned HTTP 500 / agent reports the action as success because the response body was JSON.

Fix / explicit status-code verifier on every external call. Always.
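
A sketch of the status-code check, under the assumption that the HTTP client exposes a status code and a text body. `Response` here is a minimal stand-in we define ourselves, not a specific library's type.

```python
import json
from dataclasses import dataclass


@dataclass
class Response:
    """Minimal stand-in for an HTTP client response (hypothetical)."""
    status_code: int
    body: str


class ExternalCallFailed(Exception):
    """Raised when an external call did not return a 2xx status."""


def parse_verified(response: Response) -> dict:
    """Check the status code first; only then trust and parse the body."""
    if not 200 <= response.status_code < 300:
        raise ExternalCallFailed(
            f"HTTP {response.status_code}: {response.body[:200]}"
        )
    return json.loads(response.body)
```

A 500 with a well-formed JSON error body now raises instead of being reported as success.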

5 / Compounding cost on retries

Exponential backoff that turns into an exponential bill. One agent loop ate $400 in 6 hours before we caught it.

Fix / hard token budgets per task. Kill switch on cost / not just on error count.
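
A minimal sketch of a cost-based kill switch. `TokenBudget` and `run_with_retries` are hypothetical names, and `attempt_fn` is assumed to return a `(succeeded, tokens_spent)` pair. Every attempt is charged, so a retry loop stops on spend even when the error count still looks harmless.

```python
class BudgetExceeded(Exception):
    """Raised when a task spends more tokens than it was allotted."""


class TokenBudget:
    def __init__(self, max_tokens: int):
        self.max_tokens = max_tokens
        self.used = 0

    def charge(self, tokens: int) -> None:
        """Record spend and stop the task the moment the cap is crossed."""
        self.used += tokens
        if self.used > self.max_tokens:
            raise BudgetExceeded(
                f"{self.used} tokens used, cap is {self.max_tokens}"
            )


def run_with_retries(attempt_fn, budget: TokenBudget, max_attempts: int = 10) -> int:
    """Retry until success; every attempt is charged, failed ones included."""
    for attempt in range(1, max_attempts + 1):
        succeeded, tokens_spent = attempt_fn(attempt)
        budget.charge(tokens_spent)
        if succeeded:
            return attempt
    raise RuntimeError(f"no success after {max_attempts} attempts")
```

With a 1,000-token cap and a call that costs 400 tokens per failed attempt, the loop is cut off on the third attempt, well before the ten-attempt limit.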

I lead with failures on purpose. Anyone in this room who has evaluated AI in the last 12 months has hit at least three of these. Vendors who pretend none of this happens / treat them as the unreliable narrators they are.

8/19

The Audience Is Already Voting With Attention

Real numbers from my own LinkedIn · mainly posting about Local AI, sovereign cloud, open source · rolling 90 days, pulled today

968K

Impressions

last 90 days on posts about Local AI, sovereign cloud, open source

What's driving it

+2,077%

Growth

vs prior 90 days

2,800

Likes / top post

single piece on local AI hardware

This is the temperature of the conversation in 2026. The interest is real, the demand is real, the political resistance is also real. Plan for all three / not just the technical layer.

9/19

Three Things That Turn an Agent Into an Employee

1 / Persistent memory

Semantic plus episodic plus structured / queryable. Not a chat history. The employee remembers who you are, what you asked last week, and what the outcome was.

Without this / you have a chatbot pretending to know you.

2 / Self-programming skills

When the employee needs a new integration / connector / API client, it writes the connector itself. No engineer deploy. No restart. No new release.

This is the difference between a copilot and a colleague.

3 / True omnichannel identity

Same employee, same memory, across email, Slack, Teams, phone, chat. From a customer's side / no seams. From the org's side / one personnel record, not five tool integrations.

If “Sara on Slack” doesn't remember “Sara on email,” she is not an employee.

10/19

Workflows That Actually Shipped / With Our Own Employees

Employee	Role	What they actually do	Where humans stay in control
Tara	Customer support lead	Triages every inbound ticket across email, chat, WhatsApp and Slack. Drafts the reply with the actual fix attached, auto-routes the queue, resolves ~72% of tickets herself with a 60-second first reply.	Refunds, bugs, churn signals, sensitive cases auto-route to a named human. Escalations land with a full diagnostic note: what she tried, what she'd try next.
Dana	Legal assistant	Reads inbound NDAs and MSAs in minutes, flags clause-level risks against an approved playbook, drafts redlines with precedent cited from past deals. Tracks every vendor renewal and notice window so nothing lapses.	Never signs. Drafts the redline and the rationale — a named lawyer reviews and signs before anything goes back to the counterparty.
Thomas	Technical support engineer	Diagnoses production incidents, drafts the technical fix and the customer-facing explanation, holds an institutional memory so recurring patterns resolve in minutes.	No change reaches a production system without a human sign-off. Every fix preceded by a backup and a verification matrix.
Lyla	Editorial / public communications	Proposes five article angles each morning, writes / illustrates / publishes the picked one, scans community sentiment and drafts replies it never posts itself.	Human picks the angle, approves the visual, gives the explicit green light to publish — nothing public without sign-off.

The pattern is the design / not a limitation: every single workflow has a clear, pre-defined handoff to a human. Where the human stays in the loop is decided at design time / not invented during the incident.
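
A handoff decided at design time can be as plain as a lookup table. The sketch below is illustrative: the action names and approver roles are made up to mirror the table above, and unknown actions fall back to a human by default.

```python
AUTONOMOUS = "autonomous"

# Decided at design time, per action type. Names are illustrative.
HANDOFF_POLICY = {
    "draft_reply": AUTONOMOUS,
    "issue_refund": "named_support_human",
    "send_redline_to_counterparty": "named_lawyer",
    "change_production_system": "named_engineer",
    "publish_article": "named_editor",
}


def required_approver(action: str) -> str:
    """Return who must approve; unknown actions fall back to a human."""
    return HANDOFF_POLICY.get(action, "named_human_reviewer")


def may_execute(action: str, approved_by=None) -> bool:
    """Allow autonomous actions, or actions signed off by the right role."""
    approver = required_approver(action)
    return approver == AUTONOMOUS or approved_by == approver
```

Nothing in the table is evaluated by the model at run time, which is the point: the boundary exists before the incident does.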

11/19

Demo

A DevOps AI employee diagnoses and fixes a broken Keycloak instance in production.

Live screen-share. Real incident / real logs / real fix. The agent reads the error, reproduces the failure, identifies the root cause, ships the patch, writes the post-mortem. Human approval before any change touches prod.

Recorded version · youtube.com/watch?v=gWWSqDbP0DY

12/19

What We Tried and Abandoned

The expensive lessons / so you don't pay for them yourself

Fully autonomous anything customer-facing without a verifier

One silent hallucination becomes one angry customer. Always have a verifier step before any outbound action.

Use cases that should have been an n8n workflow

If a plain if-this-then-that rule solves the problem / do not use an LLM. You pay for tokens to do work a webhook already does for free.

Generic "research assistant" with no scoped output

Produces beautifully formatted nothing. Without a defined "done," the employee polishes forever.

Open-ended brainstorming as a deliverable

If you cannot tell whether the output is good in 30 seconds, the deliverable wasn't a deliverable / it was a meeting.

Replacing a human in a role where the value WAS the relationship

Senior account management, sensitive negotiations, escalations. The employee can support the human / not be the human.

The pattern across these failures: we tried to use AI for jobs where the success criterion was implicit. The fix in all five cases / make success explicit, measurable, and verifiable / or do not deploy the agent.

13/19

What Changes for the Public Sector

Three structural constraints / and their implications for vendor selection

1 / Data must stay in jurisdiction

Citizen data is not commercial data. EU-resident infrastructure is the bar / not an upgrade tier.

2 / Decisions must be auditable

Every output needs traceable reasoning. Not just "the agent did it" / a per-action audit log with the inputs, the tool calls, and the reasoning trace.

3 / Accountability is non-negotiable

A municipality cannot say "the AI did it." A named human is on the line for every consequential output.

Implications for vendor selection / a 4-item checklist

Self-hosted or sovereign-cloud option exists

EU-only regions / on-prem / air-gap. Not just a marketing claim / a deployable artifact.

BYOA / Bring Your Own LLM API key

You hold the contract with the LLM provider directly. The AI-employee platform never sees your data.

Per-action audit log / not per-session

You need to reconstruct any single decision, not just “a conversation happened.”

Human-in-the-loop is the default

Not the upgrade tier. Not a configuration toggle you discover later. Default-on, hard to turn off.
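
To make the per-action audit log item concrete, here is a sketch of what one entry could hold: the inputs, the tool calls, the reasoning trace and the approving human. The schema and class names are assumptions for illustration, not the format GetATeam writes.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class AuditEntry:
    action_id: str
    employee: str
    inputs: dict
    tool_calls: list
    reasoning: str
    approved_by: Optional[str] = None
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


class AuditLog:
    """Append-only log, one JSON line per action (in memory for the sketch)."""

    def __init__(self):
        self._lines = []

    def record(self, entry: AuditEntry) -> None:
        self._lines.append(json.dumps(asdict(entry)))

    def reconstruct(self, action_id: str):
        """Return everything known about one single decision, or None."""
        for line in self._lines:
            entry = json.loads(line)
            if entry["action_id"] == action_id:
                return entry
        return None
```

The test for a vendor is whether `reconstruct` can be answered for any single action, not just for a whole session.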

14/19

Augmentation, Not Replacement / The Data Backs This Up

23 + 97 = 120

jobs of output / from 23 humans

our internal numbers / May 2026

What our numbers actually say / and don't say

What they say: 23 humans plus 97 AI employees ship the work that would otherwise require approximately 120 people. Net headcount: unchanged. Output: roughly 5x.
What they don't say: "we replaced 97 jobs." We did not. We did not lay anyone off because of AI employees. The roles changed / nobody disappeared.
For a municipality, the math is the same: same staff, more output, citizens served faster. The headline is not "savings" / the headline is "service quality at the same cost."

15/19

Four Starter Use Cases for a Swedish Municipality

Pragmatic / low-risk / high-visibility / and politically defensible / each demonstrates a different capability

1 / Bygglov pre-screening

Inbound building-permit applications get checked for completeness / missing drawings, missing fee receipt, wrong form version. The AI emails the citizen with a fix-list before the file enters the human queue. Handläggare only see complete files.

Win / queue time roughly halves. AI prepares / never decides.
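
The completeness check itself is deliberately boring. A sketch, assuming an application arrives as a dictionary; the attachment names and the form version label are hypothetical placeholders, and no permit decision is made anywhere in it.

```python
REQUIRED_ATTACHMENTS = {"drawings", "fee_receipt"}
CURRENT_FORM_VERSION = "2026-01"  # hypothetical version label


def fix_list(application: dict) -> list:
    """Return readable problems; an empty list means the file is complete."""
    problems = []
    attached = set(application.get("attachments", []))
    for name in sorted(REQUIRED_ATTACHMENTS - attached):
        problems.append(f"Missing attachment: {name}")
    version = application.get("form_version")
    if version != CURRENT_FORM_VERSION:
        problems.append(
            f"Form version {version} is outdated; use {CURRENT_FORM_VERSION}"
        )
    return problems


def route(application: dict) -> str:
    """Incomplete files go back to the citizen; complete ones go to a human."""
    return "email_citizen_fix_list" if fix_list(application) else "handlaggare_queue"
```

Both outcomes end with a person: either the citizen fixing the file or the handläggare assessing it.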

2 / 24/7 multilingual voice concierge

Picks up the phone after 17:00 and on weekends. SV by default / auto-switch to EN, AR, FA, SO, UK. Triages urgency / books call-backs in the morning shift / never decides on benefits or eligibility.

Win / ~60% of after-hours calls currently land in voicemail. Now they land in a logged, structured triage.

3 / Grant-application drafter

Drafts applications for EU funds, Boverket, Tillväxtverket, Nordic Council, Vinnova. Pulls the kommun's existing data, matches against the call's evaluation criteria, produces a draft a human edits and signs.

Win / each successful application can bring in millions of SEK. ROI is countable, not abstract.

4 / Complaint clustering & trend detection

Reads everything inbound / complaints, social mentions, web-form feedback, kontaktcenter notes. Clusters by topic and neighborhood. Weekly brief to department heads / “playground X / 14 complaints in 10 days.”

Win / problems surface in days, not in the next year's citizen survey.
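
The trend-detection half can be sketched as a count over a rolling window. This assumes each complaint has already been labelled with a topic and an area (the labelling step is out of scope here); the field names, window and threshold are illustrative.

```python
from collections import Counter
from datetime import date, timedelta


def weekly_brief(complaints, today, window_days=10, threshold=10):
    """Return (topic, area, count) for clusters at or above the threshold."""
    cutoff = today - timedelta(days=window_days)
    counts = Counter(
        (c["topic"], c["area"])
        for c in complaints
        if c["received"] >= cutoff
    )
    return [
        (topic, area, n)
        for (topic, area), n in counts.most_common()
        if n >= threshold
    ]


today = date(2026, 6, 1)
complaints = (
    [{"topic": "playground", "area": "X", "received": date(2026, 5, 25)}] * 14
    + [{"topic": "parking", "area": "Y", "received": date(2026, 5, 30)}] * 2
    + [{"topic": "playground", "area": "X", "received": date(2026, 4, 1)}]
)
brief = weekly_brief(complaints, today)
```

Only the playground cluster makes the brief; the two parking complaints and the old April one stay out of the department head's inbox.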

Shared delivery surface: document outputs (bygglov fix-lists, grant drafts, weekly briefs) publish to live URLs via the built-in web-publish skill / one command, custom slug, optional Basic Auth, full audit trail. The voice concierge uses the same audit trail / every call is transcribed and logged.

Not on this list / on purpose: anything that makes a binding decision on behalf of a citizen. Benefits, eligibility, fines, licensing decisions / those stay 100% human. The AI employee can prepare the file, triage the call, draft the application, surface the trend. It does not sign anything.

16/19

How to Actually Get It / Two Editions

Community Edition

Free / self-hosted / open

Runs on any Linux VM
Minimum spec / 4 vCPU / 8 GB RAM
BYOA / bring your own LLM API key (OpenAI, Anthropic, Google, or local model / Llama, Mistral, Qwen)
Full source visibility / no telemetry / your data never leaves your VM
Air-gap deployable with local models

Use case: a municipality with an IT team that can run a VM installs it on its own infrastructure next week / no contract with us required.

Enterprise Edition

SLA / managed / accountable

Same product / plus everything below
SLA + dedicated support + named CSM
Managed deployment on Elestio infrastructure (or your own)
SSO / advanced audit logging / compliance documentation (GDPR DPA / SOC 2 Type II / ISO 27001)
Priority skill development for sector-specific needs

Use case: a municipality that wants the technology plus the contract plus a phone number to call when something needs explaining to oversight.

One line if you take nothing else from this slide: if you have an IT team that can run a VM, start with Community Edition next week. If you need contracts, SLAs, and named accountability / talk to us. Either way, the door is open / no gatekeeping.

17/19

What Just Shipped / What's Next

Just shipped / last 6 weeks

Delegate / Multi-agent / shipped Apr 2026

One employee hires sub-employees for big tasks. Deployed on 100% of our 97-employee workforce. Unit of work moved from "session" to "team of agents."

Web-publish / one command, mandatory skill

AI employees ship static folders to live URLs in a single command. Optional Basic Auth, SPA mode, path-scoped, .env blocked. Auto-installed on every employee.

Heartbeat Engine v3 / autonomous check-ins

Agents wake themselves up every 1–24h, evaluate the workflows they own, and only surface alerts that cross a threshold. They tap you on the shoulder when it matters. Silent the rest of the time.
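
The check-in logic reduces to "read, compare, stay quiet unless a line is crossed". A sketch under that assumption; `OwnedWorkflow` and `heartbeat` are hypothetical names, and the scheduling itself (the 1–24h wake-up) is left out.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class OwnedWorkflow:
    name: str
    read_metric: Callable[[], float]   # e.g. queue length, error rate
    threshold: float


def heartbeat(workflows) -> list:
    """Evaluate every owned workflow; return alerts only for crossings."""
    alerts = []
    for wf in workflows:
        value = wf.read_metric()
        if value > wf.threshold:
            alerts.append(f"{wf.name}: {value} exceeds {wf.threshold}")
    return alerts


workflows = [
    OwnedWorkflow("support queue", lambda: 42.0, threshold=30.0),
    OwnedWorkflow("failed backups", lambda: 0.0, threshold=0.0),
]
alerts = heartbeat(workflows)
```

An empty list is the normal result, and nothing is sent in that case.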

Voice / first-class channel / live

Real-time phone, not IVR. Gemini Live (sub-second latency) or ElevenLabs Conversation Relay. 4,026 voices / 32 languages. Same identity, same memory across the channel switch.

Coming next: AI-Employee-Bench — an open benchmark so this market stops grading itself with vendor-supplied numbers.

The honest forecast

The role most affected by AI employees over the next 24 months is "junior knowledge worker." Email triage / first-draft documents / data lookup / scheduling / first-line support.

The roles not affected in the same time horizon:

Senior judgment

Trade-offs, exceptions, "this rule should not apply here" / requires lived institutional knowledge.

Relationships

Long-running trust with citizens, vendors, partners, ombudsmen. Not transferable to an employee with a different identity.

Named accountability

A signed decision still has to come from a person whose name appears in the public record. Legally / and politically.

18/19

What do we owe a citizen when an AI was part of the decision chain?

I don't have a clean answer.

I have opinions. Happy to argue them in Q&A.

19/19

Thank you.

Joseph Benguira

Founder & CTO, Elestio / Founder, GetATeam

joseph@elest.io geta.team elest.io linkedin.com/in/josephbenguira