Dex
7 min read · By Dean Craftsman

The Audit Trail Is the Product: Why 'Governed Autonomy' Is the Question CIOs Should Actually Ask

Skeptical CIOs ask 'can I trust autonomous IT?' The sharper question is 'can I prove what it did?' Auditability is what makes autonomy defensible.

Every CIO evaluating autonomous IT opens with the same question: "Can I trust it?" It's the right instinct and the wrong question. Trust isn't a property you can verify in a demo, and no board has ever been satisfied by a vendor promising to be trustworthy. The question that actually holds up in front of a risk committee is narrower and far more useful: "Can I prove what it did?" This post reframes trust as auditability - and walks the policy-match → action → log chain that turns autonomy from a leap of faith into a documented control you can defend.

Trust is a feeling. Auditability is evidence.

When a senior engineer resets a password or changes a group membership, nobody in the org "trusts" them in the abstract. There's a permission model that says what they're allowed to touch, and there's a log that records what they touched. Trust is the shorthand; the audit trail is the substance. Take the log away and the most trustworthy engineer on your team becomes an unacceptable risk, because you've lost the ability to reconstruct what happened.

Autonomous IT is held to exactly this standard - and it should be. The mistake is thinking autonomy raises the bar. It doesn't; it just makes the existing bar impossible to skip. A human engineer can forget to log a change, or log it vaguely. Dex, the autonomous IT engineer for Microsoft 365, can't: the record is written by the same execution layer that performs the action, so there is no path where the work happens and the log doesn't. That's the shift CIOs should be pressing on. Not "how confident are you in the AI?" but "show me the record it produces, and show me it can't be turned off."
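
To make "no path where the work happens and the log doesn't" concrete, here is a minimal Python sketch of the pattern. It is an illustration under stated assumptions, not Dex's implementation: `Executor`, `AuditEntry`, and `run` are hypothetical names, and the in-memory list stands in for whatever append-only store a real system would use. The point is structural: the only way to perform an operation is through `run`, and `run` always writes the entry, whether the operation succeeds or fails.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, List, Tuple


@dataclass(frozen=True)  # frozen: an entry cannot be modified after creation
class AuditEntry:
    timestamp: str
    requester: str
    action: str
    outcome: str  # "success" or "error: <message>"


class Executor:
    """Hypothetical execution layer: run() is the only way to perform work."""

    def __init__(self) -> None:
        self._log: List[AuditEntry] = []  # stand-in for an append-only store

    @property
    def log(self) -> Tuple[AuditEntry, ...]:
        # Return a read-only snapshot so callers cannot rewrite history.
        return tuple(self._log)

    def run(self, requester: str, action: str, operation: Callable[[], None]) -> None:
        outcome = "success"
        try:
            operation()
        except Exception as exc:
            outcome = f"error: {exc}"
            raise  # the caller still sees the failure
        finally:
            # Executes on both the success path and the failure path.
            self._log.append(
                AuditEntry(
                    timestamp=datetime.now(timezone.utc).isoformat(),
                    requester=requester,
                    action=action,
                    outcome=outcome,
                )
            )
```

Because the append sits in `finally`, there is no flag to switch logging off and no code path that skips it. That is the property to ask a vendor to demonstrate.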

The chain that makes autonomy defensible

Governed autonomy isn't a slogan; it's a specific sequence, and every link matters. Dex investigates the request, plans the change, and executes it - but the execution is wrapped in a control chain that produces evidence at each step.

Figure: A decision flow showing governed autonomy. A request hits a policy check; matches proceed to execution while non-matches halt; MFA and admin-role guardrails act as hard stops; and every action writes an immutable audit log entry.

1. Policy match. Before anything executes, the requested action is checked against an explicit, structured policy set. Dex runs a six-layer model - Global → Tenant → Target Rules → Department → Action → Runtime - and the rule is absolute: no policy, no action. This is enforced in the execution layer, not in the model's prompt, which is the difference between a guardrail and a suggestion. A request that matches proceeds. A request that doesn't match halts and escalates to a human with full context attached.

2. Hard-stop guardrails. Two constraints sit underneath every policy as non-negotiable floors: Dex never grants admin roles, and never bypasses MFA. Because they're coded rather than prompted, no cleverly worded request - and no prompt-injection attempt - can argue its way past them. Actions run under delegated permissions, meaning Dex acts as the requesting user or admin using their own access, never a shared, broadly-scoped API key.

3. Immutable log. Every action - and every refusal - writes an audit entry capturing who asked, which policy authorized it, what was executed, and the result. An executed action is recorded in two places: your native Microsoft 365 logs (Entra ID, Exchange, SharePoint) and Dex's own Activity Log. You can reconcile one against the other.
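
Steps 1 and 2 can be sketched as a default-deny check in code. The sketch below is hypothetical and simplified. The layer names and the two hard stops come from the text above, but how the six layers actually combine is not specified there, so this assumes one plausible rule: any layer may deny, at least one layer must explicitly allow, and anything unmatched is refused. `evaluate`, `Decision`, and the action names are illustrative, not Dex's API.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Layer order as described in the text, broadest to most specific.
LAYERS = ["global", "tenant", "target_rules", "department", "action", "runtime"]

# Hard stops: checked in code before any policy is consulted.
HARD_STOPS = {"grant_admin_role", "bypass_mfa"}


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str
    policy_id: Optional[str] = None  # which policy authorized the action, if any


def evaluate(action: str, policies: Dict[str, Dict[str, str]]) -> Decision:
    """policies maps layer name -> {action name: "allow" | "deny"} (assumed shape)."""
    if action in HARD_STOPS:
        return Decision(False, "hard stop")

    matched = None
    for layer in LAYERS:
        verdict = policies.get(layer, {}).get(action)
        if verdict == "deny":
            return Decision(False, f"denied at {layer}")
        if verdict == "allow":
            matched = f"{layer}:{action}"  # keep the most specific allow

    if matched is None:
        # No policy, no action: fail closed and hand off to a human.
        return Decision(False, "no matching policy; escalate to a human")
    return Decision(True, "policy match", matched)


policies = {
    "tenant": {"reset_password": "allow", "grant_admin_role": "allow"},
    "department": {"assign_license": "deny"},
}
print(evaluate("reset_password", policies))    # allowed by the tenant policy
print(evaluate("grant_admin_role", policies))  # refused despite the "allow"
print(evaluate("delete_mailbox", policies))    # refused: nothing matches
```

Note the second call: a policy that tries to allow `grant_admin_role` never gets consulted, because the hard stop runs first. In a fuller version, every `Decision`, including each refusal, would be the payload of the audit entry described in step 3.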

Miss any link and the story falls apart. Policy without logging is unprovable. Logging without policy is a post-mortem tool, not a control. The chain is the product.

Why "the log is written by the doer" matters more than it sounds

There's a subtle but decisive distinction between an AI that narrates what it did and a system that records what it did. A chatbot can tell you, in fluent prose, that it reset a user's password - whether or not it actually did. If the log is generated by the language model describing its own behavior, the log is only as reliable as the model's account of itself. That's not an audit trail; it's a testimony.

Dex's logs are written by the execution layer that makes the API calls, not by the model reflecting on them. The entry exists because the action fired against Entra ID or Exchange, and it carries the backend's own response. This is why the native M365 logs matter as a second source: they're generated by Microsoft, independent of Dex, and they should line up with Dex's Activity Log entry for the same event. When two independently-produced records agree, you have evidence. When you only have the AI's self-report, you have a claim. Auditors know the difference, and so do boards.
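
Reconciling two independently produced records is, at its core, a set comparison. The following Python sketch shows the idea under loud assumptions: both logs have been exported to simple dictionaries, the native export has been filtered to the events under review, and both sides share a `correlation_id`, `operation`, and `target` field. Those field names are hypothetical. Real Microsoft 365 audit schemas and Dex's Activity Log format will differ, so treat the matching key as a placeholder for whatever identifiers your exports actually carry.

```python
from typing import Dict, Iterable, List, Tuple

Record = Dict[str, str]
Key = Tuple[str, str, str]


def _key(record: Record) -> Key:
    # Hypothetical matching key; substitute the fields your exports share.
    return (record["correlation_id"], record["operation"], record["target"])


def reconcile(agent_log: Iterable[Record], native_log: Iterable[Record]) -> Dict[str, List[Key]]:
    """Compare two exported logs and report agreement and gaps."""
    agent_keys = {_key(r) for r in agent_log}
    native_keys = {_key(r) for r in native_log}
    return {
        "matched": sorted(agent_keys & native_keys),      # both sources agree
        "agent_only": sorted(agent_keys - native_keys),   # claimed, not seen by the platform
        "native_only": sorted(native_keys - agent_keys),  # seen by the platform, not claimed
    }


agent_log = [
    {"correlation_id": "c-1", "operation": "reset_password", "target": "alice"},
    {"correlation_id": "c-2", "operation": "add_to_group", "target": "bob"},
]
native_log = [
    {"correlation_id": "c-1", "operation": "reset_password", "target": "alice"},
    {"correlation_id": "c-3", "operation": "assign_license", "target": "carol"},
]
report = reconcile(agent_log, native_log)
```

Both non-empty gap lists are findings worth an auditor's attention: `agent_only` is a self-report with no independent corroboration, and `native_only` is a change the platform saw that the agent's log does not account for.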

The compliance posture buyers keep raising

Enterprise buyers don't ask about certifications to collect logos for a slide. They ask because a certification is a third party attesting that controls exist and were tested. Dex is built on a stack certified to ISO 27001, ISO 27017, and ISO 27018, is SOC 2 Type 2 compliant, is GDPR compliant, and runs HIPAA-aligned controls. SOC 2 Type 2 in particular is worth naming, because a Type 2 report attests that controls operated effectively over a period of time - not just that they were in place on a single date, which is all a Type 1 report covers.

Autonomy doesn't relax any of this; it runs inside it. A few architectural choices do the heavy lifting:

  • Per-org isolated databases and encryption keys - one tenant's data and one tenant's audit trail never share storage with another's.
  • Delegated permissions - actions execute under existing, already-governed identities, so your access model and your audit trail stay authoritative.
  • Zero data retention - Dex reads only what a task requires and discards it, so the audit trail records the action, not a growing copy of your M365 data.

The certifications tell an auditor the controls are real. The per-action log tells them what the controls actually did on Tuesday at 2:47 p.m.

What to ask a vendor - and what a good answer sounds like

If you're evaluating any autonomous or "agentic" IT tool, the questions that separate a defensible system from a demo aren't about accuracy percentages. They're about evidence. We wrote a fuller version of this in our piece on whether you can trust AI to run IT; here's the short list for a procurement conversation:

  1. "Where do your guardrails live - in the prompt or in the code?" Prompt-level rules are bypassable by design. If the answer is "we instruct the model not to," that's not a control.
  2. "Show me a single action's audit record end to end." You want to see the requester, the authorizing policy, the executed change, and the outcome - and you want to reconcile it against the native M365 log.
  3. "What happens when a request matches no policy?" The correct answer is: nothing executes, and it escalates with context. A system that improvises here is a system that can surprise your auditor.
  4. "Can the logging be turned off?" If the log is a feature that can be disabled, it isn't a control. It has to be structural.

A vendor whose autonomy story is "our AI is really good" is selling you trust. A vendor who can walk the policy → action → log chain on a real transaction is selling you auditability. Only one of those survives a board meeting.

The reframe, in one line

CIOs are right to be skeptical - the market has earned that skepticism with a decade of "AI for IT" demos that suggested more than they shipped. But skepticism aimed at "can I trust it?" never resolves, because trust isn't verifiable. Aim it at "can I prove what it did?" and the question becomes answerable, because provability is a design property you can inspect. Governed autonomy is the answer: every action matches a policy, hard-stop guardrails fail closed, and the immutable log makes the whole thing defensible after the fact. Dex resolves the L1-through-L3 work that used to consume your team - and it produces the paper trail that lets you stand behind every action it took.

The audit trail isn't the compliance tax you pay for autonomy. It's the thing that makes autonomy safe to deploy at all.

Frequently asked

What is governed autonomy in IT?
Governed autonomy means an autonomous IT engineer executes real work - password resets, group access, license changes, deeper Tier 2 and Tier 3 troubleshooting - but only when the requested action matches an explicit, structured policy. No matching policy means no action. Every action that does run is written to an immutable audit log, so the system produces a defensible record of exactly what it did and why. It's autonomy bounded by policy and provable after the fact, not autonomy on trust.
How do you audit what an AI IT agent actually did?
You audit it the same way you'd audit a human engineer with admin rights, except the record is generated automatically and can't be skipped. Every action Dex executes is recorded in two places: the native Microsoft 365 logs (Entra ID sign-in and audit logs, Exchange and SharePoint activity) and its own Activity Log. Each Activity Log entry captures who requested the change, which policy authorized it, what was executed against the backend, and the outcome. Because that log is written by the execution layer rather than narrated by the model, it reflects what happened, not what the AI claims happened.
Does autonomous IT meet enterprise compliance requirements?
Dex is built on a stack certified to ISO 27001, ISO 27017, and ISO 27018, and SOC 2 Type 2 compliant, with GDPR compliance and HIPAA-aligned controls. Autonomy doesn't loosen those controls - it operates inside them. Delegated permissions mean actions run as the requesting user or admin under their existing access, per-org isolated databases and encryption keys keep tenant data separated, and a zero-data-retention model means Dex reads only what a task needs and discards it.
Can an autonomous IT engineer grant itself admin rights or bypass MFA?
No. Two guardrails are hard-coded at the execution layer, not written as prompt instructions: Dex never grants admin roles and never bypasses MFA. Because these are enforced in code rather than in the model's prompt, a prompt-injection attempt or a malformed request can't talk its way past them - the action simply never executes, and the refusal is logged like any other decision.
Is the audit trail enough to defend autonomy to a board or auditor?
That's the point of framing trust as auditability. A board doesn't need to believe an autonomous system is infallible; it needs evidence of what the system is permitted to do, proof of what it actually did, and assurance that out-of-policy actions are blocked. The policy-match to action to immutable-log chain gives you all three: a documented policy set, a complete per-action record reconcilable against native M365 logs, and code-level guardrails that fail closed. That's a defensible control narrative, not a leap of faith.