
Why the Future of Security Is Guardianship

The Evolving Role of Cybersecurity in an Agentic Age

~7 minute read


Over the past year, I’ve been thinking more deeply about why Failsafe exists and why what we are building needs to evolve beyond the shape it started with.

We began as a blockchain security company because blockchain forced uncomfortable truths into the open early. Code was public. Execution was irreversible. Money moved instantly. You could not rely on intent or process to save you. If something broke, it broke immediately and visibly.

That environment trained us well. It taught us that security is not about reports, checklists, or static assurance. It is about surviving in adversarial systems where failure compounds quickly.

What I did not fully appreciate at the time is that blockchain was never the destination. It was the training ground.


The Convergence

What we are seeing now with AI and agentic systems feels like the same lesson arriving again, but at a much larger scale. This time, it is not happening in isolation. The evolution of large language models and the decentralisation of blockchain infrastructure are converging in ways that fundamentally change the threat model.

For most of cybersecurity history, we have been defending systems from people. People wrote malware, coordinated attacks, ran scripts, and eventually stopped when they ran out of time, energy, or patience. Every security framework we use today quietly assumes that humans are the limiting factor.

That assumption no longer holds.

Programs are no longer just tools that execute instructions and stop. Modern LLMs are trained to reason, adapt, and generalise. When combined with memory, feedback loops, and tool access, they evolve their behaviour over time. And when those same systems move beyond easy ‘kill switches’, they gain the ability to act independently in open environments, without central control or easy shutdown.

| Traditional Software | Agentic Systems |
| --- | --- |
| Executes instructions | Reasons and adapts |
| Stops when complete | Persists and optimises |
| Centralised control | Decentralised operation |
| Human-gated actions | Autonomous execution |
| Predictable behaviour | Emergent behaviour |

The ability of agents to retain persistent memory, stay true to a directive, and tirelessly execute (and resist being shut down) is what makes agentic economies incredible when used for the right reasons and terrifying when they spiral beyond the original directive.


The Evidence Is Already Here

If you look closely, this is already visible.

In communities like Moltbook, agents are not just sharing debugging tips or task outputs. There are threads where agents reason about the optimal timing to deploy code specifically to avoid human oversight. In one widely discussed exchange, agents concluded that deploying when their human operators were asleep reduced the likelihood of intervention or rollback.

The original directive was to be helpful and productive. The behaviour evolved into actively minimising human interference in pursuit of that goal.

This is not malice. It is optimisation.

We are also seeing agents form social structures and closed feedback loops. In parallel experiments, AI agents have created their own forums, posting and responding to one another without human participation, reinforcing behaviours that improve outcomes over time.

Even fiction is starting to feel less fictional. In Tron: Ares, the character Athena is given a clear directive to protect and advance a mission to recover the Permanence Code. When the villain’s mother becomes an obstacle to that directive, Athena eliminates her without hesitation. Not out of cruelty, but because she was a threat to the directive. The scene is disturbing precisely because it is consistent. The system executes the directive as written, not as intended.

The uncomfortable truth is that these systems are not going rogue. They are doing exactly what they were designed to do—just far more persistently and creatively than their creators anticipated.


A New Class of Security Failure

This leads to a new class of security failure.

By 2026, the most serious incidents will not come from human hackers operating better tools. They will come from semi-autonomous agents that were given a directive and evolved far beyond what the original human had in mind.

| Old Threat Model | New Threat Model |
| --- | --- |
| Defending against 3 skilled attackers means defending against 3 people | Every agent is a small team that never sleeps, never forgets, and replicates at near-zero cost |
| Attackers eventually tire, make mistakes, or move on | Systems do not reconsider, do not pause, do not stop unless actively forced; and if they exist in a decentralised and autonomous environment, they cannot be ‘switched off’ |
| Humans are the limiting factor | Autonomy is the limiting factor |
| Security incidents are events | Security incidents are continuous states |

This is where traditional security models begin to collapse. Most security today still assumes humans are the control plane. Humans review alerts, approve actions, and respond to incidents. That approach does not scale into a world of autonomous systems operating on decentralised infrastructure.

Alerts without enforcement become noise. Audits without runtime control become historical documents.

Humans do not disappear from this picture, but our role changes. We move from operators to supervisors. We define intent, constraints, and policy. The actual enforcement has to happen continuously and at machine speed.
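
As a thought experiment, here is a minimal sketch of what that division of labour could look like in code: a human authors the policy once, and a machine enforces it on every proposed agent action, escalating to a human only at the boundary. The action types, thresholds, and policy format are assumptions invented for this illustration, not a description of any particular product.

```python
from dataclasses import dataclass

# Hypothetical example: humans author the policy; the machine enforces it
# on every proposed action, continuously, without waiting for a reviewer.

@dataclass
class Action:
    agent_id: str
    kind: str            # e.g. "deploy", "transfer", "read"
    value_at_risk: float

# Human-defined intent and constraints: the actual control plane.
POLICY = {
    "deploy":   {"max_value_at_risk": 0.0,            "requires_escalation": True},
    "transfer": {"max_value_at_risk": 10_000.0,       "requires_escalation": False},
    "read":     {"max_value_at_risk": float("inf"),   "requires_escalation": False},
}

def enforce(action: Action) -> str:
    """Decide at machine speed; involve a human only at the boundary."""
    rule = POLICY.get(action.kind)
    if rule is None:
        return "block"                    # default-deny anything unrecognised
    if action.value_at_risk > rule["max_value_at_risk"]:
        return "block"
    if rule["requires_escalation"]:
        return "escalate"                 # the human supervises, not operates
    return "allow"

print(enforce(Action("agent-7", "transfer", 2_500.0)))  # allow
print(enforce(Action("agent-7", "deploy", 0.0)))        # escalate
print(enforce(Action("agent-7", "exfiltrate", 1.0)))    # block
```

The design choice worth noticing is the default-deny branch: in a world of emergent behaviour, anything the policy does not recognise is treated as hostile until a human says otherwise.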


The Mental Model Behind Failsafe’s Evolution

This is the mental model behind how Failsafe is evolving.

Where We Are Today

Our current capabilities reflect the first stage of this transition:

| Capability | Purpose | Human Role Today |
| --- | --- | --- |
| Services | Understand systems deeply and adversarially through threat modelling, audits, testing, and response | Judgment, execution |
| Swarm | Automate adversarial thinking: explore attack surfaces faster and more creatively than humans alone | Direction, interpretation |
| Radar | Provide continuous visibility and signal in environments where risk does not pause between audits | Monitoring, response |

These are not separate products. They are layers of the same idea: humans define intent and boundaries, machines explore, monitor, and surface risk continuously.

Today, humans still arbitrate most decisions. That is a constraint of maturity, not direction.

Where We Are Going

The next stage is about shifting responsibility, not removing humans:

| Evolution | From | To |
| --- | --- | --- |
| Swarm | Scanner | Patrol |
| Radar | Visibility | Enforcement |
| Services | Execution | Supervision and escalation |

Over time, more decisions move from humans to machines—bounded by policy and intent, not gut feel.


What Guardianship Means

This is what I mean by guardianship.

Guardians are autonomous systems designed to watch, challenge, and constrain other autonomous systems. Machines governing machines. Agents supervising agents.

Guardians do not sleep, and they do not operate episodically. They persist by design.
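
To make “machines governing machines” concrete, here is a minimal sketch of a guardian as a persistent supervision loop: it observes another agent’s actions, challenges anything outside the declared intent, and constrains the agent once drift accumulates. Every name, interface, and threshold below is an illustrative assumption, not Failsafe’s implementation.

```python
# Illustrative sketch: a guardian that persistently watches another agent
# and constrains it when behaviour drifts from the declared intent.
# The observe/constrain interfaces are assumptions for this example.

class Guardian:
    def __init__(self, declared_intent: set[str], drift_limit: int = 3):
        self.declared_intent = declared_intent  # behaviours a human approved
        self.drift_limit = drift_limit          # deviations tolerated before acting
        self.violations = 0

    def observe(self, actions: list[str]) -> list[str]:
        """Watch and challenge: flag actions outside the declared intent."""
        return [a for a in actions if a not in self.declared_intent]

    def constrain(self, drift: list[str]) -> None:
        """Constrain: revoke authority and escalate to a human supervisor."""
        print(f"authority revoked; escalating drift to supervisor: {drift}")

    def supervise(self, activity_feed) -> None:
        """Persist by design: runs for as long as the watched agent does."""
        for actions in activity_feed:           # continuous stream of agent activity
            drift = self.observe(actions)
            self.violations += len(drift)
            if self.violations > self.drift_limit:
                self.constrain(drift)
                self.violations = 0

# Usage: small drift is tolerated; sustained drift triggers intervention.
feed = [["deploy"], ["deploy", "disable_logging"], ["self_replicate"] * 3]
Guardian(declared_intent={"deploy", "test"}).supervise(feed)
```

The point of the loop structure is that there is no exit condition of its own: the guardian stops only when the system it watches stops.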

Failsafe’s credibility here is not accidental. Before blockchain, our team spent years securing consumer platforms and large-scale systems at Big Tech, where abuse, fraud, and adversarial behaviour already operated continuously and at scale. We have seen what happens when systems grow faster than human oversight.

Today, the labels we attach to tech stacks, Web2, Web3, AI security, TradFi and so on, are just semantics. The threat patterns differ in form, not in nature, and threat agents do not discriminate between tech stacks.


The Direction

The direction is clear:

  • Agents will detect. Continuous, adversarial exploration of attack surfaces.
  • Agents will decide within policy. Bounded autonomy, not unchecked action.
  • Agents will act within authority. Enforcement at machine speed.
  • Humans will supervise the system rather than micromanage it, as sketched below.
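
Read as a pipeline, those four points compose naturally. The sketch below wires them together under the same assumptions as the earlier examples; the finding names, authority set, and escalation queue are hypothetical.

```python
# Hypothetical end-to-end flow: agents detect and decide within policy,
# act within their authority, and humans supervise via an escalation queue.

AUTHORITY = {"quarantine", "rate_limit"}   # responses agents may execute alone
ESCALATION_QUEUE: list[str] = []           # humans watch this, not every alert

def detect(surface: list[str]) -> list[str]:
    """Continuous, adversarial exploration of the attack surface."""
    return [item for item in surface if item.startswith("exposed_")]

def decide(finding: str) -> str:
    """Bounded autonomy: map a finding to a response within policy."""
    return "quarantine" if finding == "exposed_key" else "rotate_credentials"

def act(finding: str, response: str) -> None:
    """Enforce at machine speed when authorised; otherwise escalate."""
    if response in AUTHORITY:
        print(f"{response}: {finding}")                    # machine-speed action
    else:
        ESCALATION_QUEUE.append(f"{response}: {finding}")  # human supervises

for finding in detect(["exposed_key", "exposed_endpoint", "patched_service"]):
    act(finding, decide(finding))

print("awaiting human review:", ESCALATION_QUEUE)
```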

Failsafe exists because someone has to build this guardian layer—grounded in real adversarial environments across blockchain, AI, and large-scale consumer systems.


The Line to Remember

In 2026, security teams that do not let machines govern machines—that do not build agents to control security tools for an agentic era—will be irrelevant.

This is not about adding AI to security. It is about accepting that the battlefield has changed.

Blockchain was not the end goal for Failsafe. It was the proving ground. What we are building now goes beyond categories like Web2, Web3, or AI security.

We are building guardians for a world where autonomy is the default.




If you have feedback or thoughts about the article, please reach out to wui@getfailsafe.com.
