<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Jailbreak on AI Watchtower</title>
    <link>https://aleph-beth.github.io/AI-Watchtower/tags/jailbreak/</link>
    <description>Recent content in Jailbreak on AI Watchtower</description>
    <generator>Hugo</generator>
    <language>en-US</language>
    <lastBuildDate>Mon, 29 Jun 2026 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://aleph-beth.github.io/AI-Watchtower/tags/jailbreak/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>When AI Takes Action: Understanding Attacks on Autonomous Agents, and How to Defend Against Them</title>
      <link>https://aleph-beth.github.io/AI-Watchtower/posts/2026-06-29-ai-agent-security-jailbreak-countermeasures/</link>
      <pubDate>Mon, 29 Jun 2026 00:00:00 +0000</pubDate>
      <guid>https://aleph-beth.github.io/AI-Watchtower/posts/2026-06-29-ai-agent-security-jailbreak-countermeasures/</guid>
      <description>A chatbot writes sentences; an AI agent acts — it reads your email, runs code, calls APIs, spends money. That shift moves the risk: it is no longer about making the AI say something forbidden, but about making it do something dangerous. This article explains, with detailed and accessible examples, how these attacks actually work, why naive guardrails fail, and what a decision-maker must demand before putting an agent into production.</description>
    </item>
    <item>
      <title>The AI War on Our Networks: Why Attack Outpaces Defense</title>
      <link>https://aleph-beth.github.io/AI-Watchtower/posts/2026-06-12-ai-war-on-our-networks-why-attack-outpaces-defense/</link>
      <pubDate>Fri, 12 Jun 2026 00:00:00 +0000</pubDate>
      <guid>https://aleph-beth.github.io/AI-Watchtower/posts/2026-06-12-ai-war-on-our-networks-why-attack-outpaces-defense/</guid>
      <description>Strategic essay. Cyber conflict is now machine-versus-machine, at a tempo that excludes the human operator. Attack holds the advantage by architecture, not by accident: defending one LLM with another reproduces the same flaw in the defender. The way out is to move the decision out of the model and into a deterministic layer.</description>
    </item>
  </channel>
</rss>
