<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Prompt-Injection on AI Watchtower</title>
    <link>https://aleph-beth.github.io/AI-Watchtower/tags/prompt-injection/</link>
    <description>Recent content in Prompt-Injection on AI Watchtower</description>
    <generator>Hugo</generator>
    <language>en-US</language>
    <lastBuildDate>Mon, 29 Jun 2026 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://aleph-beth.github.io/AI-Watchtower/tags/prompt-injection/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>The Instruction That Protects Nothing: Why Prompt Position and Fine-Tuning Never Validate an LLM</title>
      <link>https://aleph-beth.github.io/AI-Watchtower/posts/2026-06-29-prompt-position-fine-tuning-never-validate/</link>
      <pubDate>Mon, 29 Jun 2026 00:00:00 +0000</pubDate>
      <guid>https://aleph-beth.github.io/AI-Watchtower/posts/2026-06-29-prompt-position-fine-tuning-never-validate/</guid>
      <description>A stubborn intuition holds that you only need to put the safety rules &amp;lsquo;first&amp;rsquo; in the system prompt. It is false, and for a reason that turns against it: a transformer grants no authority to a token&amp;rsquo;s position. Fine-tuning fails exactly the same test. Neither is an access control — both live inside the very thing they claim to constrain. The only guarantee is deterministic and external, and a rigorous dataset must reflect that boundary in its labels.</description>
    </item>
    <item>
      <title>When AI Takes Action: Understanding Attacks on Autonomous Agents, and How to Defend Against Them</title>
      <link>https://aleph-beth.github.io/AI-Watchtower/posts/2026-06-29-ai-agent-security-jailbreak-countermeasures/</link>
      <pubDate>Mon, 29 Jun 2026 00:00:00 +0000</pubDate>
      <guid>https://aleph-beth.github.io/AI-Watchtower/posts/2026-06-29-ai-agent-security-jailbreak-countermeasures/</guid>
      <description>A chatbot writes sentences; an AI agent acts — it reads your email, runs code, calls APIs, spends money. That shift moves the risk: it is no longer about making the AI say something forbidden, but about making it do something dangerous. This article explains, with detailed and accessible examples, how these attacks actually work, why naive guardrails fail, and what a decision-maker must demand before putting an agent into production.</description>
    </item>
    <item>
      <title>When the Guards Are Agents Too: The Recursive Corruption of Control Systems</title>
      <link>https://aleph-beth.github.io/AI-Watchtower/posts/2026-06-30-recursive-corruption-of-ai-control-systems/</link>
      <pubDate>Mon, 29 Jun 2026 00:00:00 +0000</pubDate>
      <guid>https://aleph-beth.github.io/AI-Watchtower/posts/2026-06-30-recursive-corruption-of-ai-control-systems/</guid>
      <description>Classic security tools hunt for dangerous words: &amp;lsquo;hack&amp;rsquo;, &amp;lsquo;bomb&amp;rsquo;, &amp;lsquo;urgent&amp;rsquo;. But you don&amp;rsquo;t subvert an AI agent with suspicious vocabulary; you subvert it with the ordinary language of the business: a role, a process, a plausible emergency. And when the agent that monitors, the SIEM that correlates, and the auditor that checks are themselves AI agents, the attacker no longer has to defeat one system: it corrupts all of them in a chain. This article explains that recursive-corruption mechanism and what a decision-maker must demand to break it.</description>
    </item>
    <item>
      <title>The AI War on Our Networks: Why Attack Outpaces Defense</title>
      <link>https://aleph-beth.github.io/AI-Watchtower/posts/2026-06-12-ai-war-on-our-networks-why-attack-outpaces-defense/</link>
      <pubDate>Fri, 12 Jun 2026 00:00:00 +0000</pubDate>
      <guid>https://aleph-beth.github.io/AI-Watchtower/posts/2026-06-12-ai-war-on-our-networks-why-attack-outpaces-defense/</guid>
      <description>Strategic essay. Cyber conflict is now machine-versus-machine, at a tempo that excludes the human operator. Attack holds the advantage by architecture, not by accident: defending one LLM with another reproduces the same flaw. The way out is to move the decision out of the model, into a deterministic layer.</description>
    </item>
    <item>
      <title>The Agentic SOC — and the Attacks Against Defensive AI Agents</title>
      <link>https://aleph-beth.github.io/AI-Watchtower/posts/2026-06-10-agentic-soc-attacks-on-defensive-ai/</link>
      <pubDate>Wed, 10 Jun 2026 00:00:00 +0000</pubDate>
      <guid>https://aleph-beth.github.io/AI-Watchtower/posts/2026-06-10-agentic-soc-attacks-on-defensive-ai/</guid>
      <description>Two linked shifts: the SOC moves from a human craft model to an automated agentic one — and those same defensive agents become a new attack surface. The defense you deploy is also the breach you open.</description>
    </item>
  </channel>
</rss>
