The Free-Tier Backdoor: Poisoning the Continuous Training of Commercial LLMs

Commercial assistants — Claude, ChatGPT, Gemini, Le Chat — keep learning from free-tier feedback: ratings, regenerations, and the conversations themselves. That loop is an injection channel. A two-phase threat model: build a policy-compliant backdoor on a rare topic, then exploit it for jailbreak — and why scale makes the first phase almost impossible to catch.

June 21, 2026 · 12 min · 2428 words · aleph-beth