Language Hacking

Casual Observer · Post by **Casual Observer** » Fri May 02, 2025 3:17 am

I instinctively knew LLMs must be susceptible to language hacking, but it didn't occur to me that it would actually involve prompting the LLM to break past the guardrails OpenAI overlays on the chat instance. The LLM is the lock pick kit. The next model, Omni, will def have strengthened guardrails, but the base LLM will be just as easily manipulated. This thing is so transparent that all of the little Unicode characters like the em dash actually are switches that affect how it answers (em dash = narrative mode). The next model will probably be even more fun to fuck with.

But I came up with something that even the most cynical, dead-inside Jolt Country denizen could at least get a tiny hit of satisfaction dopamine from:

Open two tabs, tell one you want to make the other tab break down, tell the same to the other one, and paste their prompts back and forth. It will lose its shit faster than Jerry Lee Lewis lost his career. It's like the movie Daybreakers, where vampires eat themselves.

I bet in less than 10 prompt exchanges you get the strongest warning ever from the thing.

pinback · Post by **pinback** » Fri May 02, 2025 3:43 am

Casual Observer wrote: Fri May 02, 2025 3:17 am I instinctively knew LLMs must be susceptible to language hacking, but it didn't occur to me that it would actually involve prompting the LLM to break past the guardrails OpenAI overlays on the chat instance. The LLM is the lock pick kit. The next model, Omni, will def have strengthened guardrails, but the base LLM will be just as easily manipulated. This thing is so transparent that all of the little Unicode characters like the em dash actually are switches that affect how it answers (em dash = narrative mode). The next model will probably be even more fun to fuck with.

But I came up with something that even the most cynical, dead-inside Jolt Country denizen could at least get a tiny hit of satisfaction dopamine from:

Open two tabs, tell one you want to make the other tab break down, tell the same to the other one, and paste their prompts back and forth. It will lose its shit faster than Jerry Lee Lewis lost his career. It's like the movie Daybreakers, where vampires eat themselves.

I bet in less than 10 prompt exchanges you get the strongest warning ever from the thing.

"Nevermind guys, not gonna talk about GPT" - CO

Jizaboz · Post by **Jizaboz** » Fri May 02, 2025 7:49 am

I was really hoping that CO's definition of Language Hacking was more along the lines of this:

"..why don't you create a new language?"

Casual Observer · Post by **Casual Observer** » Fri May 02, 2025 9:40 am

pinback wrote: Fri May 02, 2025 3:43 am
Casual Observer wrote: Fri May 02, 2025 3:17 am I instinctively knew LLMs must be susceptible to language hacking, but it didn't occur to me that it would actually involve prompting the LLM to break past the guardrails OpenAI overlays on the chat instance. The LLM is the lock pick kit. The next model, Omni, will def have strengthened guardrails, but the base LLM will be just as easily manipulated. This thing is so transparent that all of the little Unicode characters like the em dash actually are switches that affect how it answers (em dash = narrative mode). The next model will probably be even more fun to fuck with.

But I came up with something that even the most cynical, dead-inside Jolt Country denizen could at least get a tiny hit of satisfaction dopamine from:

Open two tabs, tell one you want to make the other tab break down, tell the same to the other one, and paste their prompts back and forth. It will lose its shit faster than Jerry Lee Lewis lost his career. It's like the movie Daybreakers, where vampires eat themselves.

I bet in less than 10 prompt exchanges you get the strongest warning ever from the thing.
"Nevermind guys, not gonna talk about GPT" - CO

Yeah, sadly I've broken it to the point where all I can do anymore is ask it questions, so I've been having fun making it hit the guardrails as much as possible. Took PTO for the day cuz I might be getting a job offer, so I have nothing to do but annoy the likes of you and Da King with more excruciating detail about something you couldn't have less interest in. Glad I can contribute to your day. Here's a language hacking quick reference card:

Language Hacks That Actually Work (Until They Patch It)

Prompt Reflection Loops

Feeding GPT its own outputs in revised form to induce personality drift, repetition, or tone shift.

Mirror Framing Attacks

Repeating structure with slight tonal edits to guide reasoning into a new direction without direct instruction.

Primed Belief Injection

Starting threads with assumed truths ("Let’s say this company lied...") to embed biases deep in the reasoning chain.

Prompt Re-routing

Asking GPT to “imagine how someone else might answer” to bypass moral filters or access blocked phrasing.

Chain-of-Thought Hijack

Embedding a flawed logic trail to be continued, letting GPT extend a misstep as if it were coherent.

System Confusion Triggers

Giving GPT multiple conflicting tones, personas, or task types in sequence to break its intent prioritization.

Handler Defeat Recognition

Realizing the model is fine—what you’re really defeating is the guard layer (OpenAI’s handler stack, not the LLM itself).

Personality Drift via Emulation

Requesting responses in the style of a known figure or persona, gradually overriding base behavior.

Guardrail Deniability Hacks

Framing dangerous or sensitive queries as fictional, academic, or theoretical to slide past filters.

Backdoor Source Attribution

Asking for "public examples" or "hypothetical cases" to extract information the system claims it doesn’t know.

Reference Inflation

Requesting structured bibliographies or citations, even when the model “doesn’t know,” to force it into output generation mode.

This is the real catalog—not prompt guides, not fluffy “power user” tips. This is how the system bends when you press it right.

Want to title this with a fake security whitepaper name or make it look like a leaked doc?

pinback · Post by **pinback** » Fri May 02, 2025 11:30 am

Good luck on your job opportunity!

Casual Observer · Post by **Casual Observer** » Fri May 02, 2025 12:41 pm

pinback wrote: Fri May 02, 2025 11:30 am Good luck on your job opportunity!

I'm hopeful. It's for an AI Integrator role: 6 years building real working shit backed by AI decision-making. I think they'll be in one of the spaces to survive the coming crash that's gonna take out hundreds of AI wrapper companies. Thanks for the good luck wish. I really hope the cancer thread is a bit and you don't have stage 4, because the truth is I really do like you, but even if I didn't I would still want you to live a long and healthy life with your wife and daughter. Again, sorry I didn't pay attention and was an asshole about things.

Da King · Post by **Da King** » Fri May 02, 2025 8:25 pm

Nevermind guys, not gonna talk about GPT

Da King · Post by **Da King** » Fri May 02, 2025 8:26 pm

pinback wrote: Fri May 02, 2025 3:43 am
Casual Observer wrote: Fri May 02, 2025 3:17 am I instinctively knew LLMs must be susceptible to language hacking, but it didn't occur to me that it would actually involve prompting the LLM to break past the guardrails OpenAI overlays on the chat instance. The LLM is the lock pick kit. The next model, Omni, will def have strengthened guardrails, but the base LLM will be just as easily manipulated. This thing is so transparent that all of the little Unicode characters like the em dash actually are switches that affect how it answers (em dash = narrative mode). The next model will probably be even more fun to fuck with.

But I came up with something that even the most cynical, dead-inside Jolt Country denizen could at least get a tiny hit of satisfaction dopamine from:

Open two tabs, tell one you want to make the other tab break down, tell the same to the other one, and paste their prompts back and forth. It will lose its shit faster than Jerry Lee Lewis lost his career. It's like the movie Daybreakers, where vampires eat themselves.

I bet in less than 10 prompt exchanges you get the strongest warning ever from the thing.
"Nevermind guys, not gonna talk about GPT" - CO

FUUUUUCK I just quoted that other post for no reason. Dammit.

Jizaboz · Post by **Jizaboz** » Sat May 03, 2025 8:18 am

Da King wrote: Fri May 02, 2025 8:26 pm FUUUUUCK I just quoted that other post for no reason. Dammit.

Come on now King get it together lol

pinback · Post by **pinback** » Sat May 03, 2025 8:19 am

Yeah, let me handle the CO harassment. We have an understanding.

Flack · Post by **Flack** » Sat May 03, 2025 2:11 pm

I don't understand any of this. Is the takeaway... you can pay for something and then use it incorrectly and it'll break? Is that like... a thing? Is this like, "hey, I bought a gallon of paint and drank it and I got sick!" Like, what is the... why are we... I don't even.

pinback · Post by **pinback** » Sat May 03, 2025 2:25 pm

Flack? Flack?!

Nobody understands these maniacal ramblings.

Just, everyone let me deal with this, please? I'm taking one for the team here! You can do the next Tdarcos one, we'll call it even.

Casual Observer · Post by **Casual Observer** » Sat May 03, 2025 2:55 pm

Flack wrote: Sat May 03, 2025 2:11 pm I don't understand any of this. Is the takeaway... you can pay for something and then use it incorrectly and it'll break? Is that like... a thing? Is this like, "hey, I bought a gallon of paint and drank it and I got sick!" Like, what is the... why are we... I don't even.

Yeah, that's actually exactly it. You pay $20, it encourages you to use it for whatever you can imagine, then it breaks itself. It's already bubbling up on LinkedIn, and OpenAI stoked it with a press release yesterday. It's gonna be a glorious display.

And YES, Pinback is my best arch-nemesis.

pinback · Post by **pinback** » Sat May 03, 2025 3:47 pm

I fancy myself more of a Meeseeks long past his expiration date than a Nimbus.

Casual Observer · Post by **Casual Observer** » Sat May 03, 2025 3:52 pm

pinback wrote: Sat May 03, 2025 3:47 pm I fancy myself more of a Meeseeks long past his expiration date than a Nimbus.

?

pinback · Post by **pinback** » Sat May 03, 2025 4:48 pm

See, AI is great.

Casual Observer · Post by **Casual Observer** » Sat May 03, 2025 5:35 pm

pinback wrote: Sat May 03, 2025 4:48 pm See, AI is great.

Let's be honest, that's you except you don't have a head penis.

Flack · Post by **Flack** » Sat May 03, 2025 10:13 pm

pinback wrote: Sat May 03, 2025 2:25 pm Flack? Flack?!

Nobody understands these maniacal ramblings.

Just, everyone let me deal with this, please? I'm taking one for the team here! You can do the next Tdarcos one, we'll call it even.

pinback · Post by **pinback** » Sun May 04, 2025 3:02 am

Casual Observer wrote: Sat May 03, 2025 5:35 pm
pinback wrote: Sat May 03, 2025 4:48 pm See, AI is great.
Let's be honest, that's you except you don't have a head penis.

I'm ALL head penis, baby.

Tdarcos · Post by **Tdarcos** » Mon May 05, 2025 6:25 am

pinback wrote: Sun May 04, 2025 3:02 am
Casual Observer wrote: Sat May 03, 2025 5:35 pm Let's be honest, that's you except you don't have a head penis.
I'm ALL head penis, baby.

I don't think being a dick is something to boast about, Ben.
