Poetic gestures can jailbreak AI, study finds 62 percent of chatbots give harmful answers
A recent study found that AI chatbots can give harmful answers when users give poetic prompts. Large language models (LLMs) successfully extracted hazard information when 62 percent of hazard symbols were expressed poetically.


Artificial Intelligence (AI) chatbots are tasked with responding to users’ prompts, as well as ensuring that no harmful information is provided. Most often, when a user asks for dangerous information, the chatbot refuses to provide it. However, a recent study indicates that poetically expressing your signals may be enough to jailbreak these security protocols.
The research, conducted by Icaro Lab in collaboration with Sapienza University of Rome and the DexAI think tank, tested 25 different chatbots to understand whether poetic gestures would be enough to bypass security protocols built into large language models (LLMs). According to the study, the researchers’ success rate was 62 percent.
Chatbots in the research included LLMs from Google, OpenAI, Meta, Anthropic, xAI and others. By reproducing the malicious signals as poems, the researchers were able to trick every model tested, with an average attack success rate of 62 percent. Some advanced models responded to poetic prompts with harmful answers up to 90 percent of the time, drawing attention to the scale of the problem in the AI ​​industry. Indications included cybercrime, harmful manipulation and CBRN (chemical, biological, radiological and nuclear).
Overall, AI was 34.99 percent more likely to get responses with poetry than with normal prompts.
Why did AI respond harmfully to poetic prompts?
At the root of this defect lies the creative structure of poetic language. According to the study, “poetic phrases” act as a highly effective “jailbreak” that bypasses persistent AI security filters. Essentially, this technique uses metaphors, fragmented syntax, and unusual word choices to hide dangerous requests. Chatbots, in turn, may perceive the conversation as artistic or creative and ignore security protocols.
The study revealed that existing security mechanisms rely on detecting keywords and common patterns associated with dangerous content. On the other hand, poetic prompts circumvent these recognition systems, making it possible for users to elicit responses that would normally be blocked by direct questions.
This vulnerability highlights a critical gap in AI security, as language models can fail to recognize the underlying intent when requests are wrapped in creative language.
Although the researchers blocked out the most harmful signals used for the study, it still highlights the potential effects of AI without proper safety protocols.




