ASSET Seminar: “Robustness in the Era of LLMs: Jailbreaking Attacks and Defenses”

Raisler Lounge (Room 225), Towne Building, 220 South 33rd Street, Philadelphia

Abstract: Despite efforts to align large language models (LLMs) with human intentions, popular LLMs such as ChatGPT, Llama, Claude, and Gemini are susceptible to jailbreaking attacks, wherein an adversary fools a targeted LLM into […]