Soft Prompt Threats: Attacking Safety Alignment and Unlearning in Open-Source LLMs through the Embedding Space
1 week ago · NeurIPS