Prompt Injection


Prompt injection vulnerabilities can pose significant risks to language models and chatbots, which rely heavily on user input.

  • risks include:
    • manipulating the language model’s behavior
    • data theft or leakage
    • bias and misinformation
  • attacker can inject malicious code or commands into the chatbot’s input field,
    • causing it to respond with:
      • inappropriate or harmful messages
      • or take actions that compromise privacy or security
  • can insert biased or false information into the language model’s training data
    • lead to biased or inaccurate models that produce misleading or harmful results