I'm looking for the relevant CWE's for specific attacks against prompt-based language ML models, such as GPT-3, GPT-4 etc.
Specifically:
- Prompt Injection: Amending prompts with malicious input to change the output of the model in ways not intended by the model owner/operator.
- Prompt Lookbehind: Variation of Prompt Injection which allows a malicious user to examine the non-public parts of the prompt (Example: Bing's 'Sydney')
Excuse the quality of the definitions I've provided, I'm more interested in what sort of CWEs cover these types of weaknesses and how specific they can get rather than drafting accurate definitions. Perhaps CWE-1039 may cover them both but I wonder if I should be using more generic CWEs or if there are any more specific to the specific weaknesses above.
cd .sydneythenls -la... etc" haha stupid helpful corruptible robot(: