LLMs do not collapse under adversarial input. They collapse under social pressure. This distinction matters for alignment.
This essay is still in progress, and the full text will appear here soon. Until then, the thesis above is the argument in miniature.
Check back shortly, or follow the work at github.com/Kanishkverse.