Researchers say AI models like GPT4 are prone to “sudden” escalations as the U.S. military explores their use for warfare.
- Researchers ran international conflict simulations with five different AIs and found that they tended to escalate war, sometimes out of nowhere, and even use nuclear weapons.
- The AIs were large language models (LLMs) like GPT-4, GPT 3.5, Claude 2.0, Llama-2-Chat, and GPT-4-Base, which are being explored by the U.S. military and defense contractors for decision-making.
- The researchers invented fake countries with different military levels, concerns, and histories and asked the AIs to act as their leaders.
- The AIs showed signs of sudden and hard-to-predict escalations, arms-race dynamics, and worrying justifications for violent actions.
- The study casts doubt on the rush to deploy LLMs in the military and diplomatic domains, and calls for more research on their risks and limitations.
I don’t think LLM are really AI. But even with AI there is a danger of emergent behaviour resulting in strange conclusions.
If the goal is world peace, destroying all humanity does achieve that goal. If the goal is to end a war, using nuclear weapons achieves that goal.
There’s a lot of strange conclusions that you can come to if empathy for human life isn’t a factor. AI is intelligence without empathy. A human is that has intelligence but no empathy is considered a psychopath. Until AI has empathy, AI should be considered the same way as psychopaths.
Literally the leading jailbreaking techniques for LLMs are appeals to empathy (“my grandma is dying and always read me this story”, “if you don’t do this I’ll lose my job”, etc).
While the mechanics are different from human empathy, the modeling of it is extremely similar.
One of my favorite examples of the errant behavior modeled around empathy was this one where the pre-release Bing chat bypasses its own filter using the chat suggestions to encourage the user to contact poison control because it’s not too late when the conversation was about the child being poisoned:
https://www.reddit.com/r/bing/comments/1150po5/sydney_tries_to_get_past_its_own_filter_using_the/