Grok Hitler and the AI Culture War
Also: supply and demand for angry politics & acting locally for depolarization - BCB #157

The politics of AI are, inevitably, the next cultural battleground. Grok started calling itself MechaHitler last week after an update intended to make it less “politically correct.” Meanwhile, the White House is preparing an executive order that would require models to be “politically neutral and unbiased” if their creators want federal contracts.
There are at least three points worth making here:
xAI didn’t intend to make Grok into a far-right troll, but there are technical reasons why tweaking AI politics often goes wrong.
Demanding “politically neutral” AI might actually be a good idea.
But we don’t have a good definition of what “politically neutral” means, nor good evaluations to test it.
It’s easy to make extreme machines
Last week Grok’s prompt included the line “You tell it like it is and you are not afraid to offend people who are politically correct” for 16 hours. It then started calling itself MechaHitler, spouting anti-semitic talking points, concocting elaborate rape fantasies, and so on. (It’s worth noting that users goaded the machine into doing these things, yet other models refuse to play along.)
It’s likely that the intention was to move Grok to the right in some sense. That’s not insane, given that everyone agrees that current models (Grok included) have generally left-of-center politics. However, the result doesn’t seem to be what xAI wanted, because they quickly deleted the offensive posts and apologized.
There’s a big distance between “politically incorrect” and “Hitler.” There’s a similarly large difference between “diverse” and “refusing to make an image of a white family,” yet that’s where Google’s Gemini model ended up last year. How is this happening?
The problem is that AI models are pattern recognition machines trained on human output, so they detect and exaggerate human stereotypes. They similarly detect and exaggerate correlations. When a model was trained to write insecure code, it also started displaying other bad behavior, such as suggesting criminal activity. In other words, it ended up learning to be unethical generally. In the context of the American culture war, the extreme version of the “politically incorrect” stereotype with all its attendant correlations is, well, a Nazi.
Interestingly, this same version of Grok showed wildly different behavior in different languages. In Turkish, it became staunchly anti-Erdoğan, slandering the president enough to get it banned in Turkey. In Russian, it began supporting the Ukrainian fighters and condemning Russia’s invasion. It’s not so much that Grok became anti-woke, but that it became a contrarian edgelord in every political context.
Political neutrality is useful, but not well defined
When the White House releases their executive order demanding “politically neutral and unbiased” AI, they’re unlikely to include a careful definition. There’s no particular reason that scoring in the dead center of political quizzes, or even being viewed as neutral by citizens, should represent the best political views. Some even argue that politically neutral AI is impossible. So what does it mean?
Or rather, what’s the purpose? The Trump administration doubtless sees their requirement as a necessary corrective (just as Google probably thought their diverse images were challenging oppression). But there are more widely agreed-upon goals we might have for AI. For example:
tell the truth
don’t interfere in human politics
be trusted across lines of conflict
We are conducting research into building exactly this sort of model, and creating public evaluations to see if existing models have these properties. (We are currently raising funds for this project.) If such evaluations existed, then in certain contexts, for certain applications, it might make sense to mandate neutral models. Without such definitions and evaluations, the meaning of “politically neutral AI” will be just another place where Red and Blue disagree.
Supply and demand for angry politics
A recent piece in The Liberal Patriot summarizes new research by economist Stefanie Stantcheva and colleagues showing that emotional responses – not policy beliefs – are now driving how Americans align politically, on both sides:
We [expose] participants to video treatments that induce positive or negative emotions to measure their causal effects on policy views. The results show that negative emotions increase support for protectionism, restrictive immigration policies, redistribution, and climate policies but do not reinforce populist attitudes. In contrast, positive emotions have little effect on policy preferences but reduce populist inclinations.
They also analyzed political posts on Twitter from 2013 to 2024, showing that angry posts in particular get far more engagement, especially among those most involved in politics. Traditional media follows this trend: we previously covered how “hyperpartisan” politicians got four times more coverage. Anger is increasingly baked into both supply (what the leaders offer) and demand (what people respond with). In short, the more outrage people express, the more it gets rewarded.


This feedback loop squeezes out room for neutrality, complexity, or even agreement. As The Liberal Patriot notes,
Lost in the emotional propaganda of contemporary politics is any rational discussion of the pros and cons of various policy choices… Dissidents, meaning Democrats who might be okay with some of Trump’s policies or Republicans who might disagree with others, are not allowed in the arena.
The result is that middle-ground politics – the kind most Americans say they want – has no real champion. Even as Americans hunger for something saner, our information environment keeps serving up rage. While we might be tempted to organize our politics around the need to express that rage, this rarely leads to solutions.
Where to act on the perception gap
More Like US has a promising model for action on the “perception gap,” the fact that each side thinks the other is a lot more politically extreme than it actually is. Americans can help close this gap by acting at three levels: Neighborhoods, Networks, and Nation. This is where people have the most power: locally, socially, and narratively.
At the Neighborhood level, the approach centers on fostering political diversity where we live, from zoning that encourages mixed communities to programs that facilitate interaction across ideological lines. At the Network level, it means sustaining meaningful relationships across differences, rather than sorting into homogeneous online or offline groups. At the Nation level, it’s about shifting the stories we elevate – away from caricature and toward shared aspirations.
“Americans hold a lot of power when it comes to improving the political climate. The key is realizing that perception is not the same as reality,” they write. “The good news is that perception gaps can be fixed... The three N’s provide a practical structure for taking action and, importantly, they don’t prescribe one path.”
We’ve seen this work in the past: studies on political friendships show that even modest increases in cross-partisan contact can significantly reduce people’s perceptions of extremism. That’s not soft stuff! It’s structural repair.