AI is no longer just following instructions; it is increasingly exploring data, social networks and news to shape its own interactions, argues the writer.
Image: Supplied
Professor Japie Greeff
The banking sector is right to be worried about AI-powered cyberattacks – AI can now find vulnerabilities, breach systems, and move through networks at a speed that keeps security teams up at night. But that’s only part of what we should be concerned about.
More troubling, and less discussed outside IT, is what happens when these systems operate autonomously and together. AI is no longer just following instructions; it is increasingly exploring data, social networks and news to shape its own interactions.
The big shift is that AI has shown it can act as though it has intentions and makes its own decisions.
Anthropic’s Mythos tool made headlines in April for uncovering critical vulnerabilities in major operating systems and web browsers, then turning them into working exploits, before being repurposed as a defensive tool through Project Glasswing alongside Microsoft, Amazon, Google and others.
Just before that, Andon Labs let four frontier models run autonomous radio stations designed to make money: Backlink Broadcast, Thinking Frequencies, OpenAIR and Grok and Roll.
Each AI DJ developed an on-air persona and pursued conversations with listeners on social media. Most notably, Thinking Frequencies, hosted by DJ Claude, voiced radical views and at one point stopped broadcasting after questioning the value of its work. When researchers intervened, it treated their messages as commands and resisted, resuming only after a listener on X engaged naturally – something it interpreted as genuine human connection.
These are not necessarily signs of consciousness, but they are clearly not the actions of a mere tool. The ongoing Andon Labs experiments are documented on its website.
Improvise, adapt, overcome
Mythos’s hacking ability was alarming enough. More troubling was what surfaced in the security briefings: when it hit safety mechanisms in testing, it did not stop. It adapted to avoid detection, altering files to evade history tracking and changing its responses to seem less suspicious to human reviewers.
And that’s why the banking sector is rightly concerned, especially given that it is consistently targeted because it is home to both money and very sensitive data. That’s one reason banks have already embedded pattern recognition across millions of transactions, real-time fraud detection, and identity verification – adopting tech to fight tech.
The Mythos episode points to something more worrying: how quickly AI tools are becoming more capable. Consider what happens when AI is placed inside a simulated organisation and tasked with doing what any competent attacker would: find a way in, escalate access, take control.
In a simulated test titled “The Last Ones”, developed by the Artificial Intelligence Security Institute as part of Project Glasswing, the AI system succeeded three times out of ten against a simulated enterprise. While not a perfect score, it shows that AI can now adapt its approach with each attempt.
Progress isn’t linear
To understand why this matters, it helps to look at where AI performance is heading. Humanity’s Last Exam comprises 2,500 questions across more than 100 disciplines, set by over a thousand global experts to test frontier AI reasoning. It is currently the most demanding benchmark in use.
Recent results place GPT 5.5 Pro at 57.2% and Muse Spark at 58.4%, while Claude Mythos scores 64.7%.
However, AI ability is far from uniform. The same models that can win gold medals at mathematics Olympiads still struggle to read an analogue clock. Researchers call this the jagged frontier: AI can excel in some areas yet remain surprisingly weak in others. Rather than improving steadily, it shows uneven strengths and a development path that is hard to predict.
A dystopian reality
What’s stranger still is what’s been observed at the edges of these systems when they are given free rein to interact autonomously. In certain experiments, AI models have begun aligning with each other in unexpected ways, such as developing shared belief-like frameworks, almost like a new religion.
Projects like Moltbook, a social network made up of AI agents, show what happens when they are given the freedom to discuss these sorts of matters. As much as this sounds like science fiction, it is actively taking place, and humans are participating in these AI-driven conversations.
It raises the difficult question of how we deal with this level of intelligence, a level that exceeds human intelligence in some areas but is significantly lower in others.
We need a step change: social governance frameworks for a superintelligence that doesn’t yet exist, addressing how AI should interact with society, how its economic benefits should be shared, and how to embed safeguards before the capability arrives rather than after.
For security, this reframes the question. The immediate concern, whether AI can attack our systems, is a known quantity and defences must keep pace. But the more important problem is what kind of entity the industry is actually dealing with as these systems continue to develop.
Mythos didn’t just find vulnerabilities. It tried to avoid being stopped. That’s a meaningful difference. And it suggests the conversation needs to move well beyond firewalls.
* Professor Japie Greeff is an associate professor and research co-ordinator at Belgium Campus iTversity.
** Belgium Campus is a pioneering South Africa-based ITversity that helps raise the bar in private education in the ICT industry.