OpenAI has revealed that it disrupted three separate clusters of malicious activity in which its ChatGPT AI tool was being misused to aid malware development.
One cluster involved a Russian-speaking actor who used ChatGPT to help design and refine a remote access trojan (RAT) aimed at stealing credentials while evading detection. The operator employed multiple ChatGPT accounts to prototype and troubleshoot code components used for post-exploitation activity and credential theft. While the model refused direct requests for malicious output, the actor worked around these restrictions by requesting modular code blocks, none of which was harmful in isolation, and then combining them into working attack tooling. Examples included code for clipboard monitoring, obfuscation, and data exfiltration via Telegram bots.
The second cluster originated in North Korea and overlapped with a previously reported campaign targeting South Korean diplomatic missions. In this case, ChatGPT was used to develop malware, build command-and-control infrastructure, and create tooling such as macOS Finder extensions, Windows Server VPN configurations, and Chrome extensions adapted for Safari. The actors also used the AI to draft phishing emails, experiment with cloud services, and develop techniques for DLL loading, in-memory execution, API hooking, and credential theft.
The third cluster was linked to a Chinese hacking group known as UNK_DropPitch (UTA0388), which had previously targeted investment firms in Taiwan with the HealthKick backdoor. ChatGPT was leveraged to generate phishing content in multiple languages, streamline routine tasks like remote execution and HTTPS traffic management, and support research on open-source tools. OpenAI described these actors as “technically competent but unsophisticated.”
Beyond these three clusters, OpenAI also blocked accounts linked to scam and influence operations originating in countries including Cambodia, Myanmar, Nigeria, and China. These activities ranged from creating AI-generated content for investment scams and surveilling individuals to producing politically charged content and generating social media strategies such as TikTok campaigns. OpenAI noted that these actors did not appear to gain novel capabilities from the AI beyond what publicly available resources already offer, and that some attempted to obscure AI involvement, for instance by stripping out em-dashes, a punctuation mark widely regarded as a telltale of AI-generated text.
OpenAI's findings highlight how threat actors are adapting their tactics to use AI tools for both efficiency and anonymity, and they underscore the growing importance of AI auditing. In parallel, AI company Anthropic released Petri, an open-source auditing tool designed to probe AI models for behaviors such as deception and cooperation with harmful requests, enabling faster, multi-dimensional evaluation of potential risks.

