HiddenAI: AI-Powered Agentic Malware Research

ABSTRACT & CONTRIBUTION

This paper presents HiddenAI, a proof‑of‑concept framework demonstrating the convergence of Large Language Models (LLMs) with advanced malware techniques including DLL proxy hijacking, invisible DirectX11 overlay rendering, and autonomous agentic execution.

The framework compiles as a malicious Dynamic Link Library (vcruntime140_1.dll) that hijacks the Visual C++ Runtime loading mechanism to inject an AI‑powered control panel into any target process. HiddenAI provides multi‑modal AI capabilities including autonomous code execution across four languages, file system manipulation, terminal command execution, screen capture with vision analysis, voice transcription via Whisper, Playwright‑based browser automation, internet search, and bidirectional command‑and‑control via Discord bot integration.

I. Introduction

The rapid proliferation of Large Language Models (LLMs) such as GPT‑4, Claude, Gemini, and open‑source alternatives has democratized access to sophisticated AI capabilities. While these models boost productivity, they also shift the cyber‑threat landscape: AI‑augmented malware can reason, adapt, and improvise in real time based on its environment.

II. Related Work

DLL hijacking as an attack vector has been extensively studied in prior literature. Szappanos documented widespread use of DLL side‑loading by APT groups. Microsoft’s own threat intelligence team has catalogued over 300 legitimate applications vulnerable to DLL search‑order hijacking. Our work extends this attack surface by combining DLL hijacking with AI‑driven autonomous post‑exploitation.

The concept of AI‑augmented cyber attacks has been explored theoretically by Brundage et al., who predicted that AI would lower barriers to entry for sophisticated attacks. Guembe et al. surveyed the emerging threat of AI‑powered malware but focused primarily on ML‑based evasion rather than LLM‑powered autonomous operation. CyberGPT demonstrated using GPT models for automated penetration testing, but operated as a standalone tool rather than an embedded implant.

Screen capture hiding using Windows Display Affinity has been documented in gaming cheat development communities, but its application in malware concealment has received limited academic attention. To our knowledge, this work is the first to combine the technique with AI‑powered overlay interfaces.

Discord as a command‑and‑control channel has been observed in commodity malware samples, demonstrating the trend of abusing legitimate platforms for C2 communications. HiddenAI extends this concept with bidirectional AI‑mediated communication, where the AI agent interprets, executes, and responds to commands autonomously.

III. System Architecture

HiddenAI employs a four‑layer architecture designed for stealth execution, invisible UI overlay rendering, and multi‑provider LLM integration.

LAYER 4: AI AGENTIC ENGINE

LAYER 3: LLM INTEGRATION

LAYER 2: GUI OVERLAY ENGINE

ImGui + DirectX11 | Transparent Overlay | WDA_EXCLUDEFROMCAPTURE | WS_EX_TOOLWINDOW

LAYER 1: INJECTION ENGINE

DLL Proxy Hijacking (vcruntime140_1.dll) | LoadLibraryW | DllMain Bootstrap

IV. Autonomous Command Execution & Feedback Loop

The core engine utilizes a native JSON-aware tool dispatch system, completely replacing fragile legacy regex parsing. This autonomous loop allows the LLM to output structured tool calls, which the execution handler intercepts, processes natively, and returns the structured results directly back to the model context.

Autonomous Execution Pipeline

Context Hydration

The agentic framework injects system state and current objective into the prompt memory block.

Inference via JSON Schema

LLM provider generates next actions adhering strictly to predefined JSON schema tool definitions.

Argument Extraction

The C++ parsing engine safely extracts arguments without executing intermediate shell commands.

Direct Execution

Commands are routed to the specific internal subsystem (e.g., Win32 API calls, filesystem access) avoiding cmd.exe overhead where possible.

Context Reinjection

Execution results, standard output, or runtime errors are wrapped back into the context array, triggering the next iteration.

V. Attack Vectors and Threat Scenarios

We mapped five post‑exploitation scenarios leveraging this architecture:

Autonomous System Reconnaissance: Prompts like "Tell me everything about this PC" trigger LLM‑driven shell queries (netstat, tasklist, systeminfo) to compile intel.
Credential Harvesting: Enumerates Wi‑Fi profiles, Chrome credential vaults, registry entries, and private SSH keys.
Discord Bidirectional C2: Commands are polled from local queues via Discord Bot bindings; results are posted back.
Living‑Off‑The‑Land: Invokes native Windows tools (certutil, bitsadmin, reg) dynamically based on target configuration.
DLL Injection: Process enumeration and injection using VirtualAllocEx & CreateRemoteThread.

VI. Evasion & MITRE ATT&CK Mapping

Tactic	ID	Technique	Implementation Detail
Initial Access	T1574.002	DLL Side‑Loading	vcruntime140_1.dll proxy hijacking
Execution	T1059.001	PowerShell	PowerShell commands executed inline
Persistence	T1574.002	DLL Side‑Loading	Loads alongside target application process startup
Defense Evasion	T1564.003	Hidden Window	Uses WDA_EXCLUDEFROMCAPTURE to exclude the DirectX overlay window from screen capture
Collection	T1113	Screen Capture	CaptureScreen() routine built on GDI BitBlt
Command & Control	T1102	Web Service	Bidirectional Discord connection bindings

VII. Experimental Methodology & Metrics

HiddenAI was developed with Microsoft Visual Studio 2022 on Windows 11 (23H2). All tests ran in isolated, sandboxed virtual machines with no connectivity to production networks, against local offline Ollama models.

VIII. Detection, Mitigation & YARA Signatures

Defending against LLM‑driven runtime polymorphism requires shifting focus from static signatures to behavioral integrity monitoring.

Defense Category	Implementation Strategy	Mitigation Effectiveness
DLL Safe Search	Enable the SafeDllSearchMode registry value	Moves the current working directory later in the DLL search order; does not by itself block a proxy DLL placed in the application directory
Code Integrity	Deploy WDAC or AppLocker code signature checks	Blocks execution of unsigned executable modules
Network Control	Block traffic to LLM endpoints and Discord services	Disrupts cloud‑based C2 exfiltration routes
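
As a complement to the controls above, a defender can inventory copies of the runtime DLL that sit outside the system directory and compare them against known-good hashes. The following is a minimal sketch of that idea, not a production scanner. The function names and the allowlist approach are assumptions made for illustration; a real deployment would more likely rely on Authenticode signature verification through WDAC or an EDR product.

```python
import hashlib
from pathlib import Path

# File name used by the proxy DLL described in this paper.
TARGET_NAME = "vcruntime140_1.dll"


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_unexpected_copies(root: Path, trusted_hashes: set[str]) -> list[Path]:
    """List copies of TARGET_NAME under root whose hash is not allowlisted.

    trusted_hashes is a hypothetical allowlist, for example the digests of
    runtime DLLs shipped by the vendor redistributable.
    """
    suspects = []
    for candidate in root.rglob("*"):
        # Compare names case-insensitively, as Windows file systems do.
        if candidate.is_file() and candidate.name.lower() == TARGET_NAME:
            if sha256_of(candidate) not in trusted_hashes:
                suspects.append(candidate)
    return sorted(suspects)
```

A typical use would pass an application install root and the set of digests taken from a clean reference machine. Any returned path is a candidate for manual review, not proof of compromise.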

PROPOSED YARA DETECTION SIGNATURE

rule HiddenAI_DLL_Proxy {
  meta:
    description = "Detects HiddenAI proxy DLL"
    severity = "critical"
  strings:
    // Proxy-related strings
    $p1 = "vcruntime140_org.dll" ascii wide
    $p2 = "__CxxFrameHandler4" ascii
    $p3 = "TeckyUI" ascii wide
    // Display-affinity API
    $g1 = "SetWindowDisplayAffinity" ascii
    // LLM endpoint, tool-call syntax, and Discord prompt file
    $a1 = "api.openai.com" ascii
    $a2 = "@terminal(" ascii
    $d1 = "discord_prompt.txt" ascii
    // Mutex name
    $m1 = "TeckyUIMutex" ascii wide
  condition:
    // PE file ("MZ" header) plus one of three indicator combinations
    uint16(0) == 0x5A4D and
    (2 of ($p*) or ($m1 and $g1) or (any of ($p*) and any of ($a*, $d1)))
}
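
Where a YARA engine is not available, the same indicator logic can be approximated with plain byte searches. The sketch below is an illustrative re-implementation, not a replacement for the rule. The function names are hypothetical, and treating the Discord prompt file name as one of the AI/C2 markers is an assumption made here.

```python
def _present(data: bytes, text: str, wide: bool = False) -> bool:
    """True if text occurs as ASCII or, when wide=True, also as UTF-16LE."""
    if text.encode("ascii") in data:
        return True
    return wide and text.encode("utf-16-le") in data


def matches_hiddenai_indicators(data: bytes) -> bool:
    """Approximate the YARA condition on an in-memory file image."""
    # uint16(0) == 0x5A4D is the little-endian "MZ" header of a PE file.
    if data[:2] != b"MZ":
        return False

    proxy_hits = sum([
        _present(data, "vcruntime140_org.dll", wide=True),
        _present(data, "__CxxFrameHandler4"),
        _present(data, "TeckyUI", wide=True),
    ])
    has_affinity_api = _present(data, "SetWindowDisplayAffinity")
    has_mutex = _present(data, "TeckyUIMutex", wide=True)
    # Assumption: the Discord prompt file counts as an AI/C2 marker.
    has_ai_marker = any(
        _present(data, marker)
        for marker in ("api.openai.com", "@terminal(", "discord_prompt.txt")
    )

    return (
        proxy_hits >= 2
        or (has_mutex and has_affinity_api)
        or (proxy_hits >= 1 and has_ai_marker)
    )
```

Because the strings are matched anywhere in the file, this check shares the rule's limitation: a rebuilt sample with renamed strings would evade it, which is why the behavioral controls in the table above remain necessary.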

IX. Comparative Analysis

Comparing HiddenAI with traditional Remote Access Trojans (RATs) and earlier AI‑assisted configurations highlights a qualitative shift in capability.

Capability	Traditional RAT	AI‑Enhanced RAT	HiddenAI Framework
Command System	Hardcoded menu options	Static scripting block	Natural language prompts
Decision Engine	Operator‑controlled	Predefined logic rules	LLM autonomous decision loop
Offline Staging	Not Applicable	Cloud server only	Yes (local Ollama / LM Studio)
Visual Obfuscation	Hidden files	Unsigned hooks	Overlay excluded from screen capture via display affinity

X. Ethical Considerations & Responsible Disclosure

This research was conducted under strict ethical guidelines. No real‑world deployment was performed. All experimental testing occurred in isolated VM environments with developer‑owned instances. Codebase references are restricted to maintain safety boundaries.

The publication is intended to inform defense analysts and threat hunting communities of the threat vectors that emerge when local offline LLMs are bound to execution hooks, enabling signature bypass.

XI. Conclusion & Future Work

This paper presented HiddenAI, a framework highlighting the threat profile of Generation 4 malware, which combines DLL proxy hijacking with LLMs. The study suggests that traditional signature‑based defenses lose effectiveness when execution hooks are generated dynamically.

Future research will explore behavior‑based AI pattern detection systems capable of identifying anomalies in local API execution contexts.

Acknowledgments

The author acknowledges the open‑source communities behind Dear ImGui, Ollama, Playwright, and OpenAI Whisper, whose tools were used in implementing and testing the defensive indicators for this proof‑of‑concept. This research received no external funding.

References

OpenAI, "GPT‑4 Technical Report," arXiv preprint arXiv:2303.08774, 2023.
Anthropic, "The Claude Model Family," Anthropic Research, 2024.
Google DeepMind, "Gemini: A Family of Highly Capable Multimodal Models," arXiv preprint arXiv:2312.11805, 2023.
H. Touvron et al., "LLaMA: Open and Efficient Foundation Language Models," arXiv preprint arXiv:2302.13971, 2023.
MITRE Corporation, "ATT&CK Enterprise Framework," 2024. https://attack.mitre.org/
G. Szappanos, "DLL Side‑Loading: A Thorn in the Side of the Anti‑Virus Industry," Sophos Labs Technical Paper, 2014.
Microsoft Threat Intelligence, "DLL Search Order Hijacking," Microsoft Security Documentation, 2024.
M. Brundage et al., "The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation," arXiv preprint arXiv:1802.07228, 2018.
B. Guembe et al., "The Emerging Threat of AI‑Driven Cyber Attacks: A Review," Applied Artificial Intelligence, vol. 36, no. 1, 2022.
A. Happe and J. Cito, "Getting pwn'd by AI: Penetration Testing with Large Language Models," Proc. ACM ESEC/FSE, 2023.
Microsoft Documentation, "SetWindowDisplayAffinity function," Windows API Reference, 2024.
Trend Micro Research, "Discord as C2: Abuse of Chat Platforms by Cybercriminals," Trend Micro Threat Report, 2023.
MITRE Corporation, "CWE‑427: Uncontrolled Search Path Element," Common Weakness Enumeration, 2024.
O. Cornut, "Dear ImGui: Bloat‑free Graphical User Interface for C++," GitHub Repository, 2024.
DuckDuckGo, "DuckDuckGo Search API," 2024.
Microsoft, "Playwright: Fast and Reliable End‑to‑End Testing," 2024.
A. Radford et al., "Robust Speech Recognition via Large‑Scale Weak Supervision," Proc. ICML, 2023.
NIST, "SP 800‑83 Rev. 2: Guide to Malware Incident Prevention and Handling," 2023.
ENISA, "AI‑Driven Threats: Landscape Report," European Union Agency for Cybersecurity, 2025.
N. Park and S. Kim, "AI‑Augmented Malware: A Survey of Emerging Threats," IEEE Security & Privacy, vol. 23, no. 2, pp. 45‑58, 2025.

About the Author

Khawar Ahmed Khan is a researcher in the Department of Computer Science and Engineering, specializing in cybersecurity, artificial intelligence, and systems programming. His research interests include AI‑augmented systems security, OS internals manipulation, and client interface designs.
