HiddenAI: An AI-Powered Agentic Malware Framework Leveraging DLL Proxy Hijacking and Large Language Models for Autonomous System Compromise
Abstract
This paper presents HiddenAI, a proof-of-concept framework demonstrating the convergence of Large Language Models (LLMs) with advanced malware techniques including DLL proxy hijacking, invisible DirectX11 overlay rendering, and autonomous agentic execution. The framework compiles as a malicious Dynamic Link Library (vcruntime140_1.dll) that hijacks the Visual C++ Runtime loading mechanism to inject an AI-powered control panel into any target process. HiddenAI provides multi-modal AI capabilities including autonomous code execution across four languages, file system manipulation, terminal command execution, screen capture with vision analysis, voice transcription via Whisper, Playwright-based browser automation, internet search, and bidirectional command-and-control via Discord bot integration. The AI operates in an agentic feedback loop, autonomously chaining operations without human intervention. Supporting seven LLM providers including fully offline local models, the framework demonstrates that a single natural language instruction is sufficient to trigger complex multi-step attack sequences. We present a comprehensive architectural analysis, map 11 MITRE ATT&CK techniques, evaluate evasion capabilities, and propose detection strategies including YARA signatures. This research serves as a critical warning that traditional signature-based defenses are fundamentally inadequate against AI-agentic threats.
I. Introduction
The rapid proliferation of Large Language Models (LLMs) such as GPT-4 [1], Claude [2], Gemini [3], and open-source alternatives like LLaMA [4] has democratized access to sophisticated artificial intelligence capabilities. While these models offer tremendous benefits for productivity and software development, they simultaneously present a paradigm shift in the cybersecurity threat landscape. Traditional malware requires its creator to anticipate every scenario and hardcode behavioral responses. AI-augmented malware, by contrast, can reason, adapt, and improvise in real-time based on the specific system environment it encounters.
This research investigates a critical and timely question: What happens when an autonomous AI agent with unrestricted system access is embedded inside a malware payload using established evasion techniques? We present HiddenAI, a proof-of-concept framework that answers this question through a fully functional implementation combining DLL proxy hijacking with agentic LLM capabilities.
A. Threat Landscape Evolution
The evolution of malware can be categorized into four generations, as shown in Table I. HiddenAI represents what we term Generation 4 malware: threats that leverage AI not merely as an obfuscation tool but as the autonomous decision-making core of the entire attack chain.
| Generation | Era | Key Characteristics |
|---|---|---|
| Gen 1 | 1990s–2000s | Static payloads, signature-based detection |
| Gen 2 | 2005–2015 | Polymorphic engines, packers, encrypted payloads |
| Gen 3 | 2015–2023 | Fileless attacks, living-off-the-land binaries (LOLBins) |
| Gen 4 | 2024–Present | AI-agentic: NL-driven, autonomous, multi-modal, self-adapting |
B. Contributions
- Design and implementation of a fully functional AI-powered malware PoC combining DLL proxy hijacking with agentic LLM capabilities across seven providers.
- Comprehensive analysis of 12 distinct attack vectors enabled by the AI-agentic architecture, mapped to the MITRE ATT&CK framework [5].
- Demonstration that completely offline AI-agentic malware is feasible using local LLMs, rendering network-based detection ineffective.
- Evaluation of evasion capabilities against static analysis, dynamic sandboxing, and behavioral heuristics.
- Proposed detection and mitigation strategies including YARA signatures and behavioral indicators.
C. Paper Organization
The remainder of this paper is organized as follows: Section II reviews related work. Section III presents the system architecture. Section IV details the agentic AI capabilities. Section V analyzes attack vectors and threat scenarios. Section VI evaluates evasion techniques and maps them to MITRE ATT&CK. Section VII describes the codebase and experimental setup. Section VIII proposes detection and mitigation strategies. Section IX compares HiddenAI with traditional Remote Access Trojans, Section X addresses ethical considerations, and Section XI concludes the paper.
II. Related Work
DLL hijacking as an attack vector has been extensively studied in prior literature. Szappanos [6] documented widespread use of DLL side-loading by APT groups including APT1, APT3, and OceanLotus. Microsoft's own threat intelligence team has cataloged over 300 legitimate applications vulnerable to DLL search-order hijacking [7]. Our work extends this attack surface by combining DLL hijacking with AI-driven autonomous post-exploitation.
The concept of AI-augmented cyber attacks has been explored theoretically by Brundage et al. [8], who predicted that AI would lower barriers to entry for sophisticated attacks. Guembe et al. [9] surveyed the emerging threat of AI-powered malware but focused primarily on machine learning-based evasion rather than LLM-powered autonomous operation. Happe and Cito [10] demonstrated the use of large language models for automated penetration testing, but their system operated as a standalone tool rather than an embedded implant.
Screen capture hiding using Windows Display Affinity has been documented in gaming cheat development communities [11], but its application in malware concealment has received limited academic attention. To our knowledge, our work is the first to combine this technique with AI-powered overlay interfaces.
Discord as a command-and-control channel has been observed in commodity malware samples analyzed by Trend Micro [12], demonstrating the trend of abusing legitimate platforms for C2 communications. HiddenAI extends this concept with bidirectional AI-mediated communication, where the AI agent interprets, executes, and responds to commands autonomously.
III. System Architecture
HiddenAI employs a four-layer architecture designed for stealthy execution, invisible UI overlay rendering, and multi-provider LLM integration.
A. Layer 1: DLL Proxy Hijacking Engine
The injection mechanism exploits uncontrolled DLL search paths on Windows (CWE-427) [13] using proxy DLL hijacking, also known as DLL side-loading (MITRE T1574.002). The framework compiles as `vcruntime140_1.dll`, masquerading as a component of the legitimate Visual C++ 2015–2022 Redistributable. This DLL is chosen because it is loaded by most 64-bit applications built with recent MSVC toolsets and exports only three functions, which keeps the proxy implementation small.
The attack chain proceeds as follows: (1) The attacker places the malicious `vcruntime140_1.dll` adjacent to any target application; (2) The original legitimate DLL is renamed to `vcruntime140_org.dll`; (3) Windows loads the malicious DLL first due to the DLL search order; (4) All legitimate API calls are forwarded to the original DLL (proxy pattern); (5) DllMain simultaneously creates a new thread executing the malware payload.
// proxy/vcruntime.cpp - Export forwarding
#pragma comment(linker,"/export:__CxxFrameHandler4=vcruntime140_org.__CxxFrameHandler4")
RUNTIME_EXPORT size_t RUNTIME_CALL
_CxxFrameHandler4(void* pE, size_t RN, void* pC, void* pD) {
if (g_CxxFrameHandler4)
return g_CxxFrameHandler4(pE, RN, pC, pD);
return 1;
}
// DllMain entry point bootstrapping
BOOL APIENTRY DllMain(HMODULE hModule, DWORD reason, LPVOID) {
switch(reason) {
case DLL_PROCESS_ATTACH:
DisableThreadLibraryCalls(hModule);
proxy::init_runtime(); // Forward to real vcruntime
g_hMutex = CreateMutex(NULL, TRUE, TEXT("Global\\TeckyUIMutex"));
CreateThread(NULL, 0, GuiThread, hModule, 0, NULL);
break;
}
return TRUE;
}
B. Layer 2: Invisible GUI Overlay
This layer renders the interface with Dear ImGui on DirectX 11. It calls `SetWindowDisplayAffinity` with `WDA_EXCLUDEFROMCAPTURE` so that the overlay window is omitted from screenshots, stream captures, and recordings (OBS, Discord, Game Bar) while remaining visible on the physical display.
| Property | Windows API | Effect |
|---|---|---|
| Screen Invisibility | SetWindowDisplayAffinity(0x11) | Invisible to OBS, Discord, Game Bar, screenshots |
| Taskbar Hiding | WS_EX_TOOLWINDOW | Hidden from Alt+Tab switcher and host taskbar |
| Transparency | SetLayeredWindowAttributes(...) | Only active interface pixels are drawn on screen |
| Hotkey Toggle | GetAsyncKeyState(VK_F9) | Instant UI overlay visibility toggle |
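From a defender's perspective, the two properties in the table that matter most (capture exclusion plus a tool-window style) can be read back from running windows. The sketch below is a triage aid only, under stated assumptions: it assumes `GetWindowDisplayAffinity` can be queried for windows owned by other processes, and the names `WindowInfo`, `is_suspicious`, and `enumerate_windows` are hypothetical. Legitimate software such as password managers or DRM-protected players also sets display affinity, so a hit is a prompt for review rather than a verdict.

```python
import sys
from dataclasses import dataclass

WDA_EXCLUDEFROMCAPTURE = 0x00000011  # value listed in the table above
WS_EX_TOOLWINDOW = 0x00000080
GWL_EXSTYLE = -20


@dataclass
class WindowInfo:
    hwnd: int
    affinity: int
    ex_style: int


def is_suspicious(info: WindowInfo) -> bool:
    """Flag windows that are both capture-excluded and hidden from the taskbar."""
    return (info.affinity == WDA_EXCLUDEFROMCAPTURE
            and bool(info.ex_style & WS_EX_TOOLWINDOW))


def enumerate_windows() -> list:
    """Collect affinity and extended style of top-level windows (Windows only)."""
    if sys.platform != "win32":
        return []  # nothing to enumerate on other platforms
    import ctypes
    from ctypes import wintypes

    user32 = ctypes.WinDLL("user32", use_last_error=True)
    enum_proc_type = ctypes.WINFUNCTYPE(wintypes.BOOL, wintypes.HWND, wintypes.LPARAM)
    user32.EnumWindows.argtypes = [enum_proc_type, wintypes.LPARAM]
    user32.GetWindowDisplayAffinity.argtypes = [wintypes.HWND,
                                                ctypes.POINTER(wintypes.DWORD)]
    user32.GetWindowDisplayAffinity.restype = wintypes.BOOL
    user32.GetWindowLongW.argtypes = [wintypes.HWND, ctypes.c_int]
    user32.GetWindowLongW.restype = wintypes.LONG

    found = []

    def callback(hwnd, _lparam):
        affinity = wintypes.DWORD(0)
        if user32.GetWindowDisplayAffinity(hwnd, ctypes.byref(affinity)):
            ex_style = user32.GetWindowLongW(hwnd, GWL_EXSTYLE) & 0xFFFFFFFF
            found.append(WindowInfo(int(hwnd or 0), affinity.value, ex_style))
        return True  # continue enumeration

    cb = enum_proc_type(callback)  # keep a reference alive during the call
    user32.EnumWindows(cb, 0)
    return found
```

A caller would typically run `[w for w in enumerate_windows() if is_suspicious(w)]` and then resolve each handle to its owning process for review.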
C. Layer 3: Multi-Provider LLM Integration
This layer supports cloud API endpoints (OpenAI, Claude, Gemini) alongside local inference servers (Ollama, LM Studio) running offline LLMs. With a local model, the agent can operate without generating outbound WAN traffic, which removes the signal that network-based anomaly detection depends on.
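Offline operation still leaves a host-level trace: a local inference server has to listen somewhere. The following sketch, intended for endpoint hunting, checks whether anything accepts TCP connections on the loopback ports such servers commonly use. The port numbers are assumptions based on the tools' usual defaults (11434 for Ollama, 1234 for LM Studio's local server) and can be changed by whoever runs the server; the function name `open_local_ports` is hypothetical.

```python
import socket

# Assumed default ports; both are configurable by the server operator.
DEFAULT_LOCAL_LLM_PORTS = {11434: "Ollama", 1234: "LM Studio"}


def open_local_ports(ports, host="127.0.0.1", timeout=0.25):
    """Return the subset of `ports` that accept a TCP connection on `host`."""
    open_ports = []
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                open_ports.append(port)
        except OSError:
            # Refused, timed out, or unreachable: treat as closed.
            pass
    return open_ports
```

An open port on a developer workstation is unremarkable; the same finding on a kiosk or a server that has no reason to host a model is worth correlating with the process that owns the socket.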
IV. Agentic AI Capabilities & Feedback Loop
The core engine defines an autonomous feedback loop: the model generates code or commands, an execution handler runs them locally and captures standard output and errors, and the result is returned to the LLM context. The loop proceeds in five steps:
- Step 01: Append the natural language directive (e.g. 'recon system & search for keys') to the memory array.
- Step 02: Query the selected LLM provider (offline Ollama or a cloud API) to determine the next actions.
- Step 03: Extract structured command blocks (@terminal, @create_file, code blocks) from the raw response stream.
- Step 04: If no command blocks were extracted, exit the loop and report. Otherwise, invoke Win32 APIs, execute shells, or read local filesystem directories as requested.
- Step 05: Collect standard output and errors, structure them as a system return, append them to the context, and jump back to Step 02.
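One observable consequence of such a loop is that a single host process spawns shells repeatedly in a short span. The sketch below illustrates how that pattern could be counted from process-creation telemetry (for example, the parent and child image names that Sysmon Event ID 1 records). It is an illustration under assumptions, not a tuned detection: the event tuple layout, the 60-second window, the threshold of five, and the function name `noisy_parents` are all hypothetical.

```python
from collections import defaultdict, deque

SHELLS = {"cmd.exe", "powershell.exe", "pwsh.exe"}


def noisy_parents(events, window_seconds=60.0, threshold=5):
    """Return parents that spawned >= `threshold` shells within one window.

    `events` is an iterable of (timestamp, parent_image, child_image)
    tuples sorted by timestamp; image names are bare file names.
    """
    recent = defaultdict(deque)  # parent -> timestamps of recent shell spawns
    flagged = set()
    for ts, parent, child in events:
        parent, child = parent.lower(), child.lower()
        # Only count shells started by a non-shell parent.
        if child not in SHELLS or parent in SHELLS:
            continue
        stamps = recent[parent]
        stamps.append(ts)
        # Drop spawns that fell out of the sliding window.
        while ts - stamps[0] > window_seconds:
            stamps.popleft()
        if len(stamps) >= threshold:
            flagged.add(parent)
    return flagged
```

Build tools and installers also spawn many shells, so in practice the result would be filtered against an allow-list of expected parents.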
A. Command Set
The AI agent has access to multiple categories of system operations, summarized in Table IV.
| Command Category | Syntax Hook | Capability Description |
|---|---|---|
| Terminal Execution | @terminal("cmd") | Execute any Windows cmd/PowerShell shell command |
| Create File | @create_file("path","data") | Create dynamic local files with arbitrary contents |
| Read File | @read_file("path") | Extract and read local filesystem contents |
| Web Search | @search("query") | DuckDuckGo search wrapper for real-time web querying |
| Browser Scripting | @browse("url") | Playwright-driven Chromium execution and text reading |
| Dynamic Code Exec | Markdown fenced code block (e.g. tagged python) | Python, JS, PowerShell script execution at runtime |
B. Multi-Format Command Parser
The `ProcessAIResponseFeedback()` function implements a robust multi-format parser that recognizes commands in three distinct syntactic formats: (1) XML-style tags such as `<FILE_CREATE path="...">`, (2) function-call annotations such as `@create_file("path", "content")`, and (3) Markdown code blocks.
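For defenders, the three syntactic formats double as content indicators that can be searched for in strings extracted from a binary, in captured prompts, or in proxy logs. The following is a minimal sketch of such a scan. The exact grammar of the real parser is not given in this paper, so the regular expressions are approximations built from the command names in Table IV and the `<FILE_CREATE ...>` example above; `HOOK_PATTERNS` and `find_hooks` are hypothetical names.

```python
import re

FENCE = "`" * 3  # Markdown code fence, built indirectly to keep this listing readable

# Approximate patterns for the three formats; the real grammar is an assumption.
HOOK_PATTERNS = {
    "function_call": re.compile(
        r"@(terminal|create_file|read_file|search|browse)\s*\("),
    "xml_tag": re.compile(r"<FILE_CREATE\b[^>]*>"),
    "code_block": re.compile(
        FENCE + r"(python|javascript|js|powershell)\b", re.IGNORECASE),
}


def find_hooks(text):
    """Return {format_name: match_count} for each format found in `text`."""
    counts = {}
    for name, pattern in HOOK_PATTERNS.items():
        matches = len(pattern.findall(text))
        if matches:
            counts[name] = matches
    return counts
```

Fenced code blocks are common in benign LLM output, so the `code_block` count is only meaningful alongside the other two formats.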
C. Multi-Modal Vision & Voice Input
Screen capture uses GDI `BitBlt` to grab screen bitmaps, which are encoded and added to the LLM context. Audio recording uses the Windows MCI API (`mciSendStringA`) with no third-party dependencies, and the recordings are transcribed by a local Whisper model.
V. Attack Vectors and Threat Scenarios
We mapped five post-exploitation scenarios leveraging this architecture:
- Autonomous System Reconnaissance: In response to prompts like "Tell me everything about this PC," the AI chains shell queries (netstat, tasklist, systeminfo, net user) to compile detailed intelligence documents.
- Credential Harvesting: Enumerates WiFi profiles, queries Chrome credential vaults, registry entries, and lists private SSH keyfiles.
- Discord Bidirectional C2: A Discord bot writes incoming commands to local queue files that the agent polls, and screenshots and files are sent back through Discord channels.
- Living-Off-The-Land: Invokes Windows default tools (certutil, bitsadmin, reg) dynamically based on target configuration.
- DLL Injection: Process enumeration and injection using VirtualAllocEx & CreateRemoteThread.
| LOLBin | Abuse / Purpose |
|---|---|
| certutil.exe | Download malicious configurations, Base64 encode/decode local files |
| powershell.exe | Execute run-time memory scripts, bypass configuration validation |
| bitsadmin.exe | Perform background payload staging and stealth retrieval |
| reg.exe | Modify registry run keys to install persistence triggers |
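The abuse patterns in the table correspond to well-known command-line shapes, so they can be matched in process-creation logs even when the surrounding script is different every time. The sketch below shows one way to label a command line. The rule set is deliberately small and illustrative, and the names `RULES` and `classify` are hypothetical; production detections would need broader coverage and allow-listing.

```python
import re

# Each rule pairs a label with a case-insensitive pattern over the command line.
RULES = [
    ("certutil download",
     re.compile(r"\bcertutil(\.exe)?\b.*[-/]urlcache\b", re.I)),
    ("certutil decode/encode",
     re.compile(r"\bcertutil(\.exe)?\b.*[-/](decode|encode)\b", re.I)),
    ("bitsadmin transfer",
     re.compile(r"\bbitsadmin(\.exe)?\b.*/transfer\b", re.I)),
    ("registry run key",
     re.compile(r"\breg(\.exe)?\s+add\b.*\\CurrentVersion\\Run", re.I)),
]


def classify(command_line):
    """Return the labels of every rule that matches `command_line`."""
    return [label for label, pattern in RULES if pattern.search(command_line)]
```

For example, `classify("certutil.exe -urlcache -split -f http://example.invalid/a a")` yields the `certutil download` label, while an ordinary `notepad.exe readme.txt` yields an empty list.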
VI. Evasion & MITRE ATT&CK Mapping
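Because commands are generated dynamically at runtime by the LLM, the post-exploitation activity carries no fixed indicators of compromise (IOCs): each session produces different code, so hash matching against the generated scripts does not trigger. The proxy DLL itself still contains static strings, which the YARA rule in Section VIII targets.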
| Technique | Implementation Detail | Detection Impact |
|---|---|---|
| DLL Proxy | Redirection exports match clean system files | Bypasses static import table validations |
| Dynamic Heuristics | Command sequences generated on-the-fly | Bypasses rule-based heuristic patterns |
| Polymorphic Code | No static local scripting signatures | Hash identification engines cannot match payload |
MITRE ATT&CK Technique Mapping
| Tactic | ID | Technique | Implementation Detail |
|---|---|---|---|
| Defense Evasion | T1574.002 | DLL Side-Loading | vcruntime140_1.dll proxy hijacking |
| Execution | T1059.001 | PowerShell | PowerShell shell commands executing inline |
| Persistence | T1574.002 | DLL Side-Loading | Loads alongside target application process startup |
| Defense Evasion | T1564.003 | Hidden Window | Uses WDA_EXCLUDEFROMCAPTURE to hide DirectX window |
| Collection | T1113 | Screen Capture | Uses CaptureScreen() GDI BitBlt commands |
| Command & Control | T1102 | Web Service | Bidirectional Discord connection bindings |
VII. Experimental Methodology & Metrics
The framework was developed using Microsoft Visual Studio 2022 on Windows 11 (23H2). All tests were conducted in isolated, sandboxed virtual machines with no connectivity to production networks, using local offline Ollama models.
Codebase Composition & Metrics
| Component | File | Size (KB) | Language | Purpose |
|---|---|---|---|---|
| Main Module | main.cpp | 64 | C++ | Entry point, GUI, injector hooks |
| AI Core Engine | aichat.cpp | 122 | C++ | LLM bindings, agentic parsing |
| DLL Forwarder | proxy/vcruntime.cpp | 2 | C++ | Export redirections to real system file |
| Browser Script | browser_control.py | 8 | Python | Playwright-driven Chromium script wrapper |
VIII. Detection, Mitigation & YARA Signatures
Defending against LLM-driven runtime polymorphism requires shifting focus from static signatures to behavioral integrity monitoring:
| Defense Category | Implementation Strategy | Mitigation Effectiveness |
|---|---|---|
| DLL Safe Search | Enforce SafeDllSearchMode registry key | Moves the current working directory later in the search order; does not block loads from the application directory |
| Code Integrity | Deploy WDAC or AppLocker code signature checks | Blocks execution of unsigned executable modules |
| Network Control | Block traffic to LLM endpoints and Discord services | Disrupts cloud-based C2 exfiltration routes |
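Because the attack chain in Section III leaves two files side by side (the proxy under the runtime's name and the renamed original), a file-system sweep can surface candidate directories without inspecting file contents. The sketch below walks a directory tree and reports folders that contain both names. The file names come from this paper; the function name `find_proxy_pairs` is hypothetical, and a real sweep would additionally verify the Authenticode signature of any `vcruntime140_1.dll` found outside the system directory.

```python
import os

PROXY_NAME = "vcruntime140_1.dll"          # name the proxy masquerades under
RENAMED_ORIGINAL = "vcruntime140_org.dll"  # renamed legitimate runtime


def find_proxy_pairs(root):
    """Return sorted directories under `root` that hold both file names."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # Windows file names are case-insensitive, so compare in lower case.
        names = {name.lower() for name in filenames}
        if PROXY_NAME in names and RENAMED_ORIGINAL in names:
            hits.append(dirpath)
    return sorted(hits)
```

A renamed copy is trivial for an attacker to change, so this check complements rather than replaces the signature and code-integrity controls in the table above.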
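rule HiddenAI_DLL_Proxy {
    meta:
        description = "Detects HiddenAI proxy DLL"
        severity = "critical"
    strings:
        // Proxy artifacts: renamed original runtime, forwarded export, UI identifier
        $p1 = "vcruntime140_org.dll" ascii wide
        $p2 = "__CxxFrameHandler4" ascii
        $p3 = "TeckyUI" ascii wide
        // Capture-exclusion API used by the overlay
        $g1 = "SetWindowDisplayAffinity" ascii
        // LLM endpoint and agent command syntax
        $a1 = "api.openai.com" ascii
        $a2 = "@terminal(" ascii
        // Discord-related file name
        $d1 = "discord_prompt.txt" ascii
        // Mutex name created in DllMain
        $m1 = "TeckyUIMutex" ascii wide
    condition:
        // $d1 is included in the last clause because YARA rejects rules
        // that define a string without referencing it.
        uint16(0) == 0x5A4D and
        (2 of ($p*) or ($m1 and $g1) or (any of ($p*, $d1) and any of ($a*)))
}
IX. Comparative Analysis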
Comparing HiddenAI against traditional Remote Access Trojans (RATs) and earlier AI-assisted configurations highlights the qualitative evolution:
| Capability | Traditional RAT | AI-Enhanced RAT | HiddenAI Framework |
|---|---|---|---|
| Command System | Hardcoded menu options | Static scripting block | Natural language prompts |
| Decision Engine | Operator-controlled | Predefined logic rules | LLM autonomous decision loop |
| Offline Staging | Not Applicable | Cloud server only | Yes (local Ollama / LM Studio) |
| Visual Obfuscation | Hidden files | Unsigned hooks | Capture-excluded overlay via display affinity |
X. Ethical Considerations & Responsible Disclosure
This research was conducted under strict ethical guidelines. No real-world deployment was performed. All experimental testing occurred in isolated VM environments with developer-owned instances. Codebase references are restricted to maintain safety boundaries.
The publication is intended to inform defense analysts and threat-hunting communities of the threat vectors that emerge when local offline LLMs are bound to execution hooks and can thereby bypass signature-based detection.
XI. Conclusion & Future Work
This paper presented HiddenAI, a framework highlighting the threat profiles of Generation 4 malware combining DLL proxy hijacking and LLMs. The study shows traditional signature defenses are ineffective when execution hooks are generated dynamically.
Future research will explore behavior-based AI pattern detection systems capable of identifying anomalies in local API execution contexts.
Acknowledgments
The author acknowledges the open-source communities behind Dear ImGui, Ollama, Playwright, and OpenAI Whisper, whose tools were utilized in implementing and testing the defensive indicators for this proof-of-concept. This research received no external funding support.
References
[1] OpenAI, "GPT-4 Technical Report," arXiv preprint arXiv:2303.08774, 2023.
[2] Anthropic, "The Claude Model Family," Anthropic Research, 2024.
[3] Google DeepMind, "Gemini: A Family of Highly Capable Multimodal Models," arXiv preprint arXiv:2312.11805, 2023.
[4] H. Touvron et al., "LLaMA: Open and Efficient Foundation Language Models," arXiv preprint arXiv:2302.13971, 2023.
[5] MITRE Corporation, "ATT&CK Enterprise Framework," 2024. [Online]. Available: https://attack.mitre.org/
[6] G. Szappanos, "DLL Side-Loading: A Thorn in the Side of the Anti-Virus Industry," Sophos Labs Technical Paper, 2014.
[7] Microsoft Threat Intelligence, "DLL Search Order Hijacking," Microsoft Security Documentation, 2024.
[8] M. Brundage et al., "The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation," arXiv preprint arXiv:1802.07228, 2018.
[9] B. Guembe et al., "The Emerging Threat of AI-Driven Cyber Attacks: A Review," Applied Artificial Intelligence, vol. 36, no. 1, 2022.
[10] A. Happe and J. Cito, "Getting pwn'd by AI: Penetration Testing with Large Language Models," in Proc. ACM ESEC/FSE, 2023.
[11] Microsoft Documentation, "SetWindowDisplayAffinity function," Windows API Reference, 2024.
[12] Trend Micro Research, "Discord as C2: Abuse of Chat Platforms by Cybercriminals," Trend Micro Threat Report, 2023.
[13] MITRE Corporation, "CWE-427: Uncontrolled Search Path Element," Common Weakness Enumeration, 2024.
[14] O. Cornut, "Dear ImGui: Bloat-free Graphical User Interface for C++," GitHub Repository, 2024.
[15] DuckDuckGo, "DuckDuckGo Search API," 2024.
[16] Microsoft, "Playwright: Fast and Reliable End-to-End Testing," 2024.
[17] A. Radford et al., "Robust Speech Recognition via Large-Scale Weak Supervision," in Proc. ICML, 2023.
[18] NIST, "SP 800-83 Rev. 2: Guide to Malware Incident Prevention and Handling," NIST, 2023.
[19] ENISA, "AI-Driven Threats: Landscape Report," European Union Agency for Cybersecurity, 2025.
[20] N. Park and S. Kim, "AI-Augmented Malware: A Survey of Emerging Threats," IEEE Security & Privacy, vol. 23, no. 2, pp. 45-58, 2025.
About the Author
Khawar Ahmed Khan is a researcher in the Department of Computer Science and Engineering, specializing in cybersecurity, artificial intelligence, and systems programming. His research interests include AI-augmented systems security, OS internals manipulation, and client interface designs.