HiddenAI: An AI-Powered Agentic Malware Framework Leveraging DLL Proxy Hijacking and Large Language Models for Autonomous System Compromise

Author: Khawar Ahmed Khan
Dept: Computer Science and Engineering, Centurion University
Contact: khawarahmed@outlook.com

Abstract

This paper presents HiddenAI, a proof-of-concept framework demonstrating the convergence of Large Language Models (LLMs) with advanced malware techniques including DLL proxy hijacking, invisible DirectX11 overlay rendering, and autonomous agentic execution. The framework compiles as a malicious Dynamic Link Library (vcruntime140_1.dll) that hijacks the Visual C++ Runtime loading mechanism to inject an AI-powered control panel into any target process. HiddenAI provides multi-modal AI capabilities including autonomous code execution across four languages, file system manipulation, terminal command execution, screen capture with vision analysis, voice transcription via Whisper, Playwright-based browser automation, internet search, and bidirectional command-and-control via Discord bot integration. The AI operates in an agentic feedback loop, autonomously chaining operations without human intervention. Supporting seven LLM providers including fully offline local models, the framework demonstrates that a single natural language instruction is sufficient to trigger complex multi-step attack sequences. We present a comprehensive architectural analysis, map 11 MITRE ATT&CK techniques, evaluate evasion capabilities, and propose detection strategies including YARA signatures. This research serves as a critical warning that traditional signature-based defenses are fundamentally inadequate against AI-agentic threats.

Keywords: AI-powered malware, DLL proxy hijacking, agentic AI, large language models, autonomous exploitation, command and control, advanced persistent threats, defense evasion.

I. Introduction

The rapid proliferation of Large Language Models (LLMs) such as GPT-4 [1], Claude [2], Gemini [3], and open-source alternatives like LLaMA [4] has democratized access to sophisticated artificial intelligence capabilities. While these models offer tremendous benefits for productivity and software development, they simultaneously present a paradigm shift in the cybersecurity threat landscape. Traditional malware requires its creator to anticipate every scenario and hardcode behavioral responses. AI-augmented malware, by contrast, can reason, adapt, and improvise in real-time based on the specific system environment it encounters.

This research investigates a critical and timely question: What happens when an autonomous AI agent with unrestricted system access is embedded inside a malware payload using established evasion techniques? We present HiddenAI, a proof-of-concept framework that answers this question through a fully functional implementation combining DLL proxy hijacking with agentic LLM capabilities.

A. Threat Landscape Evolution

The evolution of malware can be categorized into four generations, as shown in Table I. HiddenAI represents what we term Generation 4 malware—threats that leverage AI not merely as an obfuscation tool but as the autonomous decision-making core of the entire attack chain.

| Generation | Era | Key Characteristics |
| --- | --- | --- |
| Gen 1 | 1990s–2000s | Static payloads, signature-based detection |
| Gen 2 | 2005–2015 | Polymorphic engines, packers, encrypted payloads |
| Gen 3 | 2015–2023 | Fileless attacks, living-off-the-land binaries (LOLBins) |
| Gen 4 | 2024–Present | AI-agentic: NL-driven, autonomous, multi-modal, self-adapting |

B. Contributions

  • Design and implementation of a fully functional AI-powered malware PoC combining DLL proxy hijacking with agentic LLM capabilities across seven providers.
  • Comprehensive analysis of 12 distinct attack vectors enabled by the AI-agentic architecture, mapped to the MITRE ATT&CK framework [5].
  • Demonstration that completely offline AI-agentic malware is feasible using local LLMs, rendering network-based detection ineffective.
  • Evaluation of evasion capabilities against static analysis, dynamic sandboxing, and behavioral heuristics.
  • Proposed detection and mitigation strategies including YARA signatures and behavioral indicators.

C. Paper Organization

The remainder of this paper is organized as follows: Section II reviews related work. Section III presents the system architecture. Section IV details the agentic AI capabilities. Section V analyzes attack vectors and threat scenarios. Section VI evaluates evasion techniques and maps them to MITRE ATT&CK. Section VII describes the codebase and experimental setup. Section VIII proposes detection and mitigation strategies. Section IX compares HiddenAI with earlier classes of remote access tooling, Section X addresses ethical considerations, and Section XI concludes the paper.

II. Related Work

DLL hijacking as an attack vector has been extensively studied in prior literature. Szappanos [6] documented widespread use of DLL side-loading by APT groups including APT1, APT3, and OceanLotus. Microsoft's own threat intelligence team has cataloged over 300 legitimate applications vulnerable to DLL search-order hijacking [7]. Our work extends this attack surface by combining DLL hijacking with AI-driven autonomous post-exploitation.

The concept of AI-augmented cyber attacks has been explored theoretically by Brundage et al. [8], who predicted that AI would lower barriers to entry for sophisticated attacks. Guembe et al. [9] surveyed the emerging threat of AI-powered malware but focused primarily on machine learning-based evasion rather than LLM-powered autonomous operation. Happe and Cito [10] demonstrated the use of GPT models for automated penetration testing, but their system operated as a standalone tool rather than an embedded implant.

Hiding windows from screen capture through the Windows display affinity API [11] is well known in game-cheat development communities, but its use for malware concealment has received limited academic attention. To our knowledge, this work is the first to combine the technique with an AI-powered overlay interface.

Discord as a command-and-control channel has been observed in commodity malware samples analyzed by Trend Micro [12], demonstrating the trend of abusing legitimate platforms for C2 communications. HiddenAI extends this concept with bidirectional AI-mediated communication, where the AI agent interprets, executes, and responds to commands autonomously.

III. System Architecture

HiddenAI employs a four-layer architecture designed for stealthy execution, invisible UI overlay rendering, and multi-provider LLM integration.

Layer 4: AI Agentic Engine

Agentic Mode | Code Exec | File Ops | Terminal Control | Web Search | Browser | Screen | Voice Transcribe

Layer 3: LLM Integration

OpenAI | Anthropic | Google | OpenRouter | Ollama | LM Studio | Custom Servers

Layer 2: GUI Overlay Engine

ImGui + DirectX11 | Transparent Overlay | WDA_EXCLUDEFROMCAPTURE | WS_EX_TOOLWINDOW

Layer 1: Injection Engine

DLL Proxy Hijacking (vcruntime140_1.dll) | LoadLibraryW | DllMain Bootstrap

A. Layer 1: DLL Proxy Hijacking Engine

The injection mechanism exploits the Windows DLL search order weakness (CWE-427) [13] using proxy DLL hijacking, also known as DLL side-loading (MITRE T1574.002). The framework compiles as `vcruntime140_1.dll`, masquerading as the legitimate Visual C++ 2015–2022 Runtime redistributable. This specific DLL is chosen because it is loaded by many modern C++ applications on Windows and exports only three functions, making proxy implementation trivial.

The attack chain proceeds as follows: (1) The attacker places the malicious `vcruntime140_1.dll` adjacent to any target application; (2) The original legitimate DLL is renamed to `vcruntime140_org.dll`; (3) Windows loads the malicious DLL first due to the DLL search order; (4) All legitimate API calls are forwarded to the original DLL (proxy pattern); (5) DllMain simultaneously creates a new thread executing the malware payload.

PROXY FORWARDING EXPORTS
// proxy/vcruntime.cpp - Export forwarding
#pragma comment(linker,"/export:__CxxFrameHandler4=vcruntime140_org.__CxxFrameHandler4")

RUNTIME_EXPORT size_t RUNTIME_CALL
_CxxFrameHandler4(void* pE, size_t RN, void* pC, void* pD) {
    if (g_CxxFrameHandler4)
        return g_CxxFrameHandler4(pE, RN, pC, pD);
    return 1;
}

// DllMain entry point bootstrapping
BOOL APIENTRY DllMain(HMODULE hModule, DWORD reason, LPVOID) {
    switch(reason) {
        case DLL_PROCESS_ATTACH:
            DisableThreadLibraryCalls(hModule);
            proxy::init_runtime(); // Forward to real vcruntime
            g_hMutex = CreateMutex(NULL, TRUE, TEXT("Global\\TeckyUIMutex"));
            CreateThread(NULL, 0, GuiThread, hModule, 0, NULL);
            break;
    }
    return TRUE;
}
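
From a defender's perspective, the renaming step in the attack chain above leaves an on-disk artifact: a directory that holds `vcruntime140_1.dll` next to a second, unexpectedly named copy of the runtime. The following is a minimal triage sketch of that idea, not a complete detector. The function name and the `KNOWN_RUNTIME_NAMES` allowlist are assumptions made for illustration, and the allowlist is deliberately incomplete (debug runtimes, for example, would be flagged).

```python
import os

# Illustrative, incomplete allowlist of runtime file names that are
# expected to sit next to an application (an assumption of this sketch).
KNOWN_RUNTIME_NAMES = {
    "vcruntime140.dll",
    "vcruntime140_1.dll",
    "vcruntime140_threads.dll",
}


def find_sideload_candidates(root):
    """Return (directory, unexpected_names) pairs worth a manual look.

    A directory is reported when it contains vcruntime140_1.dll together
    with at least one other vcruntime140*.dll file whose name is not in
    the allowlist, e.g. a renamed original such as vcruntime140_org.dll.
    """
    findings = []
    for dirpath, _dirnames, filenames in os.walk(root):
        lowered = {name.lower() for name in filenames}
        if "vcruntime140_1.dll" not in lowered:
            continue
        unexpected = sorted(
            name
            for name in lowered
            if name.startswith("vcruntime140")
            and name.endswith(".dll")
            and name not in KNOWN_RUNTIME_NAMES
        )
        if unexpected:
            findings.append((dirpath, unexpected))
    return findings
```

A hit is only a lead. Confirming it would still require checking the Authenticode signature of the DLL in question and comparing its hash against the redistributable shipped by Microsoft.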

B. Layer 2: Invisible GUI Overlay

Layer 2 renders the interface with Dear ImGui on DirectX 11. It calls `SetWindowDisplayAffinity` with `WDA_EXCLUDEFROMCAPTURE` so that the overlay window is excluded from screenshots, stream captures, and recordings (OBS, Discord, Game Bar).

| Property | Windows API | Effect |
| --- | --- | --- |
| Screen Invisibility | SetWindowDisplayAffinity(0x11) | Invisible to OBS, Discord, Game Bar, screenshots |
| Taskbar Hiding | WS_EX_TOOLWINDOW | Hidden from Alt+Tab switcher and host taskbar |
| Transparency | SetLayeredWindowAttributes(...) | Only active interface pixels are drawn on screen |
| Hotkey Toggle | GetAsyncKeyState(VK_F9) | Instant UI overlay visibility toggle |
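
The window properties in this table are themselves observable, which suggests a simple triage rule for endpoint tooling. The sketch below assumes the defender has already collected each top-level window's display affinity and extended style (for example via `GetWindowDisplayAffinity` and `GetWindowLongPtr`) into a record. The `WindowInfo` type and both function names are hypothetical, and the two-indicator threshold is an arbitrary choice for illustration.

```python
from dataclasses import dataclass

# Win32 constants as documented in the Windows SDK headers.
WDA_EXCLUDEFROMCAPTURE = 0x00000011
WS_EX_TOOLWINDOW = 0x00000080
WS_EX_LAYERED = 0x00080000


@dataclass(frozen=True)
class WindowInfo:
    """Hypothetical record for one top-level window, collected elsewhere."""
    pid: int
    title: str
    display_affinity: int
    ex_style: int


def overlay_indicators(win):
    """List which overlay-related properties a window exhibits."""
    indicators = []
    if win.display_affinity == WDA_EXCLUDEFROMCAPTURE:
        indicators.append("excluded_from_capture")
    if win.ex_style & WS_EX_TOOLWINDOW:
        indicators.append("tool_window")
    if win.ex_style & WS_EX_LAYERED:
        indicators.append("layered")
    return indicators


def is_suspicious_overlay(win):
    """Flag capture-excluded windows that also hide from the taskbar or are layered."""
    found = overlay_indicators(win)
    return "excluded_from_capture" in found and len(found) >= 2
```

Capture exclusion alone is not treated as suspicious here, because legitimate software such as password managers and DRM-protected players may set it as well. The combination with a hidden, layered window inside a process that normally has no overlay is what makes the case worth reviewing.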

C. Layer 3: Multi-Provider LLM Integration

Layer 3 supports cloud API endpoints (OpenAI, Anthropic Claude, Google Gemini) alongside local servers (Ollama, LM Studio) running offline LLMs. With a local model, the agent can operate without generating outbound WAN traffic, so network anomaly detection that depends on observing external connections has nothing to inspect.
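
Offline operation removes WAN traffic but not loopback traffic, so host-level connection telemetry can still reveal which processes talk to a local model server. The sketch below is one hedged way to express that check. It assumes the default listening ports of Ollama (11434) and LM Studio (1234), which an operator could change, and a hypothetical `Connection` record plus an illustrative allowlist of expected client process names.

```python
import ipaddress
from collections import namedtuple

# Default ports of common local LLM servers (assumed unchanged).
LOCAL_LLM_PORTS = {11434: "Ollama", 1234: "LM Studio"}

# Hypothetical allowlist of processes expected to use a local model.
EXPECTED_CLIENTS = frozenset({"ollama.exe", "lm studio.exe"})

# Hypothetical connection record produced by host telemetry.
Connection = namedtuple("Connection", "process remote_addr remote_port")


def unexpected_local_llm_clients(connections, expected=EXPECTED_CLIENTS):
    """Return (process, server) pairs for unlisted loopback LLM clients."""
    hits = []
    for conn in connections:
        if conn.remote_port not in LOCAL_LLM_PORTS:
            continue
        if not ipaddress.ip_address(conn.remote_addr).is_loopback:
            continue
        if conn.process.lower() in expected:
            continue
        hits.append((conn.process, LOCAL_LLM_PORTS[conn.remote_port]))
    return hits
```

A process with no stated AI functionality that opens loopback connections to a model server is the pattern of interest. A non-default port defeats this particular check, so it complements rather than replaces process-level monitoring.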

IV. Agentic AI Capabilities & Feedback Loop

The core engine defines an autonomous feedback loop: the model generates code/commands, an execution handler runs them locally, catches standard output/errors, and returns the result back to the LLM context.

Schematic: Autonomous Execution & Feedback Loop

  1. Ingest & Prime Context: Append the natural language directive (e.g. 'recon system & search for keys') to the context memory array.
  2. Inference Query: Send the context to the selected LLM provider (offline Ollama or a cloud API) to determine the next actions.
  3. Regex & XML Parsing: Extract structured command blocks (@terminal, @create_file, code blocks) from the raw response stream.
  4. Autonomous Action Handler: If no command blocks were extracted, exit the loop and report. Otherwise, invoke Win32 APIs, execute shells, and read local filesystem directories as requested.
  5. Collect & Feed Output: Collect standard output and errors, structure them as a system return, append them to the context, and return to Step 2.

Step 5 feeds back into Step 2 until the exit condition in Step 4 is met.

A. Command Set

The AI agent has access to multiple categories of system operations, summarized in Table IV.

| Command Category | Syntax Hook | Capability Description |
| --- | --- | --- |
| Terminal Execution | @terminal("cmd") | Execute any Windows cmd/PowerShell shell command |
| Create File | @create_file("path","data") | Create dynamic local files with arbitrary contents |
| Read File | @read_file("path") | Extract and read local filesystem contents |
| Web Search | @search("query") | DuckDuckGo search wrapper for real-time web querying |
| Browser Scripting | @browse("url") | Playwright-driven Chromium execution and text reading |
| Dynamic Code Exec | Markdown code block with a language tag (e.g. `python`) | Python, JS, PowerShell script execution at runtime |

B. Multi-Format Command Parser

The `ProcessAIResponseFeedback()` function implements a robust multi-format parser that recognizes commands in three distinct syntactic formats: (1) XML-style tags such as `<FILE_CREATE path="...">`, (2) function-call annotations such as `@create_file("path", "content")`, and (3) Markdown code blocks.

C. Multi-Modal Vision & Voice Input

Screen capture uses GDI `BitBlt` to grab screen bitmaps, which are encoded and added to the LLM context. Audio is recorded through the Windows MCI API (`mciSendStringA`), which requires no third-party dependencies, and is transcribed by a local Whisper model.

V. Attack Vectors and Threat Scenarios

We mapped five post-exploitation scenarios leveraging this architecture:

  • Autonomous System Reconnaissance: In response to prompts like "Tell me everything about this PC," the AI chains shell queries (netstat, tasklist, systeminfo, net user) to compile detailed intelligence documents.
  • Credential Harvesting: Enumerates WiFi profiles, queries Chrome credential vaults, registry entries, and lists private SSH keyfiles.
  • Discord Bidirectional C2: A Discord bot writes incoming commands to local queue files, which the agent polls; screenshots and files are sent back through the same channels.
  • Living-Off-The-Land: Invokes Windows default tools (certutil, bitsadmin, reg) dynamically based on target configuration.
  • DLL Injection: Process enumeration and injection using VirtualAllocEx & CreateRemoteThread.

| LOLBin | Abuse / Purpose |
| --- | --- |
| certutil.exe | Download malicious configurations, Base64 encode/decode local files |
| powershell.exe | Execute run-time memory scripts, bypass configuration validation |
| bitsadmin.exe | Perform background payload staging and stealth retrieval |
| reg.exe | Modify registry run keys to install persistence triggers |
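
The scenarios above share one observable trait: a process that normally launches no system utilities suddenly becomes the origin of several of them in quick succession. The sketch below illustrates a burst detector over process-creation telemetry. The `ProcessEvent` record, its `origin_*` fields (the nearest non-shell ancestor, assumed to be resolved upstream by the telemetry pipeline), the watched-binary set, and the thresholds are all assumptions chosen for illustration.

```python
from collections import defaultdict, namedtuple

# Hypothetical telemetry record; origin_* identifies the nearest
# non-shell ancestor of the spawned image (resolved upstream).
ProcessEvent = namedtuple("ProcessEvent", "timestamp origin_pid origin_image image")

# Utilities named in this section (illustrative, not exhaustive).
WATCHED_IMAGES = frozenset({
    "certutil.exe", "powershell.exe", "bitsadmin.exe", "reg.exe",
    "systeminfo.exe", "netstat.exe", "tasklist.exe", "net.exe",
})


def find_lolbin_bursts(events, window_seconds=60.0, min_distinct=3,
                       expected_origins=frozenset({"explorer.exe"})):
    """Return ((pid, image), utilities) for origins spawning many utilities.

    An origin is reported once, the first time at least `min_distinct`
    different watched utilities fall inside a sliding time window.
    """
    by_origin = defaultdict(list)
    for ev in events:
        if ev.image.lower() not in WATCHED_IMAGES:
            continue
        if ev.origin_image.lower() in expected_origins:
            continue
        by_origin[(ev.origin_pid, ev.origin_image)].append(ev)

    alerts = []
    for origin, evs in by_origin.items():
        evs.sort(key=lambda e: e.timestamp)
        start = 0
        for end in range(len(evs)):
            # Shrink the window until it spans at most window_seconds.
            while evs[end].timestamp - evs[start].timestamp > window_seconds:
                start += 1
            distinct = {e.image.lower() for e in evs[start:end + 1]}
            if len(distinct) >= min_distinct:
                alerts.append((origin, sorted(distinct)))
                break
    return alerts
```

Because the rule keys on the origin process and on timing rather than on command-line content, it does not depend on the exact commands the model happens to generate.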

VI. Evasion & MITRE ATT&CK Mapping

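Because commands are generated at runtime by the LLM, the post-exploitation activity contains no fixed command strings or scripts that could serve as static indicators of compromise (IOCs). Each run produces different code, so hash matching against generated payloads does not trigger. The loader DLL itself remains a static artifact, and its embedded strings are still detectable (see Section VIII).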

| Technique | Implementation Detail | Detection Impact |
| --- | --- | --- |
| DLL Proxy | Redirection exports match clean system files | Bypasses static import table validations |
| Dynamic Heuristics | Command sequences generated on-the-fly | Bypasses rule-based heuristic patterns |
| Polymorphic Code | No static local scripting signatures | Hash identification engines cannot match payload |

MITRE ATT&CK Technique Mapping

| Tactic | ID | Technique | Implementation Detail |
| --- | --- | --- | --- |
| Defense Evasion | T1574.002 | DLL Side-Loading | vcruntime140_1.dll proxy hijacking |
| Execution | T1059.001 | PowerShell | Inline execution of PowerShell commands |
| Persistence | T1574.002 | DLL Side-Loading | Loads alongside the target application at process startup |
| Defense Evasion | T1564.003 | Hidden Window | Uses WDA_EXCLUDEFROMCAPTURE to hide the DirectX window |
| Collection | T1113 | Screen Capture | Uses CaptureScreen() with GDI BitBlt |
| Command & Control | T1102 | Web Service | Bidirectional Discord connection bindings |

VII. Experimental Methodology & Metrics

The framework was developed using Microsoft Visual Studio 2022 on Windows 11 (23H2). All tests were conducted in isolated, sandboxed virtual environments with no connectivity to production networks, and were run against local offline Ollama models.

Codebase Composition & Metrics

| Component | File | Size (KB) | Language | Purpose |
| --- | --- | --- | --- | --- |
| Main Module | main.cpp | 64 | C++ | Entry point, GUI, injector hooks |
| AI Core Engine | aichat.cpp | 122 | C++ | LLM bindings, agentic parsing |
| DLL Forwarder | proxy/vcruntime.cpp | 2 | C++ | Export redirections to real system file |
| Browser Script | browser_control.py | 8 | Python | Playwright-driven Chromium script wrapper |

VIII. Detection, Mitigation & YARA Signatures

Defending against LLM-driven runtime polymorphism requires shifting focus from static signatures to behavioral integrity monitoring:

| Defense Category | Implementation Strategy | Mitigation Effectiveness |
| --- | --- | --- |
| DLL Safe Search | Enforce SafeDllSearchMode registry key | Limited: moves the current working directory later in the search order, but the application's own directory is still searched first |
| Code Integrity | Deploy WDAC or AppLocker code signature checks | Blocks execution of unsigned executable modules |
| Network Control | Block traffic to LLM endpoints and Discord services | Disrupts cloud-based C2 exfiltration routes |

PROPOSED YARA DETECTION SIGNATURE
rule HiddenAI_DLL_Proxy {
  meta:
    description = "Detects HiddenAI proxy DLL"
    severity = "critical"
  strings:
    $p1 = "vcruntime140_org.dll" ascii wide
    $p2 = "__CxxFrameHandler4" ascii
    $p3 = "TeckyUI" ascii wide
    $g1 = "SetWindowDisplayAffinity" ascii
    $a1 = "api.openai.com" ascii
    $a2 = "@terminal(" ascii
    $d1 = "discord_prompt.txt" ascii
    $m1 = "TeckyUIMutex" ascii wide
  condition:
    uint16(0) == 0x5A4D and
    (2 of ($p*) or ($m1 and $g1) or (any of ($p*) and any of ($a*, $d1)))
}
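
Where a YARA engine is not available, the same string logic can be approximated in a few lines for quick triage of a file already read into memory. The sketch below is a rough mirror of the rule, not a replacement for it. The function names are hypothetical, `wide` is modelled as UTF-16LE, and `discord_prompt.txt` is grouped with the AI/C2 indicators, which is an assumption of this sketch.

```python
# (text, also_match_wide) pairs taken from the rule's strings section.
PROXY_STRINGS = [
    ("vcruntime140_org.dll", True),
    ("__CxxFrameHandler4", False),
    ("TeckyUI", True),
]
AI_C2_STRINGS = [
    ("api.openai.com", False),
    ("@terminal(", False),
    ("discord_prompt.txt", False),  # grouping is an assumption
]
MUTEX_STRING = ("TeckyUIMutex", True)
AFFINITY_STRING = ("SetWindowDisplayAffinity", False)


def _present(data, indicator):
    """True if the string occurs as ASCII or, optionally, as UTF-16LE."""
    text, wide = indicator
    if text.encode("ascii") in data:
        return True
    return wide and text.encode("utf-16-le") in data


def matches_hiddenai_strings(data):
    """Approximate the rule's condition over a bytes object."""
    if data[:2] != b"MZ":  # uint16(0) == 0x5A4D
        return False
    proxy_hits = sum(1 for s in PROXY_STRINGS if _present(data, s))
    ai_hit = any(_present(data, s) for s in AI_C2_STRINGS)
    mutex_and_affinity = (_present(data, MUTEX_STRING)
                          and _present(data, AFFINITY_STRING))
    return proxy_hits >= 2 or mutex_and_affinity or (proxy_hits >= 1 and ai_hit)
```

One consequence is visible immediately: `__CxxFrameHandler4` also appears in legitimate binaries, so any benign program that contains both that symbol name and an LLM endpoint string satisfies the last clause. Weighting the mutex and renamed-DLL strings more heavily would likely reduce false positives.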

IX. Comparative Analysis

Comparing HiddenAI against traditional Remote Access Trojans (RATs) and earlier AI-assisted configurations highlights the qualitative evolution:

| Capability | Traditional RAT | AI-Enhanced RAT | HiddenAI Framework |
| --- | --- | --- | --- |
| Command System | Hardcoded menu options | Static scripting block | Natural language prompts |
| Decision Engine | Operator-controlled | Predefined logic rules | LLM autonomous decision loop |
| Offline Staging | Not Applicable | Cloud server only | Yes (local Ollama / LM Studio) |
| Visual Obfuscation | Hidden files | Unsigned hooks | Transparent display affinity hide |

X. Ethical Considerations & Responsible Disclosure

This research was conducted under strict ethical guidelines. No real-world deployment was performed. All experimental testing occurred in isolated VM environments with developer-owned instances. Codebase references are restricted to maintain safety boundaries.

The publication is intended to inform defense analysts and threat hunting communities of the threat vectors that emerge when local offline LLMs are bound to execution hooks, enabling signature bypass.

XI. Conclusion & Future Work

This paper presented HiddenAI, a framework highlighting the threat profiles of Generation 4 malware combining DLL proxy hijacking and LLMs. The study shows traditional signature defenses are ineffective when execution hooks are generated dynamically.

Future research will explore behavior-based AI pattern detection systems capable of identifying anomalies in local API execution contexts.

Acknowledgments

The author acknowledges the open-source communities behind Dear ImGui, Ollama, Playwright, and OpenAI Whisper, whose tools were utilized in implementing and testing the defensive indicators for this proof-of-concept. This research received no external funding support.

References

  1. OpenAI, "GPT-4 Technical Report," arXiv preprint arXiv:2303.08774, 2023.
  2. Anthropic, "The Claude Model Family," Anthropic Research, 2024.
  3. Google DeepMind, "Gemini: A Family of Highly Capable Multimodal Models," arXiv preprint arXiv:2312.11805, 2023.
  4. H. Touvron et al., "LLaMA: Open and Efficient Foundation Language Models," arXiv preprint arXiv:2302.13971, 2023.
  5. MITRE Corporation, "ATT&CK Enterprise Framework," 2024. [Online]. Available: https://attack.mitre.org/
  6. G. Szappanos, "DLL Side-Loading: A Thorn in the Side of the Anti-Virus Industry," Sophos Labs Technical Paper, 2014.
  7. Microsoft Threat Intelligence, "DLL Search Order Hijacking," Microsoft Security Documentation, 2024.
  8. M. Brundage et al., "The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation," arXiv preprint arXiv:1802.07228, 2018.
  9. B. Guembe et al., "The Emerging Threat of AI-Driven Cyber Attacks: A Review," Applied Artificial Intelligence, vol. 36, no. 1, 2022.
  10. A. Happe and J. Cito, "Getting pwn'd by AI: Penetration Testing with Large Language Models," in Proc. ACM ESEC/FSE, 2023.
  11. Microsoft Documentation, "SetWindowDisplayAffinity function," Windows API Reference, 2024.
  12. Trend Micro Research, "Discord as C2: Abuse of Chat Platforms by Cybercriminals," Trend Micro Threat Report, 2023.
  13. MITRE Corporation, "CWE-427: Uncontrolled Search Path Element," Common Weakness Enumeration, 2024.
  14. O. Cornut, "Dear ImGui: Bloat-free Graphical User Interface for C++," GitHub Repository, 2024.
  15. DuckDuckGo, "DuckDuckGo Search API," 2024.
  16. Microsoft, "Playwright: Fast and Reliable End-to-End Testing," 2024.
  17. A. Radford et al., "Robust Speech Recognition via Large-Scale Weak Supervision," in Proc. ICML, 2023.
  18. NIST, "SP 800-83 Rev. 2: Guide to Malware Incident Prevention and Handling," NIST, 2023.
  19. ENISA, "AI-Driven Threats: Landscape Report," European Union Agency for Cybersecurity, 2025.
  20. N. Park and S. Kim, "AI-Augmented Malware: A Survey of Emerging Threats," IEEE Security & Privacy, vol. 23, no. 2, pp. 45-58, 2025.

About the Author

Khawar Ahmed Khan is a researcher in the Department of Computer Science and Engineering, specializing in cybersecurity, artificial intelligence, and systems programming. His research interests include AI-augmented systems security, OS internals manipulation, and client interface designs.