TL;DR — Spoofed fingerprints fail not because the spoof is weak, but because spoofs are internally inconsistent. Modern detection is no longer "is navigator.webdriver true?" — it is coherence analysis across the hardware stack, the TLS layer, the CDP runtime, and behavioral telemetry. This guide gives risk engineers a tiered detection model with concrete heuristics, working code, and the open research it is built on.
A modern antidetect browser ships a plausible fingerprint: a Windows 11 UA with a matching font stack, a randomized canvas hash, a noised AudioContext signature, and a residential-IP-routed TLS session. Each signal, viewed alone, looks fine.
The detection insight is that a real device produces a fingerprint that is internally consistent and externally stable:
Internal consistency: the GPU declared via WEBGL_debug_renderer_info must match the shader precision, the supported extensions, and the maximum texture size that GPU model actually has.
External stability: the same hardware should produce the same canvas hash, the same audio buffer hash, and the same WebGL pixel readback across sessions, modulo browser version.
Antidetect browsers break one or both. The detector's job is to find where.
This is the same principle Eckersley established in How Unique Is Your Web Browser? (EFF, 2010) [1] and that Acar et al. extended for canvas fingerprinting [2]. We are applying it inverted: not to track users, but to flag fingerprints that cannot exist on real hardware.
These are the cheapest and most decisive signals. Collect on the client, evaluate server-side. Never trust client-side scoring.
Naive antidetect browsers add per-pixel jitter to canvas output. The hash changes, but the noise itself is statistically detectable.
// client.js: collected, sent to server, never evaluated client-side
function canvasProbe() {
  const draw = () => {
    const c = document.createElement('canvas');
    c.width = 280;
    c.height = 60;
    const ctx = c.getContext('2d');
    ctx.textBaseline = 'top';
    ctx.font = '16px "Arial"';
    ctx.fillStyle = '#f60';
    ctx.fillRect(125, 1, 62, 20);
    ctx.fillStyle = '#069';
    ctx.fillText('Cwm fjord bank glyphs vext quiz, 🦊', 2, 15);
    return c.toDataURL();
  };
  // Three independent renders of the same scene
  return { a: draw(), b: draw(), c: draw() };
}
Server-side: a real browser returns three identical strings. A noise-injecting antidetect browser that reseeds its per-pixel jitter on every getImageData/toDataURL call returns three different strings. Comparing a single render against a peer baseline is not enough; comparing repeated renders from the same page load is what exposes the jitter.
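The server-side comparison can be sketched as below. This is a minimal illustration, assuming the payload is the { a, b, c } object returned by canvasProbe(); the function name canvasRenderStable and the shape of its result are hypothetical.

```javascript
// Server-side check for the repeated-render canvas probe.
// `probe` is assumed to be the { a, b, c } object sent by canvasProbe().
function canvasRenderStable(probe) {
  const renders = [probe.a, probe.b, probe.c];
  // Treat malformed payloads as invalid rather than as "stable".
  if (!renders.every((r) => typeof r === 'string' && r.length > 0)) {
    return { valid: false, stable: false };
  }
  // A real browser should return the same data URL for every render.
  return { valid: true, stable: renders.every((r) => r === renders[0]) };
}
```

An invalid payload is reported separately from an unstable one so that a stripped or tampered probe can be scored differently from detected jitter.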
More sophisticated antidetect browsers use a seeded noise that is stable per session. The counter to that is cross-session coherence: same declared GPU + OS + driver should produce one of a small known set of canvas hashes. A canvas hash that has never been observed for that hardware tuple, on a fleet with millions of users, is suspicious.
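A cross-session lookup along these lines might look like the following sketch. The table layout (a Map from a "gpu|os|driver" key to a Set of observed hashes), the function names, and the returned labels are all assumptions for illustration; the table itself would come from your own fleet telemetry.

```javascript
// Build the lookup key for a declared hardware tuple (hypothetical format).
function tupleKey({ gpu, os, driver }) {
  return [gpu, os, driver].join('|');
}

// knownHashes: Map<string, Set<string>> built from real-user telemetry.
function canvasTupleSignal(knownHashes, tuple, canvasHash) {
  const seen = knownHashes.get(tupleKey(tuple));
  if (!seen) return 'unknown_tuple'; // no baseline for this hardware yet
  return seen.has(canvasHash) ? 'known_hash' : 'unseen_hash';
}
```

Only 'unseen_hash' on a well-populated tuple should raise the score; 'unknown_tuple' says more about baseline coverage than about the client.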
Pull UNMASKED_VENDOR_WEBGL and UNMASKED_RENDERER_WEBGL, then verify they are consistent with the other values the context reports: MAX_TEXTURE_SIZE, the supported extension set, and the shader precision formats.
Maintain a reference table: declared GPU → expected (MAX_TEXTURE_SIZE, extensions set, precision tuple). Build this from your own real-user telemetry, not from public datasheets, because driver and browser updates shift the reported values. A profile claiming NVIDIA GeForce RTX 4070 with MAX_TEXTURE_SIZE=8192 is impossible; real hardware reports 16384 or higher.
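A reference-table check could be sketched as follows. The table shape, field names, and the extension listed for the RTX 4070 entry are placeholders, not measured values; only the 16384 floor comes from the text above. Precision tuples are left out to keep the sketch short.

```javascript
// Hypothetical reference table: renderer string -> expected capabilities.
const WEBGL_REFERENCE = {
  'NVIDIA GeForce RTX 4070': {
    minMaxTextureSize: 16384,
    requiredExtensions: ['EXT_texture_filter_anisotropic'], // placeholder entry
  },
};

// report: { renderer, maxTextureSize, extensions } collected on the client.
function webglMismatches(reference, report) {
  const expected = reference[report.renderer];
  if (!expected) return ['renderer_not_in_reference'];
  const mismatches = [];
  if (report.maxTextureSize < expected.minMaxTextureSize) {
    mismatches.push('max_texture_size');
  }
  const reported = new Set(report.extensions);
  for (const ext of expected.requiredExtensions) {
    if (!reported.has(ext)) mismatches.push(`missing_extension:${ext}`);
  }
  return mismatches;
}
```

Returning the list of mismatches, rather than a boolean, keeps each inconsistency available as a separate feature for the scoring model.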
The OfflineAudioContext fingerprint is effectively deterministic for a given Chromium build on a given platform. Spoofing it requires modifying the Web Audio implementation; the modification is detectable by checking the output against a known-good hash table for (Chromium major version, OS family).
async function audioProbe() {
  // 1 channel, 44100 frames at 44.1 kHz: one second of offline audio
  const ctx = new OfflineAudioContext(1, 44100, 44100);
  const osc = ctx.createOscillator();
  osc.type = 'triangle';
  osc.frequency.value = 1000;
  const comp = ctx.createDynamicsCompressor();
  osc.connect(comp);
  comp.connect(ctx.destination);
  osc.start(0);
  const buf = await ctx.startRendering();
  // Return the raw sample window; hashing and scoring happen server-side
  return Array.from(buf.getChannelData(0).slice(4500, 5000));
}
A non-listed hash for the declared (version, OS) raises the score. So does a hash that changes between two probes in the same page load — which only happens with naive runtime patching.
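The two audio checks (listed hash, stable hash) can be sketched server-side as below. The FNV-1a hash is a dependency-free stand-in chosen for the example; a production system would more likely use a cryptographic hash. The table layout, key format, and function names are assumptions.

```javascript
// Hash a Float32Array of samples with 32-bit FNV-1a (stand-in hash).
function fnv1aHex(samples) {
  const bytes = new Uint8Array(samples.buffer, samples.byteOffset, samples.byteLength);
  let h = 0x811c9dc5;
  for (const b of bytes) {
    h ^= b;
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, '0');
}

// knownGood: Map<"chromiumMajor|osFamily", Set<hash>> (hypothetical layout).
// hashA and hashB come from two probes in the same page load.
function audioSignal(knownGood, key, hashA, hashB) {
  if (hashA !== hashB) return 'unstable'; // runtime patching suspected
  const listed = knownGood.get(key);
  return listed && listed.has(hashA) ? 'listed' : 'unlisted';
}
```

Checking stability first means a jittering client is labeled 'unstable' even if one of its hashes happens to be in the table.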
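document.fonts.check('12px "Segoe UI"') is expected to return true on Windows and false on macOS unless the user installed the font. The method's handling of local fonts is not uniform across browser engines, so calibrate the check per engine before relying on it. A profile declaring macOS that resolves Segoe UI, or declaring Windows but missing Calibri/Cambria, is a clean signal.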
Better: measure rendered glyph widths via offscreen canvas (the same primitive antidetect browsers spoof) and compare against the expected distribution for the declared OS. Fingerprint.js publishes the foundational technique [3]; you reuse it as a consistency probe, not an identity probe.
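As a consistency probe, the font rules above reduce to a small server-side check. This sketch assumes the client has already reported which fonts resolved (by whichever method you trust); the OS labels, the function name, and the returned reason strings are hypothetical.

```javascript
// declaredOs: 'Windows' or 'macOS' (assumed labels)
// resolvedFonts: font family names the client reported as available
function fontMismatches(declaredOs, resolvedFonts) {
  const fonts = new Set(resolvedFonts);
  const reasons = [];
  if (declaredOs === 'macOS' && fonts.has('Segoe UI')) {
    reasons.push('macos_resolves_segoe_ui');
  }
  if (declaredOs === 'Windows') {
    for (const family of ['Calibri', 'Cambria']) {
      if (!fonts.has(family)) reasons.push(`windows_missing:${family}`);
    }
  }
  return reasons;
}
```

A real user can install Segoe UI on a Mac, so each reason here is a score contribution, not a verdict.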
None of these are dispositive alone. They feed into the score in §6.
Detecting that a browser is being driven by Playwright, Puppeteer, or Selenium is a separate dimension from detecting that the fingerprint is fake.
The classic ChromeDriver injection variables (cdc_adoQpoasnfa76pfcZLmcfl_* on document and window) are still present on stock Selenium [4], but most real-world operators run patched builds (undetected-chromedriver) that strip them. Treat cdc_ presence as a strong signal if found, and treat its absence as null information.
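A tripwire for these variables can be as small as the sketch below. The regular expression is an assumption that matches the cdc_ prefix with an optional leading $; patched drivers rename the prefix, so a miss proves nothing.

```javascript
// Matches property names such as "$cdc_adoQpoasnfa76pfcZLmcfl_Array".
const CDC_PATTERN = /^\$?cdc_/;

// Returns the own property names of `target` that look like ChromeDriver
// injection variables. In a page, call it with `window` and `document`.
function findCdcKeys(target) {
  return Object.getOwnPropertyNames(target).filter((key) => CDC_PATTERN.test(key));
}
```

In keeping with the rest of this guide, report the matching key names to the server instead of acting on them in the page.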
When a Chrome DevTools Protocol client calls Runtime.enable, the browser's runtime starts emitting Runtime.consoleAPICalled events and changes the behavior of console.* functions in detectable ways:
Function.prototype.toString.call(console.debug) returns the native string, but stack traces produced inside a console.debug call from page script have an extra frame in some Chromium builds. This has shifted across versions; build a current table from your own measurements.
performance.now() resolution is clamped to 100µs unless the page is cross-origin isolated, but a CDP-attached page may show different jitter characteristics depending on whether Emulation.setCPUThrottlingRate was called.
The presence of an attached debugger affects Error.stack formatting in narrow cases (long stack traces, async boundaries).
These signals shift across Chromium versions. Do not hardcode them. Maintain a per-version detection table refreshed from your honeypot fleet.
A more stable signal: when Playwright connects via connect_over_cdp, the resulting browser context has measurable properties at attach time — for example, the order in which Page.frameAttached events arrive vs Runtime.executionContextCreated differs slightly between human-launched DevTools and headless-attached automation. This is best detected at the target layer if you control browser builds (you usually don't) or via behavioral timing once the page loads.
window.chrome shape, Notification.permission === 'denied' while navigator.permissions.query({ name: 'notifications' }) reports 'prompt', and an empty navigator.plugins array remain partial signals against unmodified headless Chrome. Most antidetect operators do not use headless mode, so weight these low, but they catch the long tail.
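These low-weight hints can be evaluated server-side from a small snapshot. The snapshot field names below are hypothetical; the sketch assumes the client has already collected Notification.permission, the state returned by the Permissions API for notifications (which reports 'prompt' rather than 'default'), navigator.plugins.length, and whether window.chrome exists.

```javascript
// snapshot: {
//   notificationPermission, // Notification.permission
//   permissionsQueryState,  // state from navigator.permissions.query
//   pluginCount,            // navigator.plugins.length
//   hasWindowChrome,        // typeof window.chrome !== 'undefined'
// }
function headlessHints(snapshot) {
  const hints = [];
  if (snapshot.notificationPermission === 'denied' &&
      snapshot.permissionsQueryState === 'prompt') {
    hints.push('notification_permission_mismatch');
  }
  if (snapshot.pluginCount === 0) hints.push('empty_plugins');
  if (!snapshot.hasWindowChrome) hints.push('missing_window_chrome');
  return hints;
}
```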
This is where antidetect browsers struggle most, because the spoof has to happen below the browser process.
JA3 [5] is deprecated in favor of JA4 [6] (FoxIO, 2023+), which sorts cipher suites and extensions before hashing and is therefore more resilient to GREASE values and to randomized extension order. The relevant signals for bot detection:
JA4 — TLS ClientHello fingerprint. A request claiming UA Chrome/138 on Windows 11 must produce a JA4 from the small set Chrome 138 actually emits.
JA4H — HTTP request fingerprint (header order, cookie order, Accept-Language).
HTTP/2 fingerprint — Akamai's research [7] showed the SETTINGS frame, HEADERS pseudo-header order, and WINDOW_UPDATE values cluster tightly per browser engine.
If User-Agent: Chrome/138 arrives with a JA4 matching curl/8.x or Go's net/http, the fingerprint is forged at the application layer and the network stack betrayed it. This catches scripts that proxy a real browser's traffic through a non-browser TLS stack — a common mistake in budget operations.
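The UA-versus-JA4 comparison can be sketched as a table lookup. The baseline Map (Chrome major version to a Set of JA4 strings), the placeholder fingerprint values used with it, and the verdict labels are assumptions; real entries would come from your own observed traffic.

```javascript
// Extract the Chrome major version from a User-Agent string, or null.
// Note: other Chromium-based browsers also carry a "Chrome/NNN" token.
function chromeMajor(userAgent) {
  const match = /Chrome\/(\d+)/.exec(userAgent);
  return match ? Number(match[1]) : null;
}

// expectedByMajor: Map<number, Set<string>> of JA4 values seen per version.
function ja4Verdict(expectedByMajor, userAgent, ja4) {
  const major = chromeMajor(userAgent);
  if (major === null) return 'not_chrome_ua';
  const expected = expectedByMajor.get(major);
  if (!expected) return 'no_baseline';
  return expected.has(ja4) ? 'coherent' : 'mismatch';
}
```

Only 'mismatch' indicates a forged application layer; 'no_baseline' typically means a new browser release has outrun the table.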
Antidetect browsers that use a real Chromium binary pass this check by definition. JA4 is therefore a strong signal against script-driven bots and a weak signal against kernel-patched real-browser operators. Combine accordingly.
Standard practice, but a few things are worth pinning:
WebRTC ICE candidate enumeration leaks the local interface. Most antidetect browsers block it; blocking itself is a (mild) signal because real users rarely disable WebRTC.
Behavioral models catch what fingerprint models miss: a perfectly forged fingerprint driven by a script. Three signal families have held up well:
Pointer dynamics: Human mouse paths show jitter, sub-movements, and corrective overshoots. Bezier-curve-generated paths are too smooth in the second derivative. Train an LSTM or simple gradient-boosted classifier on (dx, dy, dt) triples; a single page-session of 30+ pointer events is usually enough.
Keystroke dynamics: Dwell time and flight time distributions per user are stable [8]; bots often produce uniform or normal distributions where real users produce log-normal.
Interaction Markov chains: Real users have characteristic transition probabilities between page elements (search → filter → click; or scroll → hover → click). Bots traverse pages in flat, breadth-first patterns. Cluster sessions by transition matrix and flag outliers.
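The transition matrix for the Markov-chain approach can be estimated from a session's ordered event labels. In this sketch the labels ('search', 'filter', 'click') and the nested-object output format are illustrative assumptions; clustering and outlier detection would run on top of these matrices.

```javascript
// sequence: ordered interaction labels for one session.
// Returns { from: { to: probability } } estimated from observed transitions.
function transitionMatrix(sequence) {
  const counts = new Map();
  for (let i = 0; i + 1 < sequence.length; i++) {
    const from = sequence[i];
    const to = sequence[i + 1];
    if (!counts.has(from)) counts.set(from, new Map());
    const row = counts.get(from);
    row.set(to, (row.get(to) || 0) + 1);
  }
  const matrix = {};
  for (const [from, row] of counts) {
    let total = 0;
    for (const count of row.values()) total += count;
    matrix[from] = {};
    // Normalize each row so its probabilities sum to 1.
    for (const [to, count] of row) matrix[from][to] = count / total;
  }
  return matrix;
}
```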
Critically: behavioral signals must be evaluated server-side on raw event streams. Any client-side scoring is bypassable in 30 minutes.
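Two server-side features that could feed the pointer and keystroke models are sketched below. Both are simple hand-built statistics chosen for illustration, not the trained classifiers themselves: pointerRoughness measures how much acceleration changes between consecutive samples of (dx, dy, dt) triples, and skewness separates right-skewed (log-normal-like) dwell times from symmetric ones. The function names and event field names are assumptions.

```javascript
// events: array of { dx, dy, dt } with dt in milliseconds.
// Returns the mean absolute change in acceleration, or null if too short.
function pointerRoughness(events) {
  const usable = events.filter((e) => e.dt > 0);
  const speeds = usable.map((e) => Math.hypot(e.dx, e.dy) / e.dt);
  const accels = [];
  for (let i = 1; i < speeds.length; i++) {
    accels.push((speeds[i] - speeds[i - 1]) / usable[i].dt);
  }
  if (accels.length < 2) return null;
  let sum = 0;
  for (let i = 1; i < accels.length; i++) {
    sum += Math.abs(accels[i] - accels[i - 1]);
  }
  return sum / (accels.length - 1);
}

// Population skewness of a sample, or null if fewer than 3 values.
// Near 0 for uniform or normal data, positive for log-normal-like data.
function skewness(values) {
  const n = values.length;
  if (n < 3) return null;
  const mean = values.reduce((s, v) => s + v, 0) / n;
  let m2 = 0;
  let m3 = 0;
  for (const v of values) {
    const d = v - mean;
    m2 += d * d;
    m3 += d * d * d;
  }
  m2 /= n;
  m3 /= n;
  if (m2 === 0) return 0;
  return m3 / Math.pow(m2, 1.5);
}
```

A roughness near zero over a long path, or dwell-time skewness near zero over many keystrokes, would be inputs to the model, not block decisions on their own.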
A single-signal block-or-allow gate produces unacceptable false-positive rates. The architecture that works:
Two things that are easy to get wrong:
Friction, not blocks, is the default for the middle band. A high-confidence fingerprint coherence failure with a clean behavioral signature might be a privacy-conscious user with Brave's strict shields — friction is recoverable, blocking is a support ticket.
Score per surface, not per user. The same risk score should produce different actions on a login (high friction) vs a public listing page (low friction).
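The two points above (friction as the middle band, thresholds per surface) can be sketched as a small decision function. The surface names and threshold values are invented for illustration and would need tuning against your own labeled data.

```javascript
// Hypothetical per-surface thresholds on a risk score in [0, 1].
const THRESHOLDS = {
  login: { friction: 0.3, block: 0.8 },
  listing: { friction: 0.6, block: 0.95 },
};

// Maps a risk score to an action for the given surface.
function decide(surface, score) {
  const t = THRESHOLDS[surface];
  if (!t) throw new Error(`unknown surface: ${surface}`);
  if (score >= t.block) return 'block';
  if (score >= t.friction) return 'friction';
  return 'allow';
}
```

With these example values, a score of 0.5 triggers friction on login but passes on a public listing page.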
For implementation, a gradient-boosted tree (XGBoost/LightGBM) on the raw signals beats hand-tuned weights once you have a few thousand labeled sessions. Label sources: chargebacks, reported fake accounts, Trust & Safety review queues.
Honest list, because the field is full of stale advice:
navigator.webdriver === true as a primary signal — every commercial antidetect browser overrides this. Use as a tripwire only.
Blocking known datacenter ASNs alone — residential proxy networks (some of which are themselves a bot-net problem [9]) defeat this.
CAPTCHA as a first-line defense — solving services are cheap and fast. Use CAPTCHA as the friction step in the middle band, not the gate.
window.chrome shape checks — well-documented and patched by every serious operator since ~2022.
User-Agent string matching for bot detection — UA-CH improves things at the margin, but UA alone is decoration.
Q: We see thousands of canvas hashes per day that match no known device tuple. False positive or real?
A: Some of both. New devices launch, and Chromium minor updates shift hashes. Maintain a rolling 30-day baseline and a "new tuple" allowance period. The signal is delta in unknown-tuple rate per region, not the absolute count.
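The rate comparison can be sketched as follows, run once per region. The input shape (daily { unknown, total } counts for the baseline window and for today) and the function name are assumptions; the 30-day window is enforced by whatever slice of history the caller passes in.

```javascript
// history: array of { unknown, total } daily counts (e.g. the last 30 days)
// today:   { unknown, total } for the current day
// Returns today's unknown-tuple rate minus the baseline rate, or null
// when either side has no traffic.
function unknownRateDelta(history, today) {
  let unknown = 0;
  let total = 0;
  for (const day of history) {
    unknown += day.unknown;
    total += day.total;
  }
  if (total === 0 || today.total === 0) return null;
  return today.unknown / today.total - unknown / total;
}
```

A sustained positive delta in one region is the pattern worth alerting on; a single noisy day usually is not.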
Q: Is JA4 enough to detect Playwright?
A: No. Playwright launching a real Chromium binary produces a real-Chromium JA4. JA4 catches non-browser TLS stacks (Go, Python requests, curl-impersonate variants if uncalibrated). For real-Chromium automation, you need Tier 1 + Tier 2 + Tier 4.
Q: Should we publish our detection signals?
A: Publish the categories and the defensive philosophy. Do not publish the version-specific tables (Chromium build → expected audio hash, etc.). Operators read security blogs too; the table is the moat.
Q: How do we handle privacy-respecting users (Brave, Tor, Librewolf)?
A: They will fail several coherence checks by design — that is the point of those browsers. Two practical mitigations: (1) maintain allow-listed coherence profiles for these browsers, and (2) make the friction band genuinely passable for real humans (a single CAPTCHA, not a ten-step verification).
Q: Where do behavioral models break?
A: Cold start (first session) and on accessibility tools. Always carve out exceptions for prefers-reduced-motion, screen reader signals, and switch-control input patterns. False-positiving a disabled user is both a product failure and, in some jurisdictions, a legal one.
[1]: Eckersley, P. How Unique Is Your Web Browser? PETS 2010 https://coveryourtracks.eff.org/static/browser-uniqueness.pdf
[2]: Acar, G. et al. The Web Never Forgets: Persistent Tracking Mechanisms in the Wild. CCS 2014.
[3]: Fingerprint.js open-source library https://github.com/fingerprintjs/fingerprintjs
[4]: ChromeDriver source (Chromium repository), injection variables https://chromium.googlesource.com/chromium/src/+/refs/heads/main/chrome/test/chromedriver/
[5]: Salesforce JA3 — TLS client fingerprinting https://github.com/salesforce/ja3
[6]: FoxIO JA4+ specification https://github.com/FoxIO-LLC/ja4