A 4,279-Extension Fingerprint: Reconstructing LinkedIn’s Chrome Extension Probing

Introduction

While inspecting client-side behavior on LinkedIn, I identified a routine that attempts to load resources from the chrome-extension:// protocol to detect installed browser extensions.

This is not a small compatibility check. The script iterates over a dataset of 4,279 extension identifiers, probing each one directly in the browser.

I extracted this dataset, resolved it against the Chrome Web Store, and analyzed its structure and distribution.

The full dataset, scraper, and analysis pipeline are available in the accompanying repository: https://github.com/leopoletto/linkedin-chrome-extension-probing

The Mechanism

The probing list is embedded client-side:

const o = [
  { id: "aaaeoelkococjpgngfokhbkkfiiegolp", file: "icon/16.png" },
  { id: "aabfjmnamlihmlicgeoogldnfaaklfon", file: "Assets/48.png" },
  { id: "aacbpggdjcblgnmgjgpkpddliddineni", file: "sidebar.html" },
  // ...
];

Each entry is tested via:

fetch(`chrome-extension://${id}/${file}`)

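If the request succeeds, the extension is installed. Each probe therefore produces a deterministic presence signal for that extension. A failed request is weaker evidence: the extension may be absent, or it may simply not expose that file to web pages.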
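A single probe can be sketched as follows. This is a minimal illustration, not LinkedIn's code: the function name probeExtension and the injectable fetchImpl parameter are hypothetical, and any rejected request is treated as "not detected".

```javascript
// Hypothetical helper (not taken from LinkedIn's script).
// Probes one { id, file } entry and resolves to true when the fetch succeeds.
// fetchImpl is injectable so the logic can be exercised outside a browser.
async function probeExtension(entry, fetchImpl = globalThis.fetch) {
  const url = `chrome-extension://${entry.id}/${entry.file}`;
  try {
    await fetchImpl(url); // resolves only if the resource could be loaded
    return true;
  } catch (err) {
    // Rejected: extension absent, or the file is not exposed to web pages.
    return false;
  }
}
```

In a browser it would be called as probeExtension({ id, file }) for each entry in the list; the stubbed fetchImpl exists only to make the sketch testable.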

Dataset Reconstruction

I extracted the array into JSON and resolved each ID using a Playwright-based scraper.
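Before resolving anything, it helps to sanity-check the extracted entries. The sketch below relies on the fact that Chrome extension IDs are 32 characters drawn from the letters a through p. The names isValidExtensionId, webStoreUrl and prepareEntries are hypothetical, and the Web Store URL pattern is an assumption about what a scraper might visit, not necessarily what the repository's scraper does.

```javascript
// Chrome extension IDs are 32 characters from the alphabet a-p.
const EXTENSION_ID_PATTERN = /^[a-p]{32}$/;

function isValidExtensionId(id) {
  return typeof id === "string" && EXTENSION_ID_PATTERN.test(id);
}

// Assumed listing URL pattern; adjust if the store layout differs.
function webStoreUrl(id) {
  return `https://chromewebstore.google.com/detail/${id}`;
}

// Drops malformed and duplicate IDs, and attaches the URL to resolve.
function prepareEntries(entries) {
  const seen = new Set();
  const result = [];
  for (const entry of entries) {
    if (!isValidExtensionId(entry.id) || seen.has(entry.id)) continue;
    seen.add(entry.id);
    result.push({ ...entry, url: webStoreUrl(entry.id) });
  }
  return result;
}
```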

The dataset appears to prioritize coverage breadth over precision, maximizing the likelihood of detecting rare and distinguishing extensions.

Results

Total extensions in list: 4,279
Found in Web Store: 4,100
Not found/removed: 179

Interpretation

High coverage (~96%, 4,100 of 4,279) → suggests the dataset is recent or actively maintained
Removed entries (179) → suggest historical accumulation without pruning
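The coverage figure follows directly from the counts in the Results section. A small sketch of the arithmetic, where summarizeResolution is a hypothetical helper name:

```javascript
// Derives the missing count and coverage percentage (one decimal place)
// from the total number of IDs and the number resolved in the Web Store.
function summarizeResolution(total, found) {
  const missing = total - found;
  const coveragePct = Math.round((found / total) * 1000) / 10;
  return { total, found, missing, coveragePct };
}

const summary = summarizeResolution(4279, 4100);
console.log(summary.missing);     // 179
console.log(summary.coveragePct); // 95.8
```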

This is not a tightly curated list.

Execution Characteristics

From runtime observation:

More than 4,000 fetch requests are triggered
Requests execute asynchronously (idle callback/batching)
There is no early exit: the full dataset is traversed

This is not sampling, not a heuristic, and not a minimal check. It is a complete enumeration of a predefined dataset.
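The idle-callback batching pattern observed here can be sketched as below. This illustrates the general technique only: the batch size, the function names chunk and runInIdleBatches, and the setTimeout fallback are all assumptions, not details recovered from LinkedIn's script.

```javascript
// Splits a list into fixed-size batches.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Uses requestIdleCallback where available, otherwise falls back to setTimeout.
const scheduleIdle = typeof requestIdleCallback === "function"
  ? (cb) => requestIdleCallback(cb)
  : (cb) => setTimeout(cb, 0);

// Processes every batch, one per idle slot, with no early exit.
// Resolves with the number of batches handled.
function runInIdleBatches(items, size, handleBatch) {
  const batches = chunk(items, size);
  return new Promise((resolve) => {
    let index = 0;
    const step = () => {
      if (index >= batches.length) return resolve(index);
      handleBatch(batches[index++]);
      scheduleIdle(step);
    };
    scheduleIdle(step);
  });
}
```

Spreading the work across idle slots keeps thousands of probes from blocking the main thread, which is consistent with the traversal completing without a visible pause.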

Request Initiator Chain

During inspection in Chrome DevTools, the extension probing requests originate from the following resource chain:

static.licdn.com/.../CfnLU5Sb.js
  → linkedin.com/preload/
    → static.licdn.com/.../18jojqmltgzmqumon771ha07q 
      → chrome-extension://invalid/

Interpretation

The probing logic is delivered via LinkedIn static assets
Execution is triggered during preload/runtime initialization
The chrome-extension://invalid/ requests appear when extension resources are not resolved

This provides a traceable origin of the behavior within LinkedIn’s client-side delivery pipeline.

Observation based on runtime inspection; may vary across builds or regions.

Distribution: Head vs Long Tail

High-adoption segment (≥100k users)

Only 69 extensions fall into this category, including:

Adobe Acrobat (336M)
Grammarly (43M)
Loom (8M)
DeepL (4M)

These represent:

Mainstream productivity tools
Writing/AI assistants
CRM / outreach tools

Long tail (majority of dataset)

Thousands of extensions fall below 100k users.

This includes:

Niche tools
Indie projects
Experimental extensions
Dev builds

This long tail is the primary source of high-entropy identification.
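The head versus long tail split can be reproduced with a simple threshold. The sketch assumes each resolved record carries a numeric users field (or null when undisclosed); that record shape and the name segmentByAdoption are assumptions for illustration.

```javascript
// Splits resolved records into head (>= threshold users), long tail,
// and records whose user count is not disclosed.
function segmentByAdoption(records, threshold = 100000) {
  const segments = { head: [], longTail: [], undisclosed: [] };
  for (const record of records) {
    if (typeof record.users !== "number") {
      segments.undisclosed.push(record);
    } else if (record.users >= threshold) {
      segments.head.push(record);
    } else {
      segments.longTail.push(record);
    }
  }
  return segments;
}
```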

Undisclosed / Private Segment (Important)

Some extensions returned users: null but have ratings.

Example:

{
  "title": "Comment Whisper - AI Comment Generator",
  "rating": 4.5,
  "ratingsCount": "4 ratings",
  "users": null
}

A missing user count is therefore not the same as zero users. It most likely indicates:

Private/unlisted extensions
Enterprise tools
Restricted distribution
Early-stage products
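The "null users but has ratings" heuristic can be expressed as a small classifier over records shaped like the example above. The labels and the function names parseRatingsCount and classifyVisibility are hypothetical, and the heuristic itself is an inference rather than a confirmed property of the Web Store.

```javascript
// Parses strings such as "4 ratings" or "1,234 ratings" into a number.
// Returns 0 when the text is missing or does not match.
function parseRatingsCount(text) {
  const match = /^([\d,]+)\s+ratings?$/.exec(String(text ?? "").trim());
  return match ? Number(match[1].replace(/,/g, "")) : 0;
}

// "public": user count disclosed.
// "probably-private": no user count, but at least one rating.
// "unknown": neither signal present.
function classifyVisibility(record) {
  if (typeof record.users === "number") return "public";
  const hasRatings =
    record.rating != null && parseRatingsCount(record.ratingsCount) > 0;
  return hasRatings ? "probably-private" : "unknown";
}
```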

Dataset impact

Extensions with undisclosed user counts: ~50
“Probably private” (null users + ratings): 222

Category Signals

Across both public and private segments, extensions cluster around:

Outreach/prospecting
Job automation / auto-apply
AI writing and messaging
CRM integrations
Scraping / data extraction

Examples include:

“LinkedIn AI Assistant”
“Auto Apply Job Copilot”
“CRM Connector”
“AI Comment Generator”

These categories closely align with LinkedIn’s core workflows:

Recruiting
Job search
Sales outreach
Content creation

Privacy Capability Surface (Clarified)

From Chrome Web Store disclosures, these extensions collectively declare access to:

Personally identifiable information (1,754)
Website content (1,642)
Authentication information (950)
User activity (568)
Personal communications (339)

Important clarification:

This reflects what the extensions can access, not what LinkedIn collects.

What This Enables

1) Deterministic extension detection

Unlike canvas or audio fingerprinting, no inference or noise is involved: each probe yields a binary result.

2) High-entropy fingerprinting

Extension combinations are sparse, user-specific and highly distinguishing. Including rare or private extensions increases uniqueness significantly.
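Why rare extensions matter can be made concrete with surprisal: an extension installed by a fraction p of users contributes about -log2(p) bits when detected. The sketch below sums those contributions under the simplifying assumption that installs are independent. The function names are hypothetical and any rates passed in are illustrative inputs, not measurements from this dataset.

```javascript
// Information (in bits) revealed by detecting an extension that a
// fraction installRate of users have installed.
function surprisalBits(installRate) {
  if (!(installRate > 0 && installRate <= 1)) {
    throw new RangeError("installRate must be in (0, 1]");
  }
  return -Math.log2(installRate);
}

// Total bits for a set of detected extensions, assuming independence.
function fingerprintBits(installRates) {
  return installRates.reduce((sum, p) => sum + surprisalBits(p), 0);
}
```

Under this model an extension used by 1 in 1,000 users contributes roughly 10 bits, while one used by half of all users contributes only 1 bit, which is why the long tail dominates uniqueness.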

3) Capability inference (key insight)

Extensions reveal more than identity. They reveal user workflows, tooling sophistication and automation capabilities.

Examples:

CRM tools → sales workflows
scraping tools → automation
AI assistants → content generation

This enables profiling based on capability, not just identity.

Chrome-Specific Behavior

This mechanism depends on the chrome-extension:// scheme, so it applies to Chromium-based browsers and not to Firefox (which uses moz-extension://).

What This Is (Precise Framing)

A client-side enumeration routine that iterates over a large, automatically maintained dataset of Chrome extension identifiers, probing each via the chrome-extension:// protocol to determine installation presence.

What this is not

This is not a vulnerability, proof of malicious intent, or user-specific targeting. It is an observable behavior that uses available browser capabilities.

External Context

A broader public investigation has explored similar behavior at scale:

https://browsergate.eu/

That work reports large-scale extension probing and discusses potential implications.

This article focuses strictly on:

Observable client-side behavior
Dataset reconstruction
Distribution analysis

No assumptions are made about intent or server-side usage.

Final Thoughts

Most discussions about browser tracking focus on cookies, localStorage and third-party scripts.

This mechanism operates differently. It actively enumerates the browser environment using a large, prebuilt dataset, at scale and entirely client-side.

Closing Insight

The dataset’s structure, combining high-adoption tools, niche extensions, and private distributions, suggests broad ingestion rather than selective curation.

This transforms extension detection from a simple compatibility check into a high-entropy, capability-aware signal about the user’s environment.

Full dataset and reproducibility details are available in the repository: https://github.com/leopoletto/linkedin-chrome-extension-probing.