Leonardo Poletto

Hi there, Software Engineer, focusing on web standards, accessibility, privacy, and performance.
Creates tools for insights and empowers teams through education.


A 4,279-Extension Fingerprint: Reconstructing LinkedIn’s Chrome Extension Probing

LinkedIn Enumerates 4,279 Chrome Extensions via chrome-extension:// — Dataset, Scraper & Analysis

Introduction

While inspecting client-side behavior on LinkedIn, I identified a routine that attempts to load resources from the chrome-extension:// protocol to detect installed browser extensions.

This is not a small compatibility check. The script iterates over a dataset of 4,279 extension identifiers, probing each one directly in the browser.

I extracted this dataset, resolved it against the Chrome Web Store, and analyzed its structure and distribution.

The full dataset, scraper, and analysis pipeline are available in the accompanying repository: https://github.com/leopoletto/linkedin-chrome-extension-probing

The Mechanism

The probing list is embedded client-side:

const o = [
  { id: "aaaeoelkococjpgngfokhbkkfiiegolp", file: "icon/16.png" },
  { id: "aabfjmnamlihmlicgeoogldnfaaklfon", file: "Assets/48.png" },
  { id: "aacbpggdjcblgnmgjgpkpddliddineni", file: "sidebar.html" },
  // ...
];

Each entry is tested via:

fetch(`chrome-extension://${id}/${file}`)

If the request succeeds, the extension is installed. Each probe therefore produces a deterministic presence signal for that extension.
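A minimal sketch of a single probe, assuming the entry shape shown above (`id`, `file`). The name `probeExtension` and the injectable `fetchImpl` parameter are illustrative and are not taken from LinkedIn's code; the injection exists only so the logic can be exercised outside a browser. In Chrome, such a request resolves only when the extension is installed and exposes that file as a web-accessible resource; otherwise it rejects.

```javascript
// Hypothetical helper: resolves to true when the probe request succeeds.
async function probeExtension(entry, fetchImpl = globalThis.fetch) {
  const url = `chrome-extension://${entry.id}/${entry.file}`;
  try {
    await fetchImpl(url);
    return true; // request resolved: the resource exists, so the extension is present
  } catch {
    return false; // request rejected: not installed, or the file is not web-accessible
  }
}
```

A `false` result is therefore weaker evidence than a `true` one: it cannot distinguish "not installed" from "installed, but the probed file is not exposed".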

Dataset Reconstruction

I extracted the array into JSON and resolved each ID using a Playwright-based scraper.
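Before scraping, the extracted entries can be validated and deduplicated. The sketch below is not the repository's scraper: `planScrape` and `storeUrl` are hypothetical names, and the listing URL shape is an assumption. The only firm rule it relies on is that Chrome extension IDs are 32 characters drawn from the letters a to p.

```javascript
// Chrome extension IDs are 32 characters drawn from the letters a-p.
const EXTENSION_ID = /^[a-p]{32}$/;

// Assumed URL shape for a Chrome Web Store listing page.
function storeUrl(id) {
  return `https://chromewebstore.google.com/detail/${id}`;
}

// Split raw entries into deduplicated scrape targets and malformed IDs.
function planScrape(entries) {
  const seen = new Set();
  const targets = [];
  const malformed = [];
  for (const { id } of entries) {
    if (!EXTENSION_ID.test(id)) {
      malformed.push(id);
      continue;
    }
    if (seen.has(id)) continue; // skip duplicates
    seen.add(id);
    targets.push({ id, url: storeUrl(id) });
  }
  return { targets, malformed };
}
```

Each `targets[i].url` would then be handed to whatever browser automation performs the lookup, such as a Playwright page.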

The dataset appears to prioritize coverage breadth over precision, maximizing the likelihood of detecting rare and distinguishing extensions.

Results

  • Total extensions in list: 4,279
  • Found in Web Store: 4,100
  • Not found/removed: 179

Interpretation

  • High coverage (~96%) → the dataset is recent or actively maintained
  • Removed entries (179) → historical accumulation and a lack of pruning

This is not a tightly curated list.
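The coverage figure can be reproduced from the counts above. A small sketch, where `summarize` is an illustrative name:

```javascript
// Derive the removed count and coverage percentage (one decimal) from the totals.
function summarize(total, found) {
  const removed = total - found;
  const coveragePct = Math.round((found / total) * 1000) / 10;
  return { removed, coveragePct };
}

console.log(summarize(4279, 4100)); // { removed: 179, coveragePct: 95.8 }
```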

Execution Characteristics

From runtime observation:

  • ~4,000+ fetch requests triggered
  • Executed asynchronously (idle callback/batching)
  • No early exit: the full dataset is traversed

This is not sampling, not a heuristic, and not a minimal check. It is a complete enumeration of a predefined dataset.
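The observed pattern (batched, idle-time execution with full traversal) can be sketched as follows. This is a reconstruction of the general technique, not LinkedIn's code: the names `toBatches`, `nextIdle`, and `enumerateAll` and the batch size of 50 are assumptions, and `probe` is any function that resolves to a boolean for one entry.

```javascript
// Split a list into consecutive batches of at most `size` items.
function toBatches(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Wait for an idle period; fall back to a timer where requestIdleCallback is missing.
function nextIdle() {
  return new Promise((resolve) => {
    if (typeof requestIdleCallback === "function") {
      requestIdleCallback(() => resolve());
    } else {
      setTimeout(resolve, 0);
    }
  });
}

// Probe every entry, one batch per idle slot, with no early exit.
async function enumerateAll(entries, probe, batchSize = 50) {
  const present = [];
  for (const batch of toBatches(entries, batchSize)) {
    await nextIdle();
    const results = await Promise.all(batch.map((entry) => probe(entry)));
    batch.forEach((entry, i) => {
      if (results[i]) present.push(entry.id);
    });
  }
  return present;
}
```

Deferring each batch to idle time keeps thousands of requests from blocking page interaction, which would fit with the probing being easy to miss without DevTools open.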

Request Initiator Chain

During inspection in Chrome DevTools, the extension probing requests originate from the following resource chain:

static.licdn.com/.../CfnLU5Sb.js
  → linkedin.com/preload/
    → static.licdn.com/.../18jojqmltgzmqumon771ha07q 
      → chrome-extension://invalid/

Interpretation

  • The probing logic is delivered via LinkedIn static assets
  • Execution is triggered during preload/runtime initialization
  • The chrome-extension://invalid/ requests appear when extension resources are not resolved

This provides a traceable origin of the behavior within LinkedIn’s client-side delivery pipeline.

Observation based on runtime inspection; may vary across builds or regions.

Distribution: Head vs Long Tail

High-adoption segment (≥100k users)

Only 69 extensions fall into this category, including:

  • Adobe Acrobat (336M)
  • Grammarly (43M)
  • Loom (8M)
  • DeepL (4M)

These represent:

  • Mainstream productivity tools
  • Writing/AI assistants
  • CRM / outreach tools

Long tail (majority of dataset)

Thousands of extensions fall below 100k users.

This includes:

  • Niche tools
  • Indie projects
  • Experimental extensions
  • Dev builds

This long tail is the primary source of high-entropy identification.

Undisclosed / Private Segment (Important)

Some extensions returned users: null but have ratings.

Example:

{
  "title": "Comment Whisper - AI Comment Generator",
  "rating": 4.5,
  "ratingsCount": "4 ratings",
  "users": null
}

A missing user count is different from zero users. It most likely indicates:

  • Private/unlisted extensions
  • Enterprise tools
  • Restricted distribution
  • Early-stage products

Dataset impact

  • Extensions with undisclosed user counts: ~50
  • “Probably private” (null users + ratings): 222
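A sketch of how scraped records could be bucketed into the segments discussed so far. It assumes the record fields from the example above and that `users` has already been normalized to a number or `null`; the segment labels and the function name `segment` are illustrative, not the repository's.

```javascript
// Threshold for the high-adoption segment, as used in this article.
const HIGH_ADOPTION_THRESHOLD = 100000;

// Assign one record to a segment based on its user count and rating.
function segment(record) {
  if (typeof record.users === "number") {
    return record.users >= HIGH_ADOPTION_THRESHOLD ? "high-adoption" : "long-tail";
  }
  // users is null: the store page showed no user count.
  // A rating without a user count is the "probably private" signal.
  return record.rating != null ? "probably-private" : "unknown";
}
```

Keeping `null` distinct from `0` matters here: coercing missing counts to zero would fold these records into the long tail and hide the segment entirely.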

Category Signals

Across both public and private segments, extensions cluster around:

  • Outreach/prospecting
  • Job automation / auto-apply
  • AI writing and messaging
  • CRM integrations
  • Scraping / data extraction

Examples include:

  • “LinkedIn AI Assistant”
  • “Auto Apply Job Copilot”
  • “CRM Connector”
  • “AI Comment Generator”

These categories closely align with LinkedIn’s core workflows:

  • Recruiting
  • Job search
  • Sales outreach
  • Content creation
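One simple way to surface such clusters is keyword tagging of extension titles. The sketch below is a rough illustration only: the patterns are hypothetical, chosen to match the example titles in this section, and are not the method used to produce the analysis.

```javascript
// Hypothetical keyword patterns, one per category named above.
const CATEGORY_PATTERNS = [
  ["Outreach/prospecting", /\b(outreach|prospect\w*)\b/i],
  ["Job automation / auto-apply", /\b(auto[- ]?apply|job)\b/i],
  ["AI writing and messaging", /\bai\b/i],
  ["CRM integrations", /\bcrm\b/i],
  ["Scraping / data extraction", /\b(scrap\w*|extract\w*)\b/i],
];

// Return every category whose pattern matches the title (possibly none).
function tagTitle(title) {
  return CATEGORY_PATTERNS
    .filter(([, pattern]) => pattern.test(title))
    .map(([label]) => label);
}
```

A title can match several categories or none, so any counts derived this way are indicative rather than exact.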

Privacy Capability Surface (Clarified)

From Chrome Web Store disclosures, these extensions collectively declare access to:

  • Personally identifiable information (1,754)
  • Website content (1,642)
  • Authentication information (950)
  • User activity (568)
  • Personal communications (339)

Important clarification:

This reflects what the extensions can access, not what LinkedIn collects.

What This Enables

1) Deterministic extension detection

Unlike canvas or audio fingerprinting, no inference is required and no noise is involved: each probe returns a binary result.

2) High-entropy fingerprinting

Extension combinations are sparse, user-specific and highly distinguishing. Including rare or private extensions increases uniqueness significantly.
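To make the entropy argument concrete, the information carried by one detected extension can be approximated by its surprisal, -log2(users / population). The sketch below is a simplification under stated assumptions: `population` is an illustrative figure to be supplied by the reader, installs are treated as independent, and the function names are hypothetical.

```javascript
// Surprisal, in bits, of seeing an extension installed by `users` out of `population`.
function surprisalBits(users, population) {
  return -Math.log2(users / population);
}

// Total bits for a set of detected extensions, assuming independent installs.
function fingerprintBits(userCounts, population) {
  return userCounts.reduce((sum, users) => sum + surprisalBits(users, population), 0);
}
```

Under this model, an extension used by half the population contributes 1 bit, while one used by a one-in-a-million share contributes about 20 bits, which is why rare extensions dominate the fingerprint.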

3) Capability inference (key insight)

Extensions reveal more than identity. They reveal user workflows, tooling sophistication and automation capabilities.

Examples:

  • CRM tools → sales workflows
  • scraping tools → automation
  • AI assistants → content generation

This enables profiling based on capability, not just identity.

Chrome-Specific Behavior

This mechanism depends on chrome-extension://, so it applies to Chromium-based browsers and not to Firefox, which uses moz-extension://.

What This Is (Precise Framing)

A client-side enumeration routine that iterates over a large, seemingly automatically maintained dataset of Chrome extension identifiers, probing each via the chrome-extension:// protocol to determine installation presence.

What this is not

This is not a vulnerability, proof of malicious intent, or user-specific targeting. It is an observable behavior that uses available browser capabilities.

External Context

A broader public investigation has explored similar behavior at scale.

That work reports large-scale extension probing and discusses potential implications.

This article focuses strictly on:

  • Observable client-side behavior
  • Dataset reconstruction
  • Distribution analysis

No assumptions are made about intent or server-side usage.

Final Thoughts

Most discussions about browser tracking focus on cookies, localStorage and third-party scripts.

This mechanism operates differently. It actively enumerates the browser environment against a large, prebuilt dataset, at scale and entirely client-side.

Closing Insight

The dataset’s structure, combining high-adoption tools, niche extensions, and private distributions, suggests broad ingestion rather than selective curation.

This transforms extension detection from a simple compatibility check into a high-entropy, capability-aware signal about the user’s environment.

Full dataset and reproducibility details are available in the repository: https://github.com/leopoletto/linkedin-chrome-extension-probing.