Leonardo Poletto

Hi there! I'm a software engineer focusing on web standards, accessibility, privacy, and performance.
I create tools for insights and empower teams through education.

The Technical State of Web Privacy: A 2025 Data-Driven Analysis

A data-driven analysis of how modern browsers, trackers, and standards interact to recognize users—examining cookies, fingerprinting, consent frameworks, and security controls shaping web privacy in 2025.

1. Introduction: The Observational Reality of Web Tracking

The modern web is functionally stateful: recognizing users across sessions began as a requirement for session management and has become a standard component of data collection. Measuring this behavior establishes a baseline for user privacy in an ecosystem where data gathering is nearly universal. Following the 2025 Web Almanac, this analysis distinguishes "stateful" tracking, which persists identity in client-side storage such as cookies and localStorage, from "stateless" tracking, often called fingerprinting, which identifies users from the entropy of system configurations observed at runtime.

Tracking remains the norm on the web. In the July 2025 crawl, approximately 75% of desktop and 74% of mobile pages contain at least one observed third-party tracker. Cookie behavior reflects this most clearly: third-party cookies consistently outnumber first-party cookies, especially on high-traffic sites, which shows how central stateful mechanisms remain to cross-site observation.

Source: 2025 Web Almanac – Cookies: First- and third-party prevalence by rank

2. Stateful Tracking: Cookies and Persistent Identity

Cookies began as a simple workaround for the statelessness of HTTP and have grown into infrastructure for persistent user recognition. First-party cookies are required for functional continuity, while third-party cookies remain the primary mechanism for cross-site tracking. Despite increased browser-level scrutiny, the data still shows heavy reliance on third-party state:

Cookie Type         Desktop Prevalence   Mobile Prevalence
First-Party (1P)    41%                  40%
Third-Party (3P)    59%                  60%

Source: 2025 Web Almanac - Cookies: First and third-party prevalence by client

Technical efforts to mitigate cross-site tracking show emerging adoption: the Partitioned attribute (CHIPS) appears on approximately 9% of third-party cookies, a slight increase from the previous year. However, SameSite=None, the attribute that opts a cookie into cross-site delivery, remains near-universal for third-party cookies, so they continue to be sent in cross-site contexts. Persistence is the critical identity anchor: the median expiration for both first- and third-party cookies is 365 days, and lifetimes often reach Chrome’s 400-day hard limit. Session cookies, which expire when the browser closes, account for only 19% of first-party and 7% of third-party cookies, confirming that the ecosystem is optimized for long-term identification rather than transient sessions. Partitioning confines a third-party cookie to the site where it was set, but a year-long lifetime still anchors identity within each site, and the small group of providers that dominates third-party infrastructure remains well placed to re-identify users.
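To make the attribute categories above concrete, here is a minimal Python sketch of how a single Set-Cookie header value could be classified. It is not the Web Almanac's actual methodology; the function name, the output keys, and the sample header are hypothetical, and only Max-Age (not Expires) is converted into a lifetime.

```python
from datetime import timedelta

# Chrome's cap on cookie lifetime, as cited in the text above.
MAX_LIFETIME = timedelta(days=400)


def classify_set_cookie(header: str) -> dict:
    """Classify one Set-Cookie header value (hypothetical helper)."""
    parts = [p.strip() for p in header.split(";")]
    attrs = {}
    for part in parts[1:]:  # parts[0] is the name=value pair
        key, _, value = part.partition("=")
        attrs[key.strip().lower()] = value.strip()

    max_age = attrs.get("max-age")
    has_expiry = max_age is not None or "expires" in attrs

    lifetime = None
    if max_age is not None and max_age.lstrip("-").isdigit():
        # Apply the 400-day cap to the declared lifetime.
        lifetime = min(timedelta(seconds=int(max_age)), MAX_LIFETIME)

    return {
        "name": parts[0].partition("=")[0],
        "partitioned": "partitioned" in attrs,                      # CHIPS opt-in
        "cross_site": attrs.get("samesite", "").lower() == "none",  # SameSite=None
        "session": not has_expiry,                                  # no Max-Age/Expires
        "lifetime_days": lifetime.days if lifetime is not None else None,
    }


if __name__ == "__main__":
    sample = "id=abc; Max-Age=31536000; Secure; SameSite=None; Partitioned"
    print(classify_set_cookie(sample))
```

A cookie with neither Max-Age nor Expires is treated as a session cookie, which matches the distinction drawn above between transient and long-lived identifiers.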

Source: 2025 Web Almanac - Cookies: Partitioned (CHIPS proposal)

3. Market Concentration: The Third-Party Infrastructure

The concentration of third-party providers represents a centralizing force in web privacy, where a remarkably small number of entities observe a vast majority of global web traffic. This concentration allows a single provider to synchronize user activity across millions of disparate domains.

The dominance of major providers is exemplified by their massive site reach:

  1. Google (via doubleclick.net, google.com, and youtube.com): Reaches at least 33% of all websites, with doubleclick.net alone present on 20% of sites.
  2. Meta (including facebook.com and Meta Pixel): Present on at least 22% of pages.
  3. Microsoft (including bing.com, clarity.ms, and linkedin.com): Coverage reaching at least 14% of sites.

Source: 2025 Web Almanac - Third Parties: Top third parties by the number of pages.

While CDNs dominate third-party infrastructure by prevalence, their role is primarily content delivery rather than user identification, though they still mediate request metadata:

  • CDNs: 74%
  • Advertising: 59%
  • Essential/Tag Managers: 55%

When stateful mechanisms are restricted, these concentrated entities often pivot toward stateless recognition methods to maintain observation.

Source: 2025 Web Almanac - Third Parties: Distribution of the third-party request categories by rank

4. Stateless Recognition: Browser Fingerprinting and Evasion

Stateless tracking, or fingerprinting, is a strategic alternative to cookies that relies on entropy-based identification. By generating identifiers "on the fly" from system configurations, trackers can recognize users without relying on local storage, bypassing user-accessible "opt-out" mechanisms like cookie clearing.
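The idea of deriving an identifier "on the fly" can be illustrated with a short Python sketch. This is a conceptual model only, not how FingerprintJS or any other library works; the signal names and values are hypothetical, and real fingerprinting collects its signals in the browser rather than on the server.

```python
import hashlib
import json


def fingerprint(signals: dict) -> str:
    """Derive a stable identifier from observed configuration signals.

    Nothing is stored on the device: the same signals always hash to the
    same value, so clearing cookies does not change the result.
    """
    # Canonical serialization so key order does not affect the hash.
    canonical = json.dumps(signals, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


if __name__ == "__main__":
    # Hypothetical signal values, for illustration only.
    visitor = {
        "language": "en-GB",
        "timezone": "Europe/Rome",
        "screen": "2560x1440",
        "fonts": ["Arial", "Fira Code", "Noto Sans"],
    }
    print(fingerprint(visitor))
```

The more distinctive the combination of signals, the fewer users share a given hash, which is what makes the identifier useful for recognition.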

The adoption of dedicated fingerprinting libraries remains a niche but persistent threat:

  • FingerprintJS: The dominant library, identified on 0.59% of mobile sites.
  • ClientJS: A secondary competitor with significantly lower adoption (0.04%).

Evasion techniques such as bounce tracking, which briefly redirects users through a tracking domain, show extremely low prevalence (e.g., medium.com at 0.0003%). This low prevalence is consistent with browser-level mitigations, specifically Chrome’s bounce tracking protections, altering the landscape.

The significance of stateless methods lies in their ability to build a unique profile of a user from hardware configurations, system fonts, and language settings. Unlike cookies, these signals are often fundamental to how the device functions, which makes them difficult to block without degrading the user experience. This technical reality has accelerated the shift toward browser-led privacy policies.

5. Browser-Led Privacy Policies: Client Hints and Referrers

Browser vendors are moving from passive data exposure to explicit, request-based signals. This is most evident in the migration from the high-entropy User-Agent string to User-Agent Client Hints, which require servers to explicitly request detailed device information.

Current adoption rates for Client Hints indicate a steady but concentrated transition:

Environment / Hint            Adoption Rate   Primary Use Case
Desktop                       3.3%            Browser/OS Feature Detection
Mobile                        5.1%            Responsive Design / Debugging
sec-ch-ua-platform-version    4.28%           OS Compatibility
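The request-based model behind Client Hints can be sketched in Python as follows. The helper names are hypothetical, and the parser is a simplification that handles only the common brand-list shape of Sec-CH-UA rather than the full Structured Field Values grammar.

```python
import re


def accept_ch_header(hints: list[str]) -> dict:
    """Build the response header a server sends to opt in to specific hints."""
    return {"Accept-CH": ", ".join(hints)}


def parse_sec_ch_ua(value: str) -> dict:
    """Extract brand -> major version pairs from a Sec-CH-UA request header.

    Simplified: assumes entries of the form "Brand";v="123".
    """
    return dict(re.findall(r'"([^"]*)";\s*v="([^"]*)"', value))


if __name__ == "__main__":
    # The server asks for the platform version; the browser only sends it
    # on later requests if the hint was requested and is permitted.
    print(accept_ch_header(["Sec-CH-UA-Platform-Version"]))

    # Illustrative header value, not taken from the crawl data.
    print(parse_sec_ch_ua('"Chromium";v="124", "Not-A.Brand";v="99"'))
```

Low-entropy hints such as Sec-CH-UA are sent by default in supporting browsers, while higher-entropy values like the platform version arrive only after the server requests them with Accept-CH.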

Referrer policies let sites limit how much of a browsing path leaks to third parties. The most common value is strict-origin-when-cross-origin (5.69%), which sends only the origin, without the path or query string, on cross-origin requests.

Origin trials, in turn, act as temporary bridges for legacy implementations. The DisableThirdPartyStoragePartitioning trial is the most widely adopted (12.33%); it gives sites not yet prepared for the browser's mandatory storage isolation a temporary reprieve. These technical signals eventually interface with formal, though often flawed, consent frameworks.
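The behavior of strict-origin-when-cross-origin described above can be approximated with a small Python function. This is a sketch under simplifying assumptions: origins are compared as scheme plus host-and-port strings without normalizing default ports, and only https is treated as a secure scheme. The URLs in the example are placeholders.

```python
from typing import Optional
from urllib.parse import urlsplit


def _origin(url: str) -> str:
    """Return scheme://host[:port] for a URL (no default-port normalization)."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc.lower()}"


def referrer_for(source_url: str, target_url: str) -> Optional[str]:
    """Approximate the Referer value under strict-origin-when-cross-origin."""
    src, dst = urlsplit(source_url), urlsplit(target_url)

    # "strict": send nothing when going from HTTPS to a non-HTTPS target.
    if src.scheme == "https" and dst.scheme != "https":
        return None

    # Same origin: the full URL is shared, minus the fragment.
    if _origin(source_url) == _origin(target_url):
        return src._replace(fragment="").geturl()

    # Cross origin: only the origin is shared; path and query are dropped.
    return _origin(source_url) + "/"


if __name__ == "__main__":
    page = "https://shop.example/cart?item=42#summary"
    print(referrer_for(page, "https://shop.example/checkout"))
    print(referrer_for(page, "https://ads.example/pixel"))
    print(referrer_for(page, "http://legacy.example/"))
```

The cross-origin branch is where the privacy benefit comes from: a third party learns which site the request came from, but not which page.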

6. Consent Frameworks and the Compliance Gap

Standardized consent propagation is a strategic requirement for the advertising ecosystem, yet a significant "Compliance Gap" persists between the detection of framework code and technical compliance.

The prevalence of IAB frameworks is distributed as follows:

  • TCFv2 (Transparency and Consent Framework): 3.8%
  • USP (US Privacy String): 3.3%
  • GPP (Global Privacy Platform): 0.9%

The discrepancy between TCFv2 detection (3.8%) and full technical compliance (1.7%) points to "compliance theater," where implementations are often incomplete or improperly configured. The prevalence of USP strings such as 1--- (1.07%), which carry no usable privacy signal, is further evidence. The legacy "Do Not Track" (DNT) signal persists on 43% of the top 10k sites and drops to 27% in the "long tail" (500k tier), showing how legacy code lingers even without legal enforcement behind it. In contrast, Global Privacy Control (GPC) is emerging as a legally backed successor that carries weight under certain regulatory regimes (e.g., CCPA/CPRA).
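To show why a string like 1--- carries no usable signal, here is a hedged Python sketch that parses the four-character US Privacy string (version, notice given, opt-out of sale, LSPA) and checks for a GPC request header. The function names and the carries_signal key are hypothetical, and the check reflects the reading used in this article rather than any legal interpretation.

```python
import re
from typing import Optional

# Version digit followed by three flags, each Y, N, or "-" (not applicable).
USP_PATTERN = re.compile(r"(\d)([YN-])([YN-])([YN-])", re.IGNORECASE)


def parse_usp(value: str) -> Optional[dict]:
    """Parse a US Privacy string such as '1YNN'; return None if malformed."""
    match = USP_PATTERN.fullmatch(value.strip())
    if match is None:
        return None
    version = int(match.group(1))
    notice, opt_out, lspa = (flag.upper() for flag in match.groups()[1:])
    return {
        "version": version,
        "notice_given": notice,
        "opted_out_of_sale": opt_out,
        "lspa_covered": lspa,
        # All three flags "not applicable" means nothing actionable is stated.
        "carries_signal": (notice, opt_out, lspa) != ("-", "-", "-"),
    }


def gpc_enabled(headers: dict) -> bool:
    """Return True if the request carries a Global Privacy Control signal."""
    normalized = {name.lower(): value for name, value in headers.items()}
    return normalized.get("sec-gpc", "").strip() == "1"


if __name__ == "__main__":
    print(parse_usp("1---"))
    print(parse_usp("1YYN"))
    print(gpc_enabled({"Sec-GPC": "1"}))
```

A string can be syntactically valid and still convey nothing, which is one way framework code can be detected on a page without the page being meaningfully compliant.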

7. Security Mechanisms as Privacy Buffers

The intersection of security and privacy is a critical technical boundary; headers designed to mitigate attacks also serve as privacy buffers by preventing "side-channel leaks" and unauthorized content inclusion.

  • Content Security Policy (CSP): Adoption has seen an 18% relative increase (reaching ~22%). Directives like frame-ancestors (anti-clickjacking) and upgrade-insecure-requests (enforced encryption) are the primary drivers of this growth.
  • Subresource Integrity (SRI): While present on 23.6% of mobile pages, the median coverage per page remains anemic at 2.82% of scripts, leaving the vast majority of resources unprotected from CDN-level tampering.
  • Cross-Origin Policies: Headers such as COOP, COEP, and CORP prevent side-channel leaks (e.g., Spectre-style attacks) by enforcing process isolation. Adoption for these policies is growing, with all currently near or above 2%.

These mechanisms provide technical guardrails that prevent third-party scripts from overreaching their intended scope, though their efficacy remains dependent on precise developer implementation.
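For Subresource Integrity specifically, both the integrity value and the per-page coverage figure are easy to compute. The Python sketch below shows one way to do so; the helper names and the sample script list are hypothetical, and coverage here is simply the share of script entries that declare an integrity attribute, which may differ from the exact definition used in the Almanac.

```python
import base64
import hashlib


def sri_hash(content: bytes, algorithm: str = "sha384") -> str:
    """Return an integrity attribute value such as 'sha384-<base64 digest>'."""
    digest = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def sri_coverage(scripts: list[dict]) -> float:
    """Percentage of scripts on a page that declare an integrity attribute."""
    if not scripts:
        return 0.0
    protected = sum(1 for script in scripts if script.get("integrity"))
    return 100 * protected / len(scripts)


if __name__ == "__main__":
    body = b"console.log('hello');"
    # Hypothetical page inventory: one protected script out of four.
    page_scripts = [
        {"src": "https://cdn.example/lib.js", "integrity": sri_hash(body)},
        {"src": "https://cdn.example/app.js"},
        {"src": "https://tags.example/tag.js"},
        {"src": "/static/main.js"},
    ]
    print(page_scripts[0]["integrity"])
    print(sri_coverage(page_scripts))
```

Because the browser refuses to execute a script whose digest does not match, every script without an integrity value is a resource whose content the page has no way to verify.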

8. Methodology and Analytical Caveats

Interpreting the findings of the 2025 Web Almanac requires an understanding of how the underlying dataset was collected. The analysis is based on a July 2025 crawl of approximately 16 million sites drawn from the Chrome User Experience Report (CrUX) dataset.

Key Analytical Limitations:

  • Home Page Bias: The crawl analyzes home pages, which may not reflect the more aggressive tracking behavior often found on secondary or authenticated pages.
  • Empty Cache/Logged-Out State: Pages are tested without pre-existing cookies or active sessions, potentially missing trackers that activate only for returning users.
  • The Lower Bound Reality: These statistics represent a "lower bound" of the tracking landscape. Modern techniques such as CNAME cloaking (disguising third parties as first-party subdomains) and Server-Side Tracking (SST) (routing data server-to-server) are not directly observable in client-side data.
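CNAME cloaking is invisible to a purely client-side crawl because detecting it requires DNS data. As an illustration, the following Python sketch flags a request as a cloaking candidate when a first-party hostname resolves through a CNAME to a different site. Everything here is an assumption for illustration: the CNAME target is passed in rather than resolved, the hostnames are placeholders, and "site" is approximated by the last two labels instead of the Public Suffix List.

```python
from typing import Optional


def naive_site(hostname: str) -> str:
    """Approximate the registrable domain as the last two labels.

    Assumption for brevity: production code would use the Public Suffix List.
    """
    labels = hostname.rstrip(".").lower().split(".")
    return ".".join(labels[-2:])


def looks_cloaked(page_host: str, request_host: str,
                  cname_target: Optional[str]) -> bool:
    """Flag a first-party-looking request whose CNAME points to another site."""
    appears_first_party = naive_site(request_host) == naive_site(page_host)
    resolves_elsewhere = (
        cname_target is not None
        and naive_site(cname_target) != naive_site(page_host)
    )
    return appears_first_party and resolves_elsewhere


if __name__ == "__main__":
    # Hypothetical DNS answer: the subdomain is an alias for another company.
    print(looks_cloaked("www.shop.example", "metrics.shop.example",
                        "collector.tracker.test."))
    # No CNAME record: nothing to flag.
    print(looks_cloaked("www.shop.example", "static.shop.example", None))
```

A positive result is only a candidate, since CDNs and hosting providers are also commonly reached through CNAME records; distinguishing them would require a list of known tracker domains.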

Ultimately, this analysis provides a measurement-based view of a web in transition: one moving toward a more partitioned, browser-mediated model of user recognition, even as persistent tracking remains deeply embedded in the web's architecture. Staying informed is your best defense.