How Many Sites Are Still Running Third Party Cookies?
The cookies are dying! Run for the hills! Save yourselves!
Third party cookies (3PCs) are one of the most covered topics in digital marketing right now, yet somehow still remain one of the most confusing.
Let’s start with some quick facts:
- Chrome is the only major browser that doesn’t heavily restrict 3P cookies already.
- Google has previously delayed 3P cookie sunset multiple times, but seems poised to do it this time starting Q4 2024.
- “Cookieless” doesn’t mean we’re getting rid of all cookies, it means we’re getting rid of 3P cookies.
- 3PCs are predominantly, but not exclusively, used for advertising.
I wanted to investigate where things stand for the impending cookiepocalypse, to try and better understand what types of sites are running what sorts of 3PCs.
TL;DR: if you’ve got ad tags, you’ve (in all likelihood) got 3PCs — the more ad tags you have and the deeper the networks go to find you ads the more 3PCs you’ll have.
As you may have already deduced from my sarcastic hyperbole, I don’t think this is much of a crisis. Users are very unlikely to experience broken sites. They may notice less targeted ads, but that’s about it.
However, I also think it’s a mistake to dismiss this as “no big deal”. It will greatly affect the digital advertising markets, a big part of what pays for our entire internet ecosystem. Apple ITP has already significantly changed digital marketing, and Chrome is used by roughly 3x as many users.
eMarketer estimates that over $600B was spent on digital ads worldwide last year, and projects that this number will continue to increase. How big is that number? It’s in the same ballpark as: the GDP of Sweden ($621B), US defense spending ($773B+), or the entire consumer electronics market ($685B). Whether you think third-party cookies were a terrible mistake or not, they’ve been a big part of the digital ad industry for decades.
When is this going to start?
Chrome started blocking 3P cookies for 1% of users in January 2024, so in a small way it’s already started. Google says on the Privacy Sandbox site that Chrome will start to phase out support in Q3 2024. Here’s the calendar they provide:
This timeline doesn’t mean that on August 15th we go from 1% of Chrome users to 100%; it says there will be a gradual phase out. However, the timeline graphic also implies the phase out will be completed by October 15, 2024… just in time for the holiday 2024 ad season to really get underway. That would definitely make me nervous if I were a big ad buyer or network.
Google has not yet provided specific details on when the phase out will be completed, or what they mean by “gradual phase out”. It may well be that they simply do not know and plan to adjust it depending on how things go.
I have serious doubts as to whether the phase out will happen as quickly as the graphic implies. If the point of a gradual phase out is to give people more time to get accustomed to the new systems and fix things when they break, a three-month window is not a lot of time to do that in. I’m sure we all remember the Universal Analytics (UA) sunset last year, which started on July 1, 2023 and ran until at least October, and that wasn’t even announced as a gradual phase out.
Getting the industry to believe these changes are happening and that we should start paying attention is presumably the reason for the 1% test now, but so far it seems to be driving more talk than active change. Similar to the UA sunset, I’m not expecting Google to delay the launch as much as I’m expecting them to take their time rolling it out to everyone.
What could delay this?
- Significant functionality breakages
- Anti-competitive regulatory action
- Lack of privacy sandbox adoption
The first item, functionality breakages, is pretty straightforward. As stated, 3P cookies are mostly used for advertising. As a user, if you turn off 1P cookies in your browser you’ll have a hard time using the web. Logins won’t work, shopping carts won’t work, etc. If you turn off 3P cookies, you probably won’t notice any difference.
After all, if turning off 3P cookies broke significant user functionality, wouldn’t sites have been hearing about it from the ~30% of users that already restrict cookies?
That said, the turndown does affect some user functionality. There are currently 162 open bugs in the Chrome 3PC issue tracker. In the context of billions of internet users and sites that’s not a very large number, but if it affects your site it can still be a big deal. Many of these issues are related to cross-site or embedded iframe logins, which is about what we should expect.
Ultimately, functionality breakage does not seem to be a show-stopper here.
So what about #2, anti-competitive regulatory action? Because it’s been argued that the Privacy Sandbox gives Google an unfair competitive advantage, it’s possible that a governmental agency may come in and say Google can’t do what they are planning to do. The issue there is not that Google is turning off 3PCs, but that the APIs in the Sandbox give Google an unfair advantage vs. other ad networks.
In particular, the UK’s Competition and Markets Authority (CMA) has binding commitments from Google that have to be resolved before the turndown can happen. The timeline for this happening before July or August is tight, though certainly possible. Other governmental agencies may also say that Google has to give others better toys to play inside their sandbox, but at the moment the CMA is the only one I’m aware of.
Lack of adoption (#3) of the Privacy Sandbox is essentially a corollary of #2. If more ad networks and publishers were already on board and successfully using Google’s alternatives for 3P cookies, then the monopoly concerns would be lessened. The conundrum for networks and publishers is that 3P cookies typically provide more detailed information than the ad API features in the sandbox do… so why should they put a lot of effort into the Privacy Sandbox until they absolutely have to? Other third-party cookie alternatives like UID2 from The Trade Desk or LiveRamp ATS provide clearer and more understandable benefits to networks and publishers. These non-Google alternatives are largely based upon users identifying themselves, i.e. providing an email address, which is then hashed and sent to the ad networks as a unique identifier.
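To make the hashed-email idea concrete, here is a minimal Python sketch. It is not the UID2 or LiveRamp ATS specification; the normalization step (trim and lowercase) and the choice of SHA-256 are assumptions for illustration, and each real framework defines its own rules.

```python
import hashlib


def hashed_email_id(email: str) -> str:
    """Return a hex SHA-256 digest of a lightly normalized email address."""
    # Assumed normalization: strip surrounding whitespace and lowercase.
    # Real identity frameworks specify their own (usually stricter) rules.
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Two spellings of the same made-up address produce the same identifier.
print(hashed_email_id("  Jane.Doe@Example.com "))
print(hashed_email_id("jane.doe@example.com"))
```

The point is that the identifier only exists once a user has handed over an email address, which is why these approaches depend on logged-in or otherwise self-identified traffic.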
It’s likely Google wouldn’t want to turn off 3PCs until they have a replacement which covers a significant amount of whatever loss of ad effectiveness and revenue that the industry would see from just cutting out 3PC tech cold turkey. Google did a study in 2019 disabling 3PC for some users which showed an average revenue loss of 52% for sites which had 3PCs disabled. That was a huge number that really chilled the migration.
The results varied a lot per website; news websites were especially hard hit, with an average loss of 62%. If you run an ad-supported website and face losing half your ad revenue, even if the turndown of 3PC doesn’t break any functionality it could still kill your website. Even if you are rightly suspicious of Google grading its own homework, it’s clear that as the cornerstone of the ecosystem Google is incentivized to be cautious.
Firefox and Safari killed 3PC years ago because they don’t have that concern. It stands to reason that Google must believe that their new ad API tech in the Privacy Sandbox is at a point now where it can cover at least a substantial amount of that ad revenue loss. If the Sandbox disproportionately helps Google over other ad networks, that’s not exactly a big problem for Google.
How widely are 3PCs really used?
Now it’s time to see where we’re actually at with 3PC usage. Here’s how I tested this:
- Took a list of the top 1M websites from Tranco, and cut it down to the top 100 (focused on sites people actually visit, not tracking domains, utility domains, etc. I also excluded Russian and Chinese sites).
- Used Google’s Privacy Sandbox Analysis Tool (PSAT) in command-line mode to report on those sites. This tool spiders a site, looks for all the cookies, then classifies them using the Open Cookie Database. It’s pretty easy to run, all I did was install the PSAT CLI tool:
git clone https://github.com/GoogleChromeLabs/ps-analysis-tool.git && cd ps-analysis-tool && npm install && npm run cli:build
then run my list of websites against it like so:
npm run cli -- -nh -nt -ul 100 -c ./top100.csv -d crawler_output
Importantly the “-nh” flag was necessary to make sure that it was run with an actual browser attached. When run in headless mode it was missing most of the cookies that got set, potentially from ads that knew they were not visible to a user. Another word of warning was that my build of this did not correctly close Chrome test browsers after it fired them off, so I had to do that manually or I would have ended up with dozens of browsers running at the same time.
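The list-trimming step described above was done by hand, but it can be roughly sketched in Python. This sketch assumes the Tranco download is a headerless CSV of `rank,domain` rows; the excluded TLDs and the `EXCLUDED_DOMAINS` set are hypothetical stand-ins for the manual filtering, not the actual rules used here.

```python
import csv
import io

# Hypothetical exclusion rules; the article's filtering was a manual judgment call.
EXCLUDED_TLDS = (".ru", ".cn")
EXCLUDED_DOMAINS = {"googleapis.com", "gstatic.com"}  # example utility domains


def pick_top_sites(tranco_csv: str, limit: int) -> list[str]:
    """Return up to `limit` domains, in rank order, that pass the filters."""
    picked = []
    for row in csv.reader(io.StringIO(tranco_csv)):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        domain = row[1].strip().lower()
        if domain.endswith(EXCLUDED_TLDS) or domain in EXCLUDED_DOMAINS:
            continue
        picked.append(domain)
        if len(picked) == limit:
            break
    return picked


sample = "1,google.com\n2,gstatic.com\n3,yandex.ru\n4,wikipedia.org\n"
print(pick_top_sites(sample, 2))  # ['google.com', 'wikipedia.org']
```

Turning the resulting domains into whatever file layout the `-c` flag expects is left out, since that format should be checked against the tool’s own documentation.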
This number from the PSAT should be thought of as a very conservative count for the following reasons:
- The PSAT tool can’t click “consent” on cookie notices. Luckily (for this experiment anyway), most of these sites don’t show functional cookie consent prompts to my US-based IP. For those few that do, consenting would certainly raise the number of 3PCs. For example, sciencedirect.com goes from one 3PC to five if you click “Accept All” on their prompt.
- I only tested the homepage of sites and didn’t log in. Many sites like twitter, roblox, and facebook don’t have content on their homepages but rather just login prompts. This likely correlates with fewer cookies.
- The PSAT tool by default rejects 3PCs. It’s intended to emulate how Chrome will work when 3PCs are disabled, so it counts how many cookies were blocked. Sounds reasonable, except that some ad networks (especially Google’s own network) check to see if 3P cookies are enabled, and then will set more if so. Check out this example of how the number of 3PCs on www.nytimes.com goes way up (16 to 78) when you actually accept 3PC.
With the above caveats out of the way, we still saw plenty of third party cookies. 69/100 of these top sites had at least one 3PC. While the median was just 3, the average was 9.2 due to some sites with a very large number.
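The gap between median and average is what you would expect when a handful of sites set a very large number of cookies. A small Python sketch with made-up counts (not the data from this test) shows the effect:

```python
from statistics import mean, median


def summarize(counts: list[int]) -> dict[str, float]:
    """Summarize per-site third-party cookie counts."""
    sites_with_any = sum(1 for c in counts if c > 0)
    return {
        "share_with_3pc": sites_with_any / len(counts),
        "median": median(counts),
        "mean": mean(counts),
    }


# Illustrative counts for ten hypothetical sites; one outlier sets 80 cookies.
example = [0, 0, 0, 1, 2, 3, 3, 4, 7, 80]
print(summarize(example))
```

Here the single outlier pulls the mean up to 10 while the median stays at 2.5, which is why the median is the better description of a typical site.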
The grand champion with 119 third party cookies was dailymail.co.uk. That’s just them getting warmed up, though: it set 374 third party cookies once I turned 3PC on (and I didn’t even click “Accept All”!).
The Daily Mail is an inextinguishable toxic underground coal fire, but it does also highlight how it’s the news media sites — even the responsible ones — that have the highest number of 3PCs. News sites heavily rely on ads of course, and because only a small portion of their site’s traffic is likely to be logged in they can’t get as much out of 3PC alternatives like LiveRamp ATS that rely on logged-in or otherwise self-identified users. These news sites particularly need Privacy Sandbox to be effective, but they are still pressing on 3PCs as hard as they can.
To be clear, what the cookies are and how they are used is far more important than the raw numbers. A site could have only one 3PC, and if it were used for purchase fraud detection, losing that cookie could be a real problem. A media site like the Daily Mail might dump hundreds of pointless audience cookies on a user in an attempt to squeeze a tiny bit more revenue out of its existing traffic, but that doesn’t mean the adtech provider collecting the data actually uses it for anything. Compare that to Facebook, which set zero 3PCs in this test… a result that certainly should not be read as Facebook being a privacy-friendly enterprise.
So of all these 3PCs that we see, what do they do? Or at least, what does the cookie database that the PSAT uses think they do?
Unsurprisingly, the marketing/advertising cookies make up the majority. In fact, they appear to be most of the uncategorized and functionality cookies as well. The top 3PCs I found were:
Cookie Count | Owner | Open Cookie Database Category | Cookie name | Domain | Description |
---|---|---|---|---|---|
42 | Google | Functional | test_cookie | doubleclick.net | Checks for 3PC support |
25 | Microsoft | Marketing | MR | bing.com or microsoft.com | Refresh flag for MUID cookie |
24 | Microsoft | Marketing | MUID | bing.com or microsoft.com | Microsoft unique ID |
22 | LinkedIn | Marketing | lidc | linkedin.com | For data center selection |
22 | LinkedIn | Marketing | bcookie | linkedin.com | Unique identifier for LinkedIn |
21 | LinkedIn | Marketing | li_sugr | linkedin.com | Probabilistic LinkedIn ID |
18 | The Trade Desk | Marketing | TDID | adsrvr.org | Trade Desk unique ID |
17 | Adobe | Marketing | demdex | demdex.net | Adobe Audience Manager unique ID |
16 | Cloudflare | Functional | __cf_bm | current site (enabled for cross-site) | Bot detection |
Because Google is checking with the “test_cookie” before they attempt to set other 3PCs, we don’t see their marketing cookies in this list.
How do we know what category a cookie belongs to? As mentioned, the PSAT tool uses the Open Cookie Database — which is a list of about 1,400 different cookie names, their function, and the domains they belong to. It’s also fairly generous in how it identifies “functionality cookies” — considering the “test_cookie” from doubleclick to be related to functionality, even though its function is to check if the ad network can set 3PCs.
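As a rough illustration of that lookup, here is a Python sketch that classifies cookies by exact name against a tiny in-memory table. The entries come from the table in this article; the exact-name matching and the “Uncategorized” fallback are assumptions, and the real database is much larger and may match on more than the name.

```python
from collections import Counter

# Tiny stand-in for a cookie database, using names and categories from this article.
COOKIE_CATEGORIES = {
    "test_cookie": "Functional",
    "__cf_bm": "Functional",
    "MR": "Marketing",
    "MUID": "Marketing",
    "lidc": "Marketing",
    "bcookie": "Marketing",
    "TDID": "Marketing",
    "demdex": "Marketing",
}


def classify(cookie_name: str) -> str:
    """Look up a category by exact cookie name; unknown names stay uncategorized."""
    return COOKIE_CATEGORIES.get(cookie_name, "Uncategorized")


def category_share(cookie_names: list[str]) -> dict[str, float]:
    """Return each category's share of the observed cookies."""
    counts = Counter(classify(name) for name in cookie_names)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


print(category_share(["test_cookie", "MUID", "TDID", "some_unknown_cookie"]))
```

Any cookie that is missing from the lookup table lands in the uncategorized bucket, so the size of that bucket says as much about the database’s coverage as about the cookies themselves.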
Let’s dig a little deeper into those “functional” cookies. That our list showed 13.7% of cookies in that category might make you concerned that there are a lot of things like login cookies getting blocked, but the primary purposes of those functionality cookies were:
- checking for third party cookie support, e.g. test_cookie from doubleclick.net
- maintaining consent status, e.g. OptanonConsent from OneTrust
- load balancer stickiness, e.g. AWSALB from AWS
- anti-bot and fraud checking, e.g. __cf_bm from Cloudflare
Our investigation here shows that sites are still using a lot of 3PCs, and that as expected the vast majority of those are related to advertising.
What about Privacy Sandbox adoption?
This leads to the obvious question about how widely the Privacy Sandbox is being adopted. While I’ve referenced Privacy Sandbox quite a few times already, I haven’t really talked about what it is other than to say it’s a replacement for 3PCs.
I’m not going to go deep into the Sandbox, both because this is an article about cookies and because I’m still learning the Sandbox myself. The Sandbox contains “over 20 APIs”, though the two most relevant to us here for advertising purposes are the Topics API and the Protected Audience API.
Let’s look briefly at the Topics API, which is a way that your browser identifies what general topics you are interested in based upon the sites you visit. There are currently 469 topics in the topics taxonomy, which is intended to be a compromise between providing enough details such that it’s useful to advertisers… but also not enough detail to negatively impact privacy.
In my opinion it’s not great on either front. 469 topics just don’t give advertisers a lot of detail. For example, /Sports/Tennis is a category, but ping-pong, pickleball, and racquetball aren’t. So if I were in the market for a new pickleball paddle, the ad networks wouldn’t be able to tell via Topics (though the Protected Audience API might be able to help). Yet when you combine all of the Topics that you may have picked up around the web, you can end up with a fairly unique fingerprint. Chrome takes measures to reduce the entropy of that, though I’m not familiar enough with the methodology to say how effective they are.
The major ad networks are indeed already using the new APIs, though whether they are actually delivering ads based upon them and how well those ads are doing is harder to suss out. As an example, here are the top five Topics I was assigned based on visiting the sites in my test list.
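To get a feel for why a combination of topics can act like a fingerprint, here is a back-of-the-envelope Python sketch. It counts how many distinct unordered sets of topics exist and expresses that in bits. It assumes every combination is equally likely and ignores whatever Chrome does to reduce entropy, so treat it as a loose upper bound rather than a measurement.

```python
import math

TAXONOMY_SIZE = 469  # topic count quoted in this article


def combination_bits(observed_topics: int, taxonomy_size: int = TAXONOMY_SIZE) -> float:
    """Upper bound, in bits, on the information in an unordered set of topics."""
    # Assumes all combinations are equally likely, which real browsing is not.
    combinations = math.comb(taxonomy_size, observed_topics)
    return math.log2(combinations)


for k in (1, 3, 5):
    print(f"{k} topic(s): at most {combination_bits(k):.1f} bits")
```

A single topic says very little, but the number of possible combinations grows quickly as more topics are observed for the same user.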
Since the top 100 sites on my list are all pretty broad the list ends up as very generic. If instead I go to top sites in a Google search for “tennis” the sites in that list get classified in the following way:
Host | Topics |
---|---|
www.tennis.com | 328. Tennis; 299. Sports |
www.wikipedia.org | |
www.wtatennis.com | 328. Tennis; 299. Sports; 243. News |
www.espn.com | 299. Sports; 363. TV & Video; 243. News |
www.itftennis.com | 299. Sports |
www.atptour.com | 328. Tennis; 299. Sports; 243. News |
www.tennischannel.com | 328. Tennis; 299. Sports |
So if Chrome users have these ad APIs on, then the networks are getting this data… which the major ones do seem to already be taking advantage of. Ultimately, though, if these new APIs were driving strong performance then I would expect to be seeing fewer 3PCs.
What does this mean for digital marketers and analysts?
In the immortal words of the Hitchhiker’s Guide: “Don’t Panic!” Most of the heavy lifting here is the responsibility of the ad networks. I recommend using the PSAT tool to evaluate which 3PCs you currently have on your site. First, pay special attention to any that aren’t ad-related, as you may need to talk to vendors of things like single sign-on or bot detection.
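Once you have a list of the 3PCs on your site, pulling out the non-advertising ones is a simple filter. The Python sketch below uses a hypothetical `CookieRecord` shape, not the PSAT report schema, with example cookies taken from earlier in this article.

```python
from dataclasses import dataclass


@dataclass
class CookieRecord:
    # Hypothetical record shape for illustration; not the PSAT output format.
    name: str
    domain: str
    category: str


def needs_vendor_review(cookies: list[CookieRecord]) -> list[CookieRecord]:
    """Return third-party cookies whose category is anything but Marketing."""
    return [c for c in cookies if c.category != "Marketing"]


found = [
    CookieRecord("MUID", "bing.com", "Marketing"),
    CookieRecord("__cf_bm", "example-vendor.test", "Functional"),
    CookieRecord("AWSALB", "example-app.test", "Functional"),
]
for cookie in needs_vendor_review(found):
    print(cookie.name, cookie.domain, cookie.category)
```

Anything this surfaces is worth a conversation with the vendor that sets it, since those are the cookies most likely to back real user functionality.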
If you do run ads, it’s likely that you’ve already received notifications from any ad networks you use about their “cookieless” solutions. Now is definitely the time to investigate what they may be offering — since as mentioned earlier Privacy Sandbox is not the only option and is Chrome-specific.
While the Privacy Sandbox can be pretty daunting with all its different APIs, the main 3 to pay attention to are: Topics, Protected Audience (custom audiences & remarketing), and Attribution Reporting. We’ve mostly been talking about targeting, but of course 3PCs affect attribution as well and Sandbox has something for that too.
Remember that these are not designed to be a full replacement for everything that you can do with 3PC, but a modernization with some privacy guardrails.
As we get closer to the phase out date, I will try to update this article with newer numbers of 3PCs as well as any other changes in regulations or sandbox adoption.