Google Analytics Referral Spam Filter Wizard

If you have a Google Analytics account, you’ve probably been frustrated recently by the amount of spam showing up with bogus referrers, events, and other junk data. Referral spam in web analytics reports has been around for a long time, but it only started showing up in Google Analytics around the end of 2014. Google is working on the issue and seems to be blocking some of it, but new spam continues to crop up. This method will stop current spammers and can be re-run in the future as new spammers arise.

Start Referral Spam Filter Wizard

This tool will automatically insert filters to block new referral spam.
It uses two different methods in combination:

1. Blocking analytics calls from sites other than your own or other hosts you select. The tool will walk you through the sites calling your GA.
This eliminates most of the so-called “ghost” referrals where your site was never involved in the measurement calls.

2. Explicitly blocking other spammers that get around the hostname-limitation method, either by actually hitting your site or by forging the hostname in a measurement call.
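To make the combination concrete, here is a minimal sketch of how the two checks interact. It is only an illustration of the logic, not the tool's actual code: the function name `keep_hit`, the allowed hostname, and the spam domains below are all made-up placeholders.

```python
import re

# Hypothetical values for illustration only; these are not the tool's real lists.
ALLOWED_HOSTNAMES = re.compile(r"^(www\.)?myhostname\.com$", re.IGNORECASE)
SPAM_REFERRERS = re.compile(
    r"(^|\.)(spam-example\.test|bogus-buttons\.test)$", re.IGNORECASE
)


def keep_hit(hostname: str, referrer_host: str) -> bool:
    """Return True if a hit would survive both filters."""
    # Method 1: keep only hits whose hostname is one of your own.
    # Ghost referrals never touch your site, so their hostname fails here.
    if not ALLOWED_HOSTNAMES.search(hostname):
        return False
    # Method 2: drop hits from known spam referrers, which covers crawlers
    # that really visit your site or forge a valid hostname.
    if referrer_host and SPAM_REFERRERS.search(referrer_host):
        return False
    return True
```

A ghost hit such as `keep_hit("(not set)", "spam-example.test")` fails the first check, while a crawler hit such as `keep_hit("myhostname.com", "www.spam-example.test")` passes the first check and is caught by the second.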

If you already have spam filters, these new filters should not conflict with them.

FAQ

What method do you use? What spam hosts are blocked?
I use a hostnames allowed list as well as a blacklist, which means most spammers are blocked before ever getting into your reports and the filters do not need to be updated frequently. If you run your tracker on a large number of different hosts this method will not work.

I’ve gotten the list of spam hosts from spam referrals I’ve observed as well as from the definitive guide to removing referral spam by Mike Sullivan at Analytics Edge. Also, many thanks to Mike for helping me test and improve the process! Mike’s article provides great detail about how the process works and why.

I’ve added the filters, but there’s still spam in my account.
Is it old spam?
The filters will stop new spam referrals from coming in, but they can’t filter out data already in your account. To exclude data already in your account you will need to use a segment to hide that data.

While using the tool you will be prompted with links to import a segment that will allow you to exclude that data. If you’ve already inserted the filters and want just the segment, it is available to import from the GA solutions gallery; simply replace myhostname\.com with your domain.
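The backslash in myhostname\.com matters because the value is a regular expression, where an unescaped dot matches any character. As a rough sketch (the helper name `hostname_pattern` and the example domains are hypothetical), you could build the replacement value like this:

```python
import re


def hostname_pattern(domains):
    """Join domains into one regex alternation with literal dots escaped."""
    return "|".join(d.strip().lower().replace(".", r"\.") for d in domains)


pattern = hostname_pattern(["example.org", "shop.example.org"])
print(pattern)  # example\.org|shop\.example\.org
```

With the dots escaped, the pattern matches `www.example.org` but not a lookalike such as `exampleXorg`.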

Is it new spam?
New spammers that get around hostname filtering do come up from time to time. If the filters were working but you now have some new spam, you may need to update the filters. Simply run the tool again and it will update your existing filters, assuming you did not change their names.

I added the filters, and now I’m missing traffic!
Did you whitelist all the domains you use?
While using the tool you will be presented with a list of domains receiving traffic. Any domain that was not included automatically (by your property configuration) or explicitly will not be included in your reports. In the case of translate.googleusercontent.com, I generally advise NOT including it in your whitelist: it could include some legitimate traffic, but it may also include a lot of spam.

Do you run your analytics on many different domains?
This method won’t work for sites that use their tracker on a large number of different domains that can’t be enumerated into a whitelist. For those people I’d recommend trying Simo Ahava’s Spam Filter Insertion Tool, which doesn’t use hostnames but has a large list of spam domains to block.

Is it safe to authenticate this tool with Google?

When running the tool you will be prompted to grant this tool access to edit your analytics account. We don’t save any of your data on our side, not even the names of the domains you added the filters to. We count how many views have run through it to keep track of total usage, but that’s all. Additionally, once the tool has been run it is done editing your account, so at that point you could remove our access.

10 Responses to “Google Analytics Referral Spam Filter Wizard”

  1. Sam Hoisington August 17, 2015 at 2:22 am #

    Okay, Jason, this is awesome. Thanks so much. A couple clicks and I’ve got greatly improved referral insight.

    • Jason August 17, 2015 at 11:35 am #

      Excellent, glad to hear it!

  2. Adrian Lee September 19, 2015 at 4:31 am #

    Great tool there Jason. I noticed that you’ve used the referral exclusion. Some are of the opinion that it’s not the best way to go about it and that campaign source would be a better fix. (Something about it being captured as direct traffic if you use referral) What’s your take on that?

    • Jason September 19, 2015 at 12:13 pm #

      Hi Adrian, I don’t use the referral exclusion list setting in GA, that is indeed a bad way to do it. The exclusion list just drops the referrer data, not the sessions themselves, so yes it would show up as direct traffic mostly in that case.

      I do have an exclude filter based on the referral field, which is different as that method excludes data from that session (ghost or otherwise) from being added to your reporting at all. Hope that clears things up, thanks for using the tool!

      • Adrian Lee September 20, 2015 at 5:12 am #

        Ah…my bad. Sorry. This is all so new to me. Trying to understand GA is bad enough. Having to deal with spam is another ball game altogether. Thanks for clearing things up. I noticed that some writers recommend the use of the campaign source field rather than the referral field. Is there any significant difference between the 2?

        • Jason September 20, 2015 at 1:27 pm #

          Referral exclusion is what you’d think you’d want, I made the same mistake at first.

          Using “campaign source” should also work and is in some ways better. The difference is how they get processed by GA; the document referrer will get turned into the campaign source in most cases (see this flowchart: https://support.google.com/analytics/answer/6205762#flowchart), but if the spammer is using the campaign source field directly then it would get around that filter.

          However since it’s the ghost spammers – not crawlers – that use that kind of direct spamming (using the measurement protocol) it is blocked by the hostname whitelist since they aren’t using your hostname. It could be some crawler spam is now getting around that, if so I need to switch over to campaign source myself!

  3. ARcaptures November 29, 2015 at 9:38 pm #

    You absolutely rock! So excited I found this. I’ll let you know how it ends up working for me.

    • Jason November 30, 2015 at 9:37 am #

      Thanks, hope it works well for you!

  4. whda November 26, 2017 at 10:49 pm #

    I have installed this. Now where to access quantable’s settings on google analytics?

    • Jason November 27, 2017 at 8:31 am #

      All this does is install filters, which are all under “Filters” in admin > view in GA.
