Every time I get the feedback box from Google Analytics, I write in all caps to beg them to block spam on their side. Analytics spam is annoying because it inflates your traffic numbers. It would be nice if you could simply block the IP addresses of the machines that trigger it, but it is not that simple.
Google Analytics requires JavaScript to run, and most bots don't run JavaScript at all. So how are they making spam requests that end up in your Google Analytics data? Well, all they need is your tracking ID, which is publicly visible in your page source.
With this UA-xxxxx-x number, they don't even need to run the JavaScript to spam you. On this website, the worst offense is that they set their own referral URL and hope that I will visit those pages. Most of the time, it redirects to AliExpress. This could be a marketing campaign by AliExpress, or a competitor trying to kill their business. Either way, it is aimed at the people who check Google Analytics: marketers, data analysts and so on.
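To make it concrete why the ID alone is enough, here is a minimal sketch of what a pageview hit boils down to. It assumes the parameter names of the Universal Analytics Measurement Protocol (`v`, `tid`, `cid`, `t`, `dp`, `dr`). The function name `buildPageviewHit`, the client ID and the referrer URL are made up for illustration, and nothing is actually sent.

```javascript
// Sketch: a pageview hit is just a handful of query parameters.
// "UA-XXXXX-X" and the referrer below are placeholders, not real values.
function buildPageviewHit(trackingId, referrer) {
  var params = {
    v: "1",           // protocol version
    tid: trackingId,  // the ID copied from the page source
    cid: "555",       // arbitrary client ID chosen by the sender
    t: "pageview",    // hit type
    dp: "/",          // page path that was supposedly viewed
    dr: referrer      // referrer, freely chosen by the sender
  };
  // Encode every key/value pair and join them into one query string.
  return Object.keys(params).map(function (key) {
    return encodeURIComponent(key) + "=" + encodeURIComponent(params[key]);
  }).join("&");
}

var payload = buildPageviewHit("UA-XXXXX-X", "http://spam.example/");
console.log(payload);
```

Nothing in that payload proves the sender ever loaded the page, which is why the referrer field is so easy to abuse.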
There are two ways I have tried to fix the problem over the years. The first was time-consuming and required me to constantly check for new referrals to block. The second seems more promising: I haven't gotten spam since I made this change.
Admin filters
By creating filters, we can remove all the unwanted data from our reports. They do take time to update if you are setting them for the first time but eventually you have clean data.
Be sure to create a new profile for this. I made sure to use a separate one alongside All Website Data, because if you filter out the wrong data, there is no way of getting it back.
This method works. However, as you can see in the image above, I have 3 filters just for bot spam. There is a limit on the number of characters you can put in the regex box (around 250). Because of this, I have to create a new filter every time there is a new spam domain to add to the list and no space left.
You have to constantly monitor your referrals and add new entries. It works, but it can become tedious. One advantage we have is that bots are usually not very smart. They look for specific things on the page, and if they can't find them, they move on. Hence the next method.
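If you keep the spam domains in a list, splitting them across filters can be automated. The following is only a sketch: `buildFilterPatterns`, `escapeRegex` and the 250-character limit passed in are my own assumptions, not something Google Analytics provides.

```javascript
// Escape characters that have a special meaning in a regular expression,
// so that "spam.example" matches a literal dot.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Pack domains into "a|b|c" patterns, starting a new pattern whenever
// adding the next domain would exceed maxLength characters.
// A single domain longer than maxLength still gets its own pattern.
function buildFilterPatterns(domains, maxLength) {
  var patterns = [];
  var current = "";
  domains.forEach(function (domain) {
    var escaped = escapeRegex(domain);
    var candidate = current ? current + "|" + escaped : escaped;
    if (candidate.length > maxLength && current) {
      patterns.push(current); // current filter is full, start a new one
      current = escaped;
    } else {
      current = candidate;
    }
  });
  if (current) {
    patterns.push(current);
  }
  return patterns;
}

// Example with made-up domains and an assumed limit of 250 characters.
var spamDomains = ["spam-one.example", "spam-two.example"];
console.log(buildFilterPatterns(spamDomains, 250));
```

Each string in the result goes into the regex box of its own filter. You still have to spot the new domains yourself, so this only takes away the bookkeeping.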
Hiding your UA-Number
Simply using JavaScript to generate the email addresses on a page prevents the majority of bots from discovering them. We can apply the same trick to our UA-XXXX-XX number to confuse bots.
The bots that spread the spam don't necessarily run JavaScript. Instead, they come to a page, fetch the UA number, then make a spam request on their own.
I assume that bots run a regex on the page to match anything that looks like UA-\d+-\d+. If we scramble this data, they will fail. So I went from this:
var _gaq = _gaq || [];
_gaq.push(['_setAccount', "UA-XXXXXXXX-X"]);
To this:
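// Split the ID into pieces so the full string never appears in the source.
var _gaq = _gaq || [],
    ua2 = "XXXXXXXX",
    ua3 = "1",
    ua1 = "UA";
// Reassemble it at runtime.
_gaq.push(['_setAccount', ua1 + "-" + ua2 + "-" + ua3]);
_gaq.push(['_trackPageview']);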
This way, a bot accessing the page will have to be a little more creative to get my UA number. Here I assume that they don't save the UA number; if they do, they don't need to come back to our website at all. Still, this is good enough protection to keep new ones from spamming you.
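A quick way to check the idea is to run the regex I assume the bots use against both versions of the snippet. The ID below is a made-up placeholder, and the two strings stand in for the raw page source a bot would download.

```javascript
// Assumed bot behaviour: scan the raw page source for a tracking ID.
var UA_PATTERN = /UA-\d+-\d+/;

// The two versions of the snippet as plain text (placeholder ID).
var plainSource = "_gaq.push(['_setAccount', 'UA-12345678-1']);";
var splitSource = "ua2 = '12345678', ua3 = '1', ua1 = 'UA';";

console.log(UA_PATTERN.test(plainSource)); // true: the full ID is in the source
console.log(UA_PATTERN.test(splitSource)); // false: the pieces never line up
```

A bot that executes JavaScript, or one that uses a smarter pattern, would still get through, so treat this as a speed bump and not a lock.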
Feel free to use your own method to scramble this number, as long as it keeps bots from getting it while still working properly for your own needs.
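As one example of a different scramble, here is a sketch that keeps the ID reversed in the source and flips it back at runtime. The `unscramble` helper and the placeholder ID are hypothetical; any reversible transformation would do the same job.

```javascript
// Hypothetical alternative: store the ID backwards in the page source.
// "1-87654321-AU" is the placeholder "UA-12345678-1" reversed.
var reversedId = "1-87654321-AU";

// Reverse a string character by character.
function unscramble(text) {
  return text.split("").reverse().join("");
}

var _gaq = _gaq || [];
_gaq.push(['_setAccount', unscramble(reversedId)]);
_gaq.push(['_trackPageview']);
```

The reversed string does not match UA-\d+-\d+, but the tracking code still receives the real ID.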