
How To Filter Out Language Spam in Google Analytics

Google Analytics spam keeps spreading to new parts of your reports. You have probably seen suspicious referral traffic, or entries in the Language report that contain messages aimed at people who work in Analytics. These entries are language spam: click-bait meant to lure you to potentially harmful websites that host malware or scams. They do not represent actual traffic to your site.

To avoid analyzing skewed data, use filters to prevent the spam from coming through.

To start, go to the Admin tab, open View Settings for the view you want to clean up, and make sure the Bot Filtering checkbox is selected.


Google Analytics will then ‘Exclude all hits from known bots and spiders’. This is just the first step in blocking spam traffic because it only applies to known bots, not the new and emerging spam that occurs daily.

Referral Language Filter


A solution to spam in the language dimension is to create an exclude filter on the Language Settings field that catches lengthy values. This works because genuine language values are short codes such as en-us, typically no more than 10 characters long, and they do not contain symbols such as periods, exclamation marks, or slashes.

IMPORTANT: Create a new view for these filters, or apply them to a view that is already filtered. This way you preserve at least one unfiltered master view. Having an unfiltered view is important because filtered data cannot be restored once the filter has been applied.

To create a new view in Google Analytics, click on Admin and make sure you have selected the right account and property from the drop-down menus. Then open the View drop-down menu, select ‘Create new view’, and give the view a name.


To apply a filter for language length, go to the Admin tab and select the view that you want to filter. In that view’s column, select ‘Filters’ and then ‘+Add Filter’.


Choose the ‘Custom’ filter type and then ‘Exclude’. From the Filter Field drop-down, select Language Settings. In the Filter Pattern field, enter this expression: \s[^\s]*\s|.{15,}|\.|, |\!|\/

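The Filter Pattern is a regular expression made up of alternatives separated by the | character, and a language value is excluded if any one of them matches. It excludes values that are 15 or more characters long, values that contain two or more whitespace characters, and values that contain a period, a comma followed by a space, an exclamation mark, or a slash. Together these catch entries such as the Secret.Google.com spam.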
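Before saving the filter, it can help to try the expression against a few language values outside of Google Analytics. The following is a minimal sketch in Python, not part of Google Analytics itself. It assumes that Python's re module treats these particular constructs the same way the Filter Pattern field does, and that the pattern may match anywhere in the value (hence search rather than fullmatch). The function name is_language_spam and the sample values are made up for illustration; only the pattern is taken from the step above.

```python
import re

# Pattern copied verbatim from the Filter Pattern step above.
LANGUAGE_SPAM_PATTERN = re.compile(r"\s[^\s]*\s|.{15,}|\.|, |\!|\/")


def is_language_spam(language: str) -> bool:
    """Return True if an Exclude filter with this pattern would match the value.

    Assumption: the pattern is applied as a partial match, so search() is
    used instead of fullmatch().
    """
    return LANGUAGE_SPAM_PATTERN.search(language) is not None


# Illustrative sample values (hypothetical, not real report data).
samples = [
    "en-us",       # ordinary language code
    "pt-br",       # ordinary language code
    "zh-hans-cn",  # longer, but still under 15 characters and symbol-free
    "Secret.Google.com You are invited! Enter only with this ticket URL.",
    "example-spam-site.com is better than google!",
]

for value in samples:
    verdict = "exclude" if is_language_spam(value) else "keep"
    print(f"{verdict:8} {value}")
```

With these samples, the three ordinary language codes are kept and the two long spam strings are excluded. If a legitimate language value in your own reports turns out to be excluded, adjust the pattern before saving the filter.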

Next, click ‘Verify this filter’ to make sure the filter will behave as intended. A filter can take time to take effect on incoming data, but the verification preview shows right away how it would have changed your recent data. Verifying is also a good idea because filtered data cannot be unfiltered.

If the Verify link comes back with a message that says “This filter would not have changed your data”, the sample of data may be too small to contain any spam. Try ‘Verifying with a larger set of data’.

Then save and say goodbye to those lengthy spam languages!
