In the last few weeks I have performed several in-depth GA audits and was blown away by how often PII was stored in Google Analytics. This post covers where PII tends to hide in your account and what you can do about it.
Sending PII to Google Analytics is one of the worst things you can do.
According to the Google Analytics Terms of Service:
“You will not and will not assist or permit any third party to, pass information to Google that Google could use or recognize as personally identifiable information.”
And to further elaborate on that:
“The Analytics terms of service, which all Analytics customers must adhere to, prohibits sending personally identifiable information (PII) to Analytics (such as names, social security numbers, email addresses, or any similar data), or data that permanently identifies a particular device (such as a mobile phone’s unique device identifier if such an identifier cannot be reset). Your Analytics account could be terminated and your data destroyed if you use any of this information.”
Ok, this is something to take very seriously.
By reading this post you will learn about best practices in dealing with Personally Identifiable Information in Google Analytics.
Table of Contents
- PII Check in Google Analytics
- Simple Tip to Avoid Sending PII
- How to Avoid Storing PII
- How to Deal with PII in Your Account
Looking for PII during the setup and testing phase of your Google Analytics implementation is recommended in order to avoid running into any PII collection issues later on.
In the testing phase you can still delete a reporting view without real negative consequences.
PII Check in Google Analytics
Simply checking your content reports to see whether a query parameter contains an email address is far from enough.
Unfortunately, PII can show up in many more places in Google Analytics. And most often, it ends up there unintentionally.
Here is a list of places you should check at a minimum:
1. Query String Parameters
Navigate to Behavior >> Site Content >> All Pages.
Do a search on “\?” to find out more about the active query parameters in your account.
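If you export the All Pages report, a small script can give you the same overview in bulk. The following is a minimal sketch, assuming you have the page paths available as a plain list of strings; the function name `count_query_parameters` and the sample paths are made up for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qsl


def count_query_parameters(page_paths):
    """Count how often each query parameter name appears in a list of page paths."""
    counts = Counter()
    for path in page_paths:
        query = urlsplit(path).query
        # keep_blank_values=True so that parameters such as "?email=" are still counted
        for name, _value in parse_qsl(query, keep_blank_values=True):
            counts[name.lower()] += 1
    return counts


# Hypothetical page paths, as they might appear in an All Pages export
pages = [
    "/thank-you?email=jane%40example.com&source=newsletter",
    "/search?q=shoes",
    "/thank-you?email=bob%40example.com",
    "/pricing",
]

parameter_counts = count_query_parameters(pages)
for name, hits in parameter_counts.most_common():
    print(f"{name}: {hits}")
```

A parameter name like `email` in this list is an immediate red flag. For the less obvious names you still need to look at the actual values.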
2. Data Import
The data import functionality can be extremely powerful, but make sure to not send any PII to Google while importing data.
Use this Google Analytics PII viewer if you want to map data (e.g. the User ID) stored in Google Analytics to PII such as name and email address stored locally.
3. Event Dimensions
Another important feature in Google Analytics is event tracking.
Open Behavior >> Events >> Top Events and within a minute you will see whether there is any PII stored in “Event Category”.
Furthermore, you can easily switch the primary dimension to “Event Action” or “Event Label” to check whether any PII is stored there.
4. Custom Dimensions
Custom dimensions are powerful, but can be risky as well. Google allows you to pass additional information to GA in custom dimensions with user, session, hit or product scope.
You can quickly retrieve all (active) custom dimensions in the admin interface (Admin >> Property >> Custom Definitions >> Custom Dimensions):
Let’s assume you want to check the “Sales Region” values.
I have quickly set up a custom report on one primary dimension (Default Channel Grouping) and a filter on “Affiliates”.
Since we already know the name of the “custom dimension”, it is easy to filter on this data:
A good understanding of the Google Analytics API (to automatically export this data) and regular expressions (to set up filters or segments) can help you perform a deep and quick analysis. However, a partly manual scan usually cannot be avoided.
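To give an idea of what such a regular expression scan could look like, here is a minimal sketch. It assumes you have already exported the data as (dimension, value) pairs. The patterns, the function name `scan_rows` and the sample rows are illustrative assumptions, not a complete PII detector.

```python
import re
from urllib.parse import unquote

# Deliberately simple patterns (assumptions): expect false positives and misses
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\+?\d[\d\s().-]{8,}\d"),
}


def scan_rows(rows):
    """Return (dimension, value, pii_type) for each value that matches a pattern."""
    findings = []
    for dimension, value in rows:
        # Decode first so that URL-encoded values (jane%40example.com) are caught too
        decoded = unquote(value)
        for pii_type, pattern in PII_PATTERNS.items():
            if pattern.search(decoded):
                findings.append((dimension, value, pii_type))
    return findings


# Hypothetical export rows
rows = [
    ("Event Label", "jane.doe@example.com"),
    ("Sales Region", "EMEA"),
    ("Event Action", "submit"),
    ("Page", "/thank-you?email=bob%40example.com"),
    ("Event Label", "call +31 20 123 4567"),
]

findings = scan_rows(rows)
for dimension, value, pii_type in findings:
    print(f"{pii_type} in {dimension}: {value}")
```

Names and street addresses cannot be caught reliably with patterns like these, which is one reason the manual scan remains necessary.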
5. Campaign Parameters
Be sure not to include PII in campaign parameters.
- Campaign dimensions: Source, Medium, Keyword, Campaign, Content
- Campaign parameters: utm_source, utm_medium, utm_term, utm_campaign, and utm_content.
It depends on how your campaign tracking is configured, but running an automated check could save a lot of time in many cases.
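One way to automate this is to check tagged URLs before a campaign goes out. The sketch below assumes you keep your campaign links in a list. The function name `utm_values_with_email` and the example URLs are hypothetical, and the check only looks for email-like values.

```python
import re
from urllib.parse import urlsplit, parse_qs

UTM_KEYS = ("utm_source", "utm_medium", "utm_term", "utm_campaign", "utm_content")

# Loose email check (assumption): something@something.something
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def utm_values_with_email(url):
    """Return {utm_parameter: value} for campaign parameters that look like an email."""
    params = parse_qs(urlsplit(url).query)  # parse_qs also decodes %40 to @
    flagged = {}
    for key in UTM_KEYS:
        for value in params.get(key, []):
            if EMAIL_RE.search(value):
                flagged[key] = value
    return flagged


# Hypothetical campaign links
clean_link = "https://www.example.com/?utm_source=newsletter&utm_medium=email&utm_campaign=spring"
leaky_link = clean_link + "&utm_content=jane%40example.com"

print(utm_values_with_email(clean_link))
print(utm_values_with_email(leaky_link))
```

Note that `utm_medium=email` is not flagged, because the check looks at the shape of the value rather than at the word "email".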
6. Site Search Dimensions
Most companies have a Site Search functionality on their website.
You don’t want to have PII captured in either the site search term or site search category dimension.
And yes, controlling the “site search term” field can be rather difficult.
Simple Tip to Avoid Sending PII
PII most often ends up in Google Analytics through (sign-up) forms.
Last month I audited an account where the email address value was passed in a query string after a newsletter sign-up.
This usually comes down to the form being submitted with a GET request instead of a POST request: with GET, the field values are appended to the URL, and that URL is what Google Analytics records as the page.
Talk to your web developer if this is the case and make sure to get it solved properly!
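To make the GET vs POST difference concrete, here is a small sketch that builds both kinds of request without sending them. The form fields and the `https://www.example.com/subscribe` address are made up for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical newsletter form
form_data = {"email": "jane@example.com", "list": "newsletter"}
action = "https://www.example.com/subscribe"

# GET: the form fields are appended to the URL as a query string
get_request = Request(f"{action}?{urlencode(form_data)}", method="GET")

# POST: the form fields travel in the request body, the URL stays clean
post_request = Request(action, data=urlencode(form_data).encode("utf-8"), method="POST")

print(get_request.full_url)
print(post_request.full_url)
```

With GET, the email address is part of the URL the visitor lands on, so any pageview tracked on that page carries it along. With POST, the URL contains no form values.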
How to Avoid Storing PII
You might think about excluding query parameters in the view settings or using filters to get rid of this PII data, but unfortunately this is not enough to fix the issue: the data has already been sent to Google by the time those settings are applied.
Here is what Caleb Whitmore has to say about this topic:
You need to work with your developers to ensure that PII is not being sent to the GA servers in the first place.
Leveraging GTM can help you to stay on the safe side. Brian Clifton wrote a great article on this topic that you might want to check out!
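The general idea is to clean the URL before it is handed to Google Analytics. The sketch below shows that logic in Python for readability. It is not taken from the article mentioned above, and in GTM you would implement something similar in a Custom JavaScript variable. The blocked parameter names and the `redact_url` function are assumptions for illustration.

```python
import re
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical list of parameter names that should never be sent
BLOCKED_PARAMETERS = {"email", "name", "phone"}

# Loose email check (assumption) to catch PII in parameters that are not on the list
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def redact_url(url):
    """Replace the value of blocked or email-like query parameters with REDACTED."""
    parts = urlsplit(url)
    cleaned = []
    for name, value in parse_qsl(parts.query, keep_blank_values=True):
        if name.lower() in BLOCKED_PARAMETERS or EMAIL_RE.search(value):
            value = "REDACTED"
        cleaned.append((name, value))
    return urlunsplit(parts._replace(query=urlencode(cleaned)))


print(redact_url("/thank-you?email=jane%40example.com&plan=pro"))
print(redact_url("/search?q=bob%40example.com"))
print(redact_url("/pricing"))
```

Because the value check runs on every parameter, the same approach also covers a site search term in which a visitor typed an email address.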
How to Deal with PII in Your Account
Disclaimer: I am not a lawyer, and this part of the blog post does not constitute legal advice. I recommend seeking advice from legal counsel to confirm the appropriate policies and steps for your organization.
Take the following steps if you have found some form of PII in your account.
- Work with your developers to immediately stop collecting PII (simply filtering out PII in the Google Analytics interface is only half of the job as Google requires that you stop sending any PII to their servers).
- Back up your data or migrate your data into Google BigQuery (this service doesn’t have any PII limitations).
- Create new views (copy views that contain PII) so that you start collecting PII-free data.
- Contact Google Support and inform them that your web property has been collecting PII. Google Support is much more likely to take certain measures if they find it out themselves. Now you also have the option to move your “corrupted” property to a different account.
Well, this is it from my side. Hope you can keep your account PII free! What is your experience with PII and Google Analytics?
Gerry White says
You may find that just SENDING them to Google is bad enough, so even though it is filtered out of views it is still being sent, from a pure SEO point of view, tracking everything is often beneficial.
Google recently seems to be manually destroying PII at their end without deleting accounts. In GA accounts I have seen that were previously recording it, Google has been obfuscating it themselves, which is a change in direction.
I have always told my clients to use GTM filters so that no PII is passed at all, which is a little more challenging but more effective. Would love to know if I am wrong, as it's a much harder approach!
Paul Koks says
Thank you for your comment Gerry.
Yes, the very best thing to do is to make sure that no PII data is sent to Google Analytics in the first place. However, filtering it out of the views is definitely smart to do as well (if for whatever reason you can’t prevent it being sent).
And also true that Google is (manually) destroying PII information. But I am not 100% sure that will work in all cases since I have seen plenty of accounts that still contain this data. I hope that PII issues won’t occur anymore in the future. As long as it is not 100% done and confirmed, I still recommend companies to work on it by themselves.
Your solution – not pass any PII information via GTM filters – might work as well. But I have to admit I am not so familiar with that setup.
Juan says
Thank you for the information, Paul. On Dec 27th I found a PII issue and warned my programmer about it. He fixed it using option 2 (excluding the URL parameter in the view settings). However, I wonder whether we could still have a problem, because the parameter was being recorded between Dec 23rd and Dec 27th, when the change was made.
Paul Koks says
Hi Juan,
Thank you for your comment. Your account is always at risk if there is PII information stored in the past. Unfortunately you cannot completely remove it yourself once the damage has been done.
Please see the last section of this article and decide whether you take an additional step (informing Google) or not. This is hard for me to advise on as it touches the legal side. Of course you can choose not to back up your data (move your current property and start a new one), but the risk is always on your side if there is PII data in your account.
Best,
Paul
Haresh Pansuriya says
Great one..
Ben Young says
Hey Paul,
Great article. I’ve got 2 questions:
1. If someone types an address into your search bar and it’s recorded in your search terms, is this a breach?
2. What about a form that generates a dynamic URL parameter containing the property address that was submitted?
Thanks!
Paul Koks says
Hi Ben,
Thanks for your comment.
1) Strictly speaking, that is not allowed. Please review: https://support.google.com/analytics/answer/6366371?hl=en.
However, setting this up might be a daunting task.
2) If Google would consider it PII, you should refrain from dynamically passing it via your URL. Maybe you can store it in your form program/environment instead?
In general, be as careful as possible with PII because it can get you into real trouble!
Best,
Paul
Naveed says
Hi,
I did a search for Query String Parameters in my GA account and it returned 68 records. Just wondering what we should do in this case.
Apart from that, I am capturing Client ID and User ID as per the best practices. Does that pose any PII risk?
Paul Koks says
Hi Naveed,
Thanks for your comment.
Question 1: do the query string parameters contain any PII information?
Question 2: if you use best practices and obey the privacy rules, you should be fine.
Best,
Paul
Isaac says
Hi Paul,
Is it possible to remove the PII with the Data Deletion Protocol?
Paul Koks says
Hi Isaac,
It might be possible (as you can delete the data from the servers), but I still always recommend preventing PII from reaching the servers in the first place.
Definitely a good option to try!
Best,
Paul