
Having attended several data protection conferences this year, I have been struck by two key concerns that are repeatedly raised and that I feel are worthy of closer scrutiny. The first is that privacy teams and regulators alike are feeling under enormous pressure from the tsunami of new legislation, either recently introduced or currently under consideration. The second is AI and its continued agenda-topping presence. When it comes to AI, top of every privacy professional’s to-do list is understanding how best to apply existing data protection legislation to the plethora of new AI tools being used and introduced in organisations. With so many spinning plates, it is easy to panic and tie yourself in knots about where to start with AI and data protection. In reflecting on how best to manage these two challenges, I found myself revisiting the core data protection principles, and I offer the recommendations below as practical guidance for those struggling to swim against the tide.
When things get too complicated and the issues you are facing seem complex and intractable, the best thing to do is bring everything back to first principles.
So here are some simple questions you can bring to your developers or AI providers, along with some straightforward DOs and DON’Ts and some key facts you need to consider when applying data protection principles to AI.
Lawfulness
Like any other system or application, AI apps must comply with applicable data protection laws, such as the GDPR in Europe or POPIA in South Africa.
- DO identify your lawful basis for processing.
- DON’T assume that your legal basis remains the same as it was prior to the introduction of AI. Your organisation has likely chosen to use AI in place of other tools or processes, and that decision was likely made in the legitimate interests of the business rather than on any other lawful basis.
- DO conduct an LIA (Legitimate Interests Assessment) balancing the business benefits derived from the use of AI against the rights and freedoms of the individuals whose data is being processed.
Fairness
AI systems can inadvertently perpetuate biases present in their training data, leading to unfair treatment of particular groups. Processing should not exploit or discriminate against individuals, and so:
- The AI should be designed to avoid biases and ensure equitable treatment.
- DO ask your developers and/or the providers of the system:
- How the model is trained.
- How the system has been tested for bias, hallucinations, and other common AI flaws.
- DON’T be afraid to do some testing yourself to give you a level of confidence in the results (see the sketch below for one simple check).
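To make that last point concrete, here is a minimal sketch (in Python, using only the standard library) of one simple check a privacy team could run on a sample export of the tool’s decisions. The file name and column names are hypothetical placeholders; the idea is simply to compare favourable-outcome rates across groups and flag large disparities.

```python
# Illustrative fairness check: compare the rate of favourable AI outcomes
# across groups ("demographic parity"). The file name and column names
# below are hypothetical - substitute an export of your own tool's decisions.
import csv
from collections import defaultdict

positives = defaultdict(int)  # favourable outcomes per group
totals = defaultdict(int)     # total decisions per group

with open("ai_decisions.csv", newline="") as f:
    for row in csv.DictReader(f):
        group = row["group"]              # e.g. an age band
        totals[group] += 1
        if row["outcome"] == "approved":  # the favourable decision
            positives[group] += 1

rates = {g: positives[g] / totals[g] for g in totals}
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.1%} favourable outcomes ({totals[group]} decisions)")

# A common rule of thumb (the "four-fifths rule") flags concern when one
# group's favourable-outcome rate falls below 80% of the highest group's.
highest = max(rates.values())
for group, rate in rates.items():
    if rate < 0.8 * highest:
        print(f"Possible disparity: review outcomes for group '{group}'")
```

A check like this will not prove a system is fair, but a large gap between groups is a clear signal to go back to your supplier with questions.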
Transparency
AI models, especially those based on machine learning and deep learning, often operate as "black boxes," making it difficult to understand how decisions are made. Ensuring transparency and explainability in these systems is a challenge, but not an insurmountable one.
- Clear and accessible privacy notices are crucial.
- DON’T write the privacy notice until you feel you can explain what should be in it to your mother or your 10-year-old nephew.
- DO ask yourself whether you can describe what the AI is doing in easily understood wording.
- DO go back to your developers and/or AI provider and ask them to explain what is going on under the hood.
- DON’T accept ‘jargon’ in lieu of an explanation.
- DO feel free to keep asking for clarification until it is clear to you what’s going on. If the AI company will not make the effort to explain to you, their customer, how their AI works, how do you think they’ll treat a data subject rights request later? If they are not helpful in this regard, it may be time to consider another supplier.
Purpose Limitation
Data collected for processing by an AI should be used only for specified, explicit, and legitimate purposes and should not be repurposed in ways that could harm individuals.
- DON’T assume that the AI is doing nothing else with the data collected.
- DO check the following with your developers/AI providers. These questions can help identify any additional processing:
- Is the data being kept locally in your own systems and network?
- Is the data being sent back to them for other uses and purposes – for example, to further train the model?
- Is the data being shared with anyone else? If so, why?
- DO examine the added benefits you’ve been promised. These can highlight additional processing that is being conducted. For example:
- Were you promised reports or summaries from the tool (this would suggest at least some additional processing)?
- Were you promised more efficiency by only targeting specific people or audiences (this would suggest a level of profiling)?
- DON’T panic if you do uncover additional processing – some additional processes may be of concern, but many are not. Just be sure to include any additional (non-harmful) processing in your Privacy Notice and RoPA (Records of Processing Activities).
Data Minimisation
AI systems typically need vast amounts of data to function effectively, and this can conflict with the principle of data minimisation.
Balancing the need for data collection with the requirement to minimise data collection is a complex but achievable task.
Like any processing, AI systems should only process the minimum amount of personal data necessary to achieve their objectives (a simple sketch of what this looks like in practice follows the questions below).
- DON’T assume that every data element requested is necessary to achieve your purpose.
- DO ask yourself (and go back to your supplier if you can’t answer):
- Do I really need to use X?
- Why am I being asked to provide Y?
- What purpose does processing Z serve?
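As an illustration of minimisation in practice, here is a minimal sketch that strips out fields an AI tool does not need before a record is sent to it. The field names are hypothetical examples; the point is that anything not on your “needed” list never leaves your systems.

```python
# Illustrative data-minimisation step: send the AI tool only the fields
# it actually needs. All names below are hypothetical examples.
ALLOWED_FIELDS = {"ticket_id", "message_text"}  # what the AI needs for triage

def minimise(record: dict) -> dict:
    """Return a copy of the record containing only the allowed fields."""
    return {k: v for k, v in record.items() if k in ALLOWED_FIELDS}

record = {
    "ticket_id": "12345",
    "message_text": "My order arrived damaged.",
    "email": "customer@example.com",  # not needed for triage - never sent
    "date_of_birth": "1990-01-01",    # not needed for triage - never sent
}

print(minimise(record))
# {'ticket_id': '12345', 'message_text': 'My order arrived damaged.'}
```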
Accuracy
Maintaining the accuracy of data used by AI systems is important. Inaccurate data can lead to incorrect decisions and outcomes, which can harm individuals and undermine trust in your company.
- DON’T assume that the data generated by the AI application is accurate. AIs are not search engines.
- DO put business processes or policies in place ensuring that a person reviews the output of the AI application particularly before the results are further processed. These don’t need to be complicated procedures – just asking somebody to review the meeting minutes generated by an AI assistant to ensure they are accurate before you share them is remarkably effective.
- DO make sure you know the difference between distinct types of AI. Generative AI will do just that and generate (make up) responses if it can’t find the answer.
- DO check with your developers/AI providers to find out whether or not the AI is generative in nature.
Storage Limitation
Personal data should not be kept for longer than necessary. AI systems, like any others you implement, must have mechanisms to delete or irreversibly anonymise data once it is no longer needed for the purposes for which it was collected. AI is not so complicated that you can’t determine suitable retention plans.
- DO ask yourself the same straightforward questions you would for any data processing:
- Do you need to keep everything, forever?
- Can you limit or minimise what you (or your vendor) keep? For example, can you just keep the anonymised summary without the underlying data?
- Do multiple copies of the data exist (one with the AI vendor and one within your own network)? If so, do you need both?
- DO ensure the Data Processing Agreement (DPA) or Data Sharing Agreement (DSA) with the AI supplier includes relevant retention periods.
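By way of illustration, a retention mechanism can be as simple as a scheduled clean-up job. The sketch below deletes locally stored AI outputs older than an agreed period; the directory name and the 365-day period are hypothetical, so substitute the retention periods agreed in your DPA/DSA.

```python
# Illustrative retention sweep: delete locally stored AI outputs once the
# agreed retention period has passed. The directory and the 365-day period
# are hypothetical examples - use the periods agreed in your DPA/DSA.
import time
from pathlib import Path

RETENTION_DAYS = 365
cutoff = time.time() - RETENTION_DAYS * 24 * 60 * 60

store = Path("ai_outputs")  # hypothetical local store of AI-generated files
for path in store.rglob("*"):
    if path.is_file() and path.stat().st_mtime < cutoff:
        print(f"Deleting expired file: {path}")
        path.unlink()
```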
Integrity and Confidentiality
Like any other system utilised in your organisation, AI systems must ensure the security of personal data. This includes protecting data against unauthorised access, accidental loss, destruction, or damage.
- DO make sure you implement your own standard security measures as a minimum, such as encryption and access controls.
- DON’T assume that your AI supplier has done so. Instead:
- Ensure that relevant security measures are included in your DPA/DSA with your AI provider.
- Ask your supplier about their backup procedures and Business Continuity Planning (BCP).
- Ask your supplier to confirm how they will ensure that personal data remains confidential and will never be made public.
Accountability
AI is not so different from any other processing: organisations using it must be accountable for their data processing activities.
- DON’T panic.
- DO implement your standard technical and organisational measures.
- DO evidence your work appropriately by conducting all your usual risk assessments, checks, and balances (RoPA, TIA, LIA, etc.).
- DO use your regular checklists to determine if a DPIA is required.
Data Subject Rights
I appreciate that this is an article about first principles, but I would be remiss if I didn’t also bring one other core data protection concern to your attention that is of particular importance in relation to AI. It is enshrined in Article 22 of the GDPR and in other relevant DP laws – the right of an individual not to be subject to a decision based solely on automated processing, particularly where profiling/matching is taking place.
- DO carry out a DPIA where processing is large scale, involves profiling/matching, or where automated decisions could adversely affect individuals.
- DO ensure you make it possible for individuals to request a review by a human being (which should not just be a repeat of the exercise carried out by the AI).
- DO ensure that you have a procedure in place to allow individuals to object to the processing.
- DON’T forget to include this information in your privacy notices.
Overall, yes, you’ll have to get up to speed with new AI legislation and perhaps implement some new controls and organisational measures, like an FRIA (Fundamental Rights Impact Assessment), but that doesn’t mean you should do nothing in the interim to protect the interests of individuals whose data is processed by AI.
Simply bring everything back to first principles and keep doing what you already do!