Top tips for safely analysing sensitive personal data
Author: Diego Arenas
Reading Time: 6 minutes
We've made this resource open. You are free to copy and adapt it. Read the terms.
If you’re working with people, you’re likely to be capturing their personal data. This type of data holds insights into them and your impact on their lives. Analysing it can help you improve your services, but it also brings risks. This resource explores those risks and gives recommendations on overcoming them.
This resource is for:
- Service managers running services that capture people’s personal data
- Data leads and other people who are analysing that data to gain insights into their organisation’s work.
It covers:
- What sensitive personal data is
- Whether it’s necessary to analyse data
- Running a data protection impact assessment
- Informing people of how you’ll use their data
- Anonymising data
- Why you should go slowly
- Supporting your analysts’ wellbeing
- How one charity analysed their data.
About sensitive personal data
Sensitive personal data includes:
- people’s personal details
- records of conversations from meetings, phone calls, emails and even webchats and other online communication methods.
This data often comes in text form and contains personal information and insights into people’s needs and your impact.
Records of conversations are likely to contain sensitive, private, or upsetting material. Qualitative data like text can have a lot of personally identifiable information (PII). This should all be approached with caution, especially if the people the data is about are vulnerable individuals.
Ask: is it necessary to analyse this data?
Consider whether analysing data that could identify your users is necessary.
Do not analyse any of this data unless you need it to either:
- Improve your service
- Improve the lives of the community your organisation serves.
When processing any information, you must always have a lawful basis for using it. The Information Commissioner’s Office (ICO) provides guidance.
Run a data protection impact assessment
Running a data protection impact assessment will help you identify specific risks within your data.
List the likelihood and severity of any potential harm for each risk. Then list the actions you’ve taken (or will take) to mitigate these risks. Then zoom out and reflect on the overall impact of your actions.
Make your assessment a transparent and meaningful process. People working with sensitive data need to be aware that behind each data point, there is or was a life. This will help you consider the consequences of the analysis. How would you feel if the data you are analysing were about you? How much care would you put into the analysis and accuracy of the results?
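As an illustration only, the listing of likelihood, severity and mitigating actions described above could be kept as a small risk register. This is a minimal sketch, not an ICO template: the three-level scale, the `Risk` class, the scoring rule and the example risks are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical three-level scale; use whatever scale your organisation defines.
LEVELS = {"low": 1, "medium": 2, "high": 3}


@dataclass
class Risk:
    description: str
    likelihood: str       # "low", "medium" or "high"
    severity: str         # "low", "medium" or "high"
    mitigation: str = ""  # action taken or planned

    @property
    def score(self) -> int:
        # Likelihood x severity, used here only to order the register.
        return LEVELS[self.likelihood] * LEVELS[self.severity]


def prioritise(risks):
    """Return risks ordered from highest to lowest score."""
    return sorted(risks, key=lambda r: r.score, reverse=True)


def unmitigated(risks):
    """Return risks that have no mitigation recorded yet."""
    return [r for r in risks if not r.mitigation.strip()]


# Made-up example entries.
register = [
    Risk("Timestamps reveal who was on shift", "low", "medium"),
    Risk("Names left in chat transcripts", "high", "high",
         "Mask names before sharing"),
    Risk("Small groups identifiable from age and school", "medium", "high"),
]

for risk in prioritise(register):
    print(risk.score, risk.description, "|",
          risk.mitigation or "NO MITIGATION YET")
```

A score like this only helps you decide what to look at first. You still need to zoom out and reflect on the overall impact, as the assessment asks.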
Inform people of how you’ll use their data
Implement a process of informed consent. This helps people understand what you intend to do with their data and why. The process should let them choose to opt in or out without losing access to your services.
Explain the steps you’ll take to protect their privacy, and the end purpose of analysing the data. Ensure your privacy policy is accessible and aligns with your organisation’s values.
If you’re not sure what people would be happy with, ask them.
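If you record people’s choices alongside their data, you can respect an opt-out before any analysis starts. The sketch below assumes a hypothetical `consent` field holding the purposes each person has opted in to. The field name, the purpose label and the records are invented for illustration.

```python
def consented_only(records, purpose):
    """Keep only records whose owner opted in to the given purpose.

    Missing or empty consent is treated as 'not opted in'.
    """
    return [r for r in records if purpose in r.get("consent", ())]


# Made-up records; "service_improvement" is a hypothetical purpose label.
records = [
    {"id": "a1", "consent": {"service_improvement"}},
    {"id": "b2", "consent": set()},
    {"id": "c3"},  # consent was never recorded
]

to_analyse = consented_only(records, "service_improvement")
print([r["id"] for r in to_analyse])
```

Treating missing consent as a "no" is a deliberately cautious design choice in this sketch.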
Anonymise before analysing
Removing identifying information usually doesn’t reduce the value or scope of the analysis. Often it’s also a legal, if not ethical, imperative. Here are our top tips:
- Mask names. Replace names with aliases, or the participant’s role. For example, if you have professionals giving advice, you can replace their names with ‘Professional’ or ‘Agent’ so that it is clear the information comes from them, but they are not identifiable.
- Remove personally identifiable information. This includes addresses, email addresses, phone numbers, and any other PII that could be used to identify an individual.
- Use geographic area. Replace addresses with a geographic area, so it is possible to summarise statistics without identifying individuals. Ensure this area is large enough that someone’s identity cannot be inferred.
- Look for identifiable combinations of data. Consider whether there is other personal data that, when combined, could lead to identification of an individual. For example, their age, ethnicity, and school combined.
- Be aware of metadata. Handle the metadata that describes the data with care, for example user IDs and timestamps, because it can also point to an individual.
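Some of these tips can be partly automated. The sketch below is a minimal illustration, assuming you already hold a list of known names and their roles. The function names, the `[EMAIL]` and `[PHONE]` placeholders, the threshold `k=5` and the regular expressions are all assumptions for the example. The patterns are deliberately simple and will miss some formats, so they are not complete PII detectors.

```python
import re
from collections import Counter

# Simple illustrative patterns; they will not catch every format.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"(?:\+44\s?|\b0)\d(?:[\s-]?\d){8,9}\b")  # UK-style numbers


def mask_names(text, aliases):
    """Replace each known name with its alias, matching whole words only."""
    for name, alias in aliases.items():
        pattern = rf"\b{re.escape(name)}\b"
        text = re.sub(pattern, lambda _m, a=alias: a, text, flags=re.IGNORECASE)
    return text


def redact(text, aliases):
    """Mask known names, then replace emails and phone numbers."""
    text = mask_names(text, aliases)
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


def small_groups(rows, keys, k=5):
    """Return combinations of `keys` shared by fewer than k rows.

    Small groups suggest the combination could identify someone.
    """
    counts = Counter(tuple(row[key] for key in keys) for row in rows)
    return {combo: n for combo, n in counts.items() if n < k}


# Made-up example data.
aliases = {"Dr Patel": "Professional", "Sam": "Parent"}
message = "Dr Patel told Sam to email sam@example.org or call 07700 900123."
print(redact(message, aliases))

rows = [{"age": 14, "school": "A"}] * 5 + [{"age": 15, "school": "B"}]
print(small_groups(rows, ("age", "school")))
```

The first call replaces both names, the email address and the phone number with placeholders. The second flags the single person aged 15 at school "B" as a group that is too small to publish safely.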
Go slowly
Implementing these recommendations and setting up good processes takes time. It’s better to do it right than to rush. Before analysing data, take your time to clean and anonymise it. If you automate this (for example, through ‘find and replace’ features or using anonymisation software) you should still manually check the outputs to ensure the automation is doing what you want it to.
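One way to organise that manual check is to flag outputs that still look risky and to draw a random sample for a person to read. This is a rough sketch under assumptions: the pattern names, the sample size and the example texts are hypothetical, and the flags support a human review rather than replace it.

```python
import random
import re

# Hypothetical residual-PII patterns; extend these for your own data.
RESIDUAL_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "long_number": re.compile(r"\d(?:[\s-]?\d){9,}"),
}


def flag_residuals(texts):
    """Return (index, pattern_name) pairs for outputs that still look risky."""
    return [(i, name)
            for i, text in enumerate(texts)
            for name, pattern in RESIDUAL_PATTERNS.items()
            if pattern.search(text)]


def sample_for_review(texts, n=20, seed=0):
    """Pick a reproducible random sample of outputs for a person to read."""
    rng = random.Random(seed)
    indices = sorted(rng.sample(range(len(texts)), min(n, len(texts))))
    return [(i, texts[i]) for i in indices]


# Made-up outputs from an automated anonymisation step.
cleaned = [
    "[NAME] asked about sleep routines.",
    "Call me on 07700 900123 please.",
    "Reply to [EMAIL].",
]

print(flag_residuals(cleaned))
for index, text in sample_for_review(cleaned, n=2):
    print(index, text)
```

Here the second output is flagged because a phone number slipped through. The random sample matters as well, because it can surface problems that no pattern was written for.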
Support your analysts’ wellbeing
Sensitive data can affect not only your users, but also those performing the analysis. Notify people about the content of the documents they are about to see. Give them the option to not receive the most sensitive data. You can share a sample of the data with them so that they get an idea of what they will encounter.
Action for Children’s approach to analysing data
Action for Children analyse webchat data from their Parent Talk service. Doing this helps them understand their users’ needs.
DataKind UK helped them ensure the data was anonymised and safe for volunteer data analysts to look at. Together they created a risk register and ran a data protection impact assessment. They ensured no personal information was seen by anyone outside of their service team.
This preparation was worthwhile. When they analysed their users’ conversations for common keywords they learnt:
- which areas of their service needed more resources
- how they might make these resources more intuitive to find on their website.
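To give a feel for what a common-keyword analysis can look like, here is a minimal sketch. It is not Action for Children’s or DataKind UK’s actual method. The stop-word list, the function name and the example conversations are invented, and it assumes the text has already been anonymised.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list; real analyses usually need a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "to", "of", "my", "is", "in", "for",
              "with", "i", "he", "she", "it", "at", "on", "not", "about"}


def keyword_counts(conversations, top=10):
    """Count how many conversations mention each keyword at least once."""
    counts = Counter()
    for text in conversations:
        words = set(re.findall(r"[a-z']+", text.lower()))
        counts.update(words - STOP_WORDS)
    return counts.most_common(top)


# Made-up, already anonymised conversations.
conversations = [
    "My son is not sleeping and I worry about school",
    "Sleeping problems since school closed",
    "Advice on sleeping routines",
]

print(keyword_counts(conversations, top=2))
```

Counting each keyword once per conversation, rather than every occurrence, stops one long chat from dominating the results.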
They also examined changes in the type of advice sought during the pandemic. This included:
- A peak in conversations about behavioural management, sleep, and living arrangements early in lockdown
- A rise in conversations around education and Special Educational Needs and Disabilities (SEND), presumably because children returned to school
- A big increase in mental health and SEND issues throughout every stage of the pandemic.
Their analysis, including feedback from their webchat service, helped Action for Children make the case for increasing their capacity, and apply for funding to do so.
They used what they learnt from working with DataKind UK to build a job description and hire a data analyst. This will help them improve their impact reporting further.
Resources to help you use data responsibly
- ICO guidance on running data protection impact assessments
- Consent issues in data sharing – a free online workshop that explains the difference between types of consent
- Identifying potential data risks and harms – helps you identify risks and act to reduce them
- Ethics and safeguarding training pack from The Data for Children Collaborative
- Self-care in user research for supporting analysts
- The UK Data Service – access and training to use the UK’s largest collection of economic, social and population data for research and teaching
- DigiSafe – a step-by-step digital safeguarding guide for charities designing new services or taking existing ones online
Read how Now Foster safely analyse data gathered when helping people get ready to use their service.
Commissioned by Catalyst