The concepts of Privacy by Design and Privacy by Default, outlined in Article 25 of the GDPR, are crucial aspects of GDPR compliance for technology developers. The requirements for implementing these concepts are extensive. As Art. 25(1) states,
Taking into account the state of the art, the cost of implementation and the nature, scope, context and purposes of processing as well as the risks of varying likelihood and severity for rights and freedoms of natural persons posed by the processing, the controller shall, both at the time of the determination of the means for processing and at the time of the processing itself, implement appropriate technical and organisational measures, such as pseudonymisation, which are designed to implement data-protection principles, such as data minimisation, in an effective manner and to integrate the necessary safeguards into the processing in order to meet the requirements of this Regulation and protect the rights of data subjects.
Essentially, data controllers need to consider data protection at the core of their organisational activities. As such, those who create technologies involved in data processing must consider the implications of their software in the context of the GDPR. While Data Protection by Design and Data Protection by Default are separate concepts, they are complementary: implementing Data Protection by Design makes achieving Data Protection by Default much easier, and vice versa.
Building privacy into the heart of data processing operations and systems is part of Privacy by Design, while ensuring that the data subject’s rights are protected as a matter of standard operation is part of Privacy by Default. These concepts existed long before the GDPR, but under the GDPR they became binding requirements.
Achieving Privacy by Design and Privacy by Default is not a simple process when one’s main focus is developing and delivering products. As such, familiarity with these requirements is essential.
What are the most important considerations involved with these concepts, and how may data processors implement them?
What is Privacy by Design?
The concept of Privacy by Design was created by Ann Cavoukian in the 1990s and presented in her 2009 “Privacy by Design: The Definitive Workshop.” As Cavoukian stated, Privacy by Design encompasses more than just technology: it dictates that privacy be taken into account throughout the design process and operations of broader organisations and systems. Seven foundational principles constitute the basis of Privacy by Design:
- Measures are proactive rather than reactive. They anticipate risks and try to prevent them from occurring, rather than allowing for invasions of privacy and minimising them after the fact. These measures are woven into the culture of an organisation.
- Privacy is protected by default. Personal data is protected without requiring the data subject to act. In practice, the most privacy-intrusive features of an app, such as geolocation tracking when the user has not requested it, are turned off when the product is first installed or, better yet, every time the app is launched.
- Privacy is embedded into the design of systems and organisations. It is not an afterthought, but an essential part of a system’s functionality. Retrofitting privacy can be quite costly, so planning for it from the outset, rather than redesigning to accommodate it later, is a wise cost-management strategy.
- Privacy is not implemented to the detriment of other interests, but rather to accommodate all legitimate interests with full functionality.
- Privacy is extended throughout the lifecycle of all the data collected.
- Data processing activities are visible and transparent. The business practices and technologies involved are clear to both users and providers.
- Measures for privacy are user-centric: the interests of data subjects are at the forefront of operations.
Cavoukian stresses that ensuring privacy does not come at the cost of other critical interests, but rather ought to complement other organisational goals.
But how does a team implement these foundational principles into their technological design?
Methods of Implementing and Measuring Data Protection by Design for Technology Developers
The European Data Protection Board adopted guidelines for Data Protection by Design and by Default on 20 October 2020. These guidelines clarify how to implement the requirements of Article 25 in organisations that process personal data.
Certain concepts, such as pseudonymisation, noise addition, substitution, k-anonymity, l-diversity, t-closeness, and differential privacy, can help increase the privacy of an individual data subject or reveal key information about the privacy of a data set. Individuals working to achieve Privacy by Design should therefore treat these methods as tools in a larger toolkit, not as complete solutions in themselves.
- Pseudonymisation replaces direct identifiers, such as names, with codes or numbers, which allows data to be linked to an individual without the individual being directly identified. This data is still within the scope of the GDPR. Truly anonymous data, by contrast, is not considered personal data, and its processing does not fall under the GDPR. The difference is that anonymous data cannot be linked back to a data subject, whereas pseudonymised data can potentially be re-linked to one, even if only in a difficult or indirect way. Pseudonymised data is therefore still subject to the requirements of the GDPR.
- Noise addition is often used in conjunction with other anonymisation techniques. In this technique, confidential quantitative attributes are perturbed by adding or multiplying a random value. The addition of noise still allows an individual’s data to be singled out, even if the individual is not directly identifiable. It also allows the records of one individual to be linked, even if the records are less reliable, and this linkage can potentially tie an individual to an artificially added piece of information.
- Substitution functions as another method of pseudonymisation, in which a piece of data is replaced with a different value. Like the addition of noise, substitution ought to be used in conjunction with other data protection measures in order to ensure that data subjects’ rights are protected.
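As an illustration of how these three techniques might look in practice, here is a minimal Python sketch; the key, field names, and noise scale are all hypothetical, and a production system would manage the pseudonymisation key far more carefully:

```python
import hashlib
import hmac
import random

# Hypothetical secret, stored separately from the data. Because this key can
# re-link the pseudonym to the person, the output remains personal data
# under the GDPR.
SECRET_KEY = b"store-me-separately"

def pseudonymise(name):
    """Replace a direct identifier with a keyed hash (HMAC-SHA256)."""
    return hmac.new(SECRET_KEY, name.encode(), hashlib.sha256).hexdigest()[:12]

def add_noise(value, scale, rng):
    """Perturb a confidential quantitative attribute with additive noise."""
    return value + rng.uniform(-scale, scale)

def substitute(value, mapping):
    """Swap a value for a stand-in from a lookup table."""
    return mapping.get(value, "Other")

record = {"name": "Alice Example", "age": 34, "city": "Berlin"}
safe = {
    "id": pseudonymise(record["name"]),
    "age": add_noise(record["age"], 2.0, random.Random(42)),
    "city": substitute(record["city"], {"Berlin": "City-1"}),
}
print(safe)
```

Note that `pseudonymise()` is deterministic for a given key, which preserves linkability across records; that is precisely why the result remains within the scope of the GDPR.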
Means of measuring the privacy of data
- K-anonymity, a type of aggregation, combines records with similar attributes so that identifying information about an individual is obscured, and it provides a way to measure the degree of anonymity of a data set. Essentially, individual information is lumped in with a larger group, thereby hiding the identity of the individual. For example, an individual’s exact age could be replaced with an age range, a technique called generalisation. By replacing specificity with generality, identifying information is harder to obtain. Suppression is another method of achieving better k-anonymity, in which a certain category of data is removed from the data set entirely; it is best suited to cases where the data in that category is irrelevant to the purpose of the processing. It is important to note, however, that k-anonymity by itself does not guarantee that sensitive data will be protected.
- L-diversity is an extension of k-anonymity that measures the diversity of sensitive values in a data set. Essentially, l-diversity requires that the sensitive attribute take at least l well-represented values within each group. In doing so, l-diversity helps protect a data set against re-identification attacks, which matters in cases where sensitive attributes in a k-anonymised data set can still be linked back to an individual.
- T-closeness expands on l-diversity and is a strategy of anonymisation by generalisation. T-closeness requires that the distribution of a sensitive attribute within each equivalence class stay close to its distribution in the data set as a whole, which is beneficial in situations where a data set must be kept as close as possible to its original form. Like k-anonymity and l-diversity, t-closeness helps to ensure that an individual cannot be singled out in a database, though all three methods still allow for linkability. What l-diversity and t-closeness can do that k-anonymity cannot is guarantee that inference attacks against the data set will not succeed with 100% confidence.
- Differential privacy aims to protect the privacy rights of an individual data subject by ensuring that the information someone obtains from the output of a data analysis is essentially the same with or without that individual’s data. This allows for data processing without an individual’s information being singled out or the individual being identified. Differential privacy provides this guarantee through a specific type of randomisation: the data controller adds noise to the results, with the differential privacy parameters determining how much noise to add.
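To make these measurements concrete, here is a minimal Python sketch over a toy data set; the records, quasi-identifiers, and epsilon value are invented for illustration:

```python
import math
import random
from collections import Counter

# Toy records: two quasi-identifiers (age range, postcode prefix) followed
# by a sensitive attribute (diagnosis).
rows = [
    ("30-39", "101", "flu"),
    ("30-39", "101", "asthma"),
    ("30-39", "101", "flu"),
    ("40-49", "102", "diabetes"),
    ("40-49", "102", "flu"),
]

def k_anonymity(rows):
    """k = the size of the smallest equivalence class over the quasi-identifiers."""
    groups = Counter(row[:2] for row in rows)
    return min(groups.values())

def l_diversity(rows):
    """l = the fewest distinct sensitive values in any equivalence class."""
    classes = {}
    for *qi, sensitive in rows:
        classes.setdefault(tuple(qi), set()).add(sensitive)
    return min(len(values) for values in classes.values())

def dp_count(true_count, epsilon, rng):
    """Differentially private count via the Laplace mechanism (sensitivity 1)."""
    u = rng.random() - 0.5  # uniform on (-0.5, 0.5)
    sign = 1.0 if u >= 0 else -1.0
    noise = -(1.0 / epsilon) * sign * math.log(1.0 - 2.0 * abs(u))
    return true_count + noise

print(k_anonymity(rows))  # 2: the smallest group holds two records
print(l_diversity(rows))  # 2: each group contains two distinct diagnoses
print(dp_count(3, 1.0, random.Random(0)))  # noisy count of "flu" records
```

Here the data set is 2-anonymous and 2-diverse: an attacker who knows someone’s age range and postcode prefix can narrow them to a group of at least two records, and within any group no single diagnosis is certain.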
Privacy Design Strategies
Researchers have identified eight privacy design strategies, divided into two groups: data-oriented strategies and process-oriented strategies. Data-oriented strategies include minimise, hide, separate, and abstract; these focus on how to process data in a privacy-friendly manner. Process-oriented strategies include inform, control, enforce, and demonstrate; these focus on how an organisation can responsibly manage personal data. Article 5 of the GDPR identifies the basic principles to follow when processing personal data: lawfulness, fairness and transparency, purpose limitation, data minimisation, accuracy, storage limitation, and integrity and confidentiality. These principles help guide the strategies, which can be exemplified by the concepts and methods of pseudonymisation, noise addition, substitution, k-anonymity, l-diversity, t-closeness, and differential privacy. These methods and measures of privacy should form part of a larger effort to build data protection into the fabric of data processing operations.
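The minimise strategy in particular lends itself to a short sketch. The purpose-to-field mapping below is hypothetical; the point is that incoming data is filtered against the declared processing purpose before it is ever stored:

```python
# Hypothetical mapping from processing purpose to the fields that purpose
# actually requires; everything else is dropped before storage.
PURPOSE_FIELDS = {
    "newsletter": {"email"},
    "shipping": {"name", "address"},
}

def minimise(payload, purpose):
    """Keep only the fields needed for the stated purpose (data minimisation)."""
    allowed = PURPOSE_FIELDS[purpose]
    return {key: value for key, value in payload.items() if key in allowed}

raw = {
    "name": "Bob Example",
    "email": "bob@example.org",
    "address": "1 Main St",
    "dob": "1990-01-01",
}
print(minimise(raw, "newsletter"))  # {'email': 'bob@example.org'}
```

Because the date of birth is never needed for either purpose, it never reaches storage at all, which is the minimise and hide strategies working together.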
How can technology developers learn more about Privacy by Design and Default?
Data Protection by Design and Data Protection by Default are fundamental concepts to adhere to under the GDPR. Teams that keep these concepts in mind at every level of their organisations will keep the rights of data subjects at the forefront of their operations, and thus go further in working towards GDPR compliance. Technology developers have a special role in making sure that their products have the capacity to be used in a GDPR-compliant manner, and thus should have extensive familiarity with these concepts. Those interested in learning more about GDPR compliance, from the perspective of what a technology developer should consider, can participate in TechGDPR’s Privacy & GDPR Compliance Course for Developers. This course delves into what individuals working in technology development need to know about data protection so they can better understand their own duties and responsibilities under the requirements of the GDPR.