Guide

The anonymisation of personal data

Anonymisation and personal data

When any person or organisation domiciled in Norway processes digital data, they must take into consideration that such processing may trigger obligations and rights under the Norwegian Personal Data Act.

Anyone processing the data must comply with the Act's provisions, or risk incurring a financial penalty, civil liability or even criminal liability.

However, this presumes that the data being processed are personal data, since the Personal Data Act applies only to data that relate to specific individuals. The limits of what may be defined as personal data are therefore crucial to the law's applicability.

More simply put, the processing of personal data is covered by the Act, while the processing of anonymous or anonymised data is not. The same applies to data that are not linked to individuals at all. The distinction between personal data and anonymous data is the topic of this guide.

Personal data and anonymous data

It can be tricky to decide where the line between personal data and anonymous data should be drawn. The starting point for any such assessment is the legislation's definition of the term "personal data".

Personal data comprise information and assessments that may be linked to an individual (natural) person (Section 2 of the Personal Data Act).
This definition has three main components:

  1. It can relate to any form of information.
  2. The phrase "that may be linked to" is the bridge between 1 and 3.
  3. An identifiable or identified natural person.

We believe a certain understanding of what is deemed to constitute personal data is necessary in order to understand anonymisation. For a more thorough analysis of the personal data concept, see section 4.2 of the report Big Data – principles of personal data under pressure (2013, pdf), available from datatilsynet.no.

Here follows a brief discussion of the three elements:

1. Any form of information

All types of information are encompassed by the definition. Firstly, it means objective information, such as a person's age, address or annual income. Secondly, it can include subjective impressions, such as a person's assessments or characterisation of another individual. The veracity of the information is unimportant. It is an item of personal data irrespective of whether it is an assertion, verifiable fact or pure invention.

Nor is the term "personal data" restricted to matters traditionally associated with an individual's private life. Other, more prosaic matters, such as where one works or what one is studying, also fall within the definition of personal data.

The question of how worthy of protection the information is only arises at a later point in time, often in connection with an assessment of whether the way in which the data are processed complies with the law or not.

Nor is the format in which the data are held of any significance. Personal data can be expressed verbally, numerically, in drawings, photos, sound or as biometric characteristics. Furthermore, the data may be found in emails, in public case documents, on social media, in apps, text messages, online, etc.

2. The linking element

It must be possible to link the information to a physical (natural) person. Sometimes, this link is easy to recognise, sometimes not. For example, information on the condition of a vehicle will probably be associated primarily with the object itself. Nevertheless, that same information could also reveal matters relating to people connected with the object, such as the vehicle's owners. In certain circumstances, information on one person may, at the same time, constitute information about one or more others. This could be the case, for example, in a medical or genetic context.

Thus, the link between the information and the person may also be indirect. Such an indirect link is sufficient for the Act to be applicable. This follows directly from the wording of Section 2 of the Personal Data Act.

3. Identifiable natural person

The information must also be linked to an individual (natural) person, and that person must be identifiable. That a person has been identified means that he or she has been distinguished from a group of people. That the person is identifiable means that such identification is possible. That such identification could feasibly occur at some point in the future is sufficient.

Information may, at first glance, appear to be anonymous, but nevertheless constitute personal data in the eyes of the law. This is because it may be possible to identify one or more people indirectly. Examples include a vehicle's registration number or a smartphone's IMEI number. These data can, in certain cases, be linked to other data sets or other databases, thereby revealing the identity of the vehicle's or phone's owner. Other information that appears together with such numbers, such as where the vehicle or phone has been, will therefore also be considered to be personal data.
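
The indirect identification described above can be illustrated with a minimal sketch. All names and values below (`location_log`, `vehicle_register`, `link_to_owners`, the registration numbers and owners) are hypothetical and invented for illustration; they are not taken from any real register. The sketch only shows how a data set without names becomes personal data once it can be joined with a second data set.

```python
# Hypothetical data set 1: a location log with no names, only registration numbers.
location_log = [
    {"registration": "AB12345", "place": "Oslo", "time": "2015-03-01T08:15"},
    {"registration": "CD67890", "place": "Bergen", "time": "2015-03-01T09:40"},
]

# Hypothetical data set 2: a register that maps registration numbers to owners.
vehicle_register = {
    "AB12345": "Kari Nordmann",
    "CD67890": "Ola Nordmann",
}


def link_to_owners(log, register):
    """Attach an owner to every log entry whose registration number is in the register."""
    linked = []
    for entry in log:
        owner = register.get(entry["registration"])
        if owner is not None:
            linked.append(
                {"owner": owner, "place": entry["place"], "time": entry["time"]}
            )
    return linked


# After linking, each location entry is tied to a named person.
linked = link_to_owners(location_log, vehicle_register)
```

Because the join needs nothing more than a shared key, the location entries must be treated as personal data for as long as such a second data set can reasonably be obtained.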

Anonymous data

In the above, we have attempted to explain the meaning of the term "personal data". It is important to have a certain understanding of what personal data are in order to be able to determine what is required for an item of personal data to be deemed anonymised.
As previously mentioned, the Personal Data Act has no provisions with respect to the processing of anonymous or anonymised data, and the correct identification of where the line is drawn can therefore be of great significance.

Data of the type defined in point 1 above can be said to be anonymous when it is not possible to find any such linking element as stated in point 2, or the individual in point 3 is not identifiable.

Anonymous data can be defined as data that are impossible to link to an identifiable individual, taking account of all the means that might reasonably be used to identify the person concerned, either by the data controller or by any third party. (See recital 26 of the EU Data Protection Directive's preamble.)

Definitions

Anonymisation is the act of rendering personal data anonymous.

Pseudonymisation is the replacement of directly identifiable parameters with pseudonyms, which will still constitute unique identifying indicators.

De-identification is the removal of all uniquely personal characteristics from the data, so that they can no longer be linked to a specific individual.

Anonymisation

Anonymisation is the act of rendering personal data anonymous. In other words, data sets that can be linked to an identifiable person are prepared in such a way as to make it impossible to link the data to a specific person. Several techniques can be used to achieve this aim. The various techniques' strengths and weaknesses are described in the appendix to this guide (see page 16).

When the anonymising process is finished, it is important to realise that true anonymisation has been achieved only if the process is irreversible.

In other words, it must not be possible to re-establish the link between the data and the specific individual, taking account of the means that might reasonably be used to identify the person concerned, as mentioned earlier.

Whether the data make it possible to identify a person, or may instead be considered anonymous, depends on the actual circumstances. The assessment must rest on the likelihood of re-identification. Each individual case must be assessed and analysed not only on the basis of the means available today, but also with an eye on tomorrow's technology, within reasonable limits. The benchmark is the extent to which such means can reasonably be expected to be used to discover the identities of the people concerned.
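
One rough indicator that can feed into such an assessment is how many records share the same combination of indirectly identifying attributes. The sketch below is an illustration only and not a test prescribed by this guide or by the Act; the attribute names, the sample records and the helper functions `smallest_group_size` and `unique_records` are all hypothetical. A record that is alone in its group is a natural candidate for re-identification by anyone who knows those attributes about a person.

```python
from collections import Counter

# Hypothetical records with direct identifiers already removed.
records = [
    {"postcode": "0150", "birth_year": 1980, "sex": "F"},
    {"postcode": "0150", "birth_year": 1980, "sex": "F"},
    {"postcode": "0150", "birth_year": 1975, "sex": "M"},
    {"postcode": "5003", "birth_year": 1990, "sex": "F"},
]


def group_sizes(rows, attributes):
    """Count how many rows share each combination of the given attributes."""
    return Counter(tuple(row[a] for a in attributes) for row in rows)


def smallest_group_size(rows, attributes):
    """Size of the smallest group; 1 means at least one row is unique."""
    return min(group_sizes(rows, attributes).values())


def unique_records(rows, attributes):
    """Return the rows that are alone in their attribute combination."""
    sizes = group_sizes(rows, attributes)
    return [row for row in rows if sizes[tuple(row[a] for a in attributes)] == 1]


attributes = ["postcode", "birth_year", "sex"]
smallest = smallest_group_size(records, attributes)
singled_out = unique_records(records, attributes)
```

A small group size does not by itself settle the legal question, and a large one does not prove anonymity; the count is only one input to the case-by-case assessment described above.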

Sometimes, anonymisation is confused with two similar phenomena, pseudonymisation and de-identification. Such confusion may be unfortunate. At worst, it could result in the commission of a criminal offence, with all the consequences that could entail.

Pseudonymisation

Pseudonymisation is the replacement of directly identifiable parameters with pseudonyms, which will still constitute unique identifying indicators. A likelihood therefore exists that the specific individual may be indirectly identified.

Indeed, it is often the point that the same (pseudonymised) person can be tracked over a certain period of time, in connection with research studies, for example. We therefore find ourselves within the scope of the Personal Data Act's definition of personal data, with the consequence that the Act's provisions must be respected.
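
A minimal sketch of why pseudonymised data remain linkable is given below. It uses a keyed hash (HMAC-SHA256 from the Python standard library) as one possible way of deriving pseudonyms; this choice is an assumption made for illustration, not a technique prescribed by this guide. The key, the sample records and the function `pseudonymise` are hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret key held by the data controller (assumption for illustration).
SECRET_KEY = b"hypothetical-key-kept-by-the-data-controller"


def pseudonymise(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Derive a stable pseudonym from a direct identifier using a keyed hash."""
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # shortened for readability only


# Hypothetical study records containing a direct identifier.
records = [
    {"name": "Kari Nordmann", "visit": "2015-01-10"},
    {"name": "Ola Nordmann", "visit": "2015-01-12"},
    {"name": "Kari Nordmann", "visit": "2015-02-03"},
]

# Replace the name with a pseudonym; the rest of the record is kept unchanged.
pseudonymised = [
    {"pseudonym": pseudonymise(r["name"]), "visit": r["visit"]} for r in records
]
```

The first and third records receive the same pseudonym, which is what allows the same person to be followed over time. It also means that anyone holding the key, or enough auxiliary information, may be able to work back to the individual, which is why such data are still personal data.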

In other words, it is extremely important to be aware of this distinction, since pseudonymised data are subject to the provisions of the Personal Data Act, while the opposite is the case with respect to anonymous data.

However, this does not mean that pseudonymisation is without merit. Pseudonymisation can make it more difficult to link a specific data set to the data subject's identity. It can therefore be seen as a useful technique for promoting privacy. Pseudonymisation may protect the individual to whom the data are linked, and it may be easier to justify the processing of such pseudonymised data in relation to one or more of the lawful grounds provided in the Act.

The terms we use in this guide are based on shared European assessments (see, for example, the Article 29 Working Party's opinion/recommendations on anonymisation techniques (pdf)). They may deviate from the way in which such terms are understood in Norway, particularly in the health sector. In the Personal Health Data Filing Systems Act, which came into force on 1 January 2015, the definitions of pseudonymised and de-identified health data were replaced by the broader term "indirectly identifiable health data". (Further information on the term pseudonymisation as it was understood prior to the new legislation can be found in Circular I-8/2005 (regjeringen.no, pdf). This also applies to the legal sense of the term "encryption", which deviates from how the term is used in this guide, where it denotes a technique.)

Advantages of anonymised data

If you have a sufficiently robust and securely anonymised data set, you can make use of the information without any risk of contravening the Personal Data Act. You do not need to take account of the duties applying to the data controller, and further use and analysis of this type of data is not subject to any notification or licensing requirement.

Nor do you need to make sure that there are lawful grounds for processing the data, or comply with requirements relating to relevance or purpose. Furthermore, the data holder has no obligation to delete the data, etc.

Anonymisation could be the solution in cases where there are doubts about whether the law permits personal data to be processed in a certain way. In cases where the law specifically precludes the processing of personal data, the answer could be to render the data anonymous, since anonymous data fall outside the scope of the Personal Data Act.

Anonymisation and the concept of data processing

It is also a prerequisite that the data to be rendered anonymous have been collected and processed in accordance with the Personal Data Act's provisions. In theory, the very act of anonymisation must be deemed to constitute the processing of personal data. Consequently, anyone undertaking the anonymisation of the data must respect the requirements set out in Section 11 of the Personal Data Act during the anonymisation process. (See section 2.2.1 in the Article 29 Working Party's opinion/recommendations on anonymisation techniques (pdf) for further details.) The restrictions relating to purpose set out in Section 11(1)(c) must, for example, be respected.

Grounds for anonymisation will probably often be found in the so-called balancing of interests stipulated in Section 8(f) of the Personal Data Act. The provision states that personal data may be processed only if the processing thereof enables the data controller, or third parties to whom the data are disclosed, to protect a legitimate interest, except where such interest is overridden by the interests of the data subject.

In other words, a legitimate interest must exist, and the Act's stipulation of necessity must have been met. A key issue, however, is that the data subject's privacy can be said to have been infringed to only a minor degree by the anonymisation of data that can be linked to him or her. This will naturally play an important role in the balancing of each party's interests.

If anonymisation is deemed to constitute the processing of data in the legal sense, it is clear that anonymisation cannot "repair" a lack of lawfulness or legitimacy in the original data collection. In other words, it is not possible to first collect personal data in contravention of the law and then render them anonymous.

Anonymisation in brief

  • Personal data are made up of information and assessments that can be linked to a specific person.
  • It can be difficult to draw the line between personal data and anonymous data. It is therefore important to have some understanding of what personal data are.
  • Anonymous data fall outside the scope of the Personal Data Act. For this Act not to apply, it is crucial that the anonymisation of data is real. In other words, it must be impossible to recreate any link between the data and the individual concerned, taking into account the means that might reasonably be used.
  • The advantage of anonymisation is that the further processing of the data can take place without incurring any form of processing liability.
  • Anonymisation will not always be necessary. In many cases, the data will be processed in accordance with one or more of the lawful grounds provided in the Personal Data Act.