In the sandbox, Finterai and the Norwegian Data Protection Authority have explored data protection issues relating to the development of an anti-money laundering solution based on federated learning. This report is not an exhaustive discussion of the questions federated learning raises with respect to the data protection regulations, and the Norwegian Data Protection Authority would like to highlight Privacy by Design as an area for further deliberation.
Privacy by Design
This principle requires that the fundamental principles relating to the processing of personal data, set out in Article 5 of the GDPR, be taken into account in all phases of the lifecycle of a software program that processes personal data, so that the data subjects’ rights and freedoms are upheld.
Data protection shall be integrated into the technology and included in the planning phase of the solution's development. The safeguarding of data privacy shall also be a natural part of the development process and not something that comes up once a technical solution is almost fully developed.
The Norwegian Data Protection Authority has prepared a comprehensive guide to software development with Privacy by Design, which can help companies to understand and comply with the data protection requirements.
Companies which take data protection seriously build trust. Privacy by Design is also a legal requirement: Article 25 of the GDPR requires enterprises to build data protection into the technology from the planning phase onwards and to make it the default setting.
The sandbox project has not had the capacity to explore in depth what Privacy by Design means in the context of machine learning based on federated learning principles, nor has any conclusion been reached as to whether Finterai meets the requirements in Article 25 of the GDPR. Finterai’s federated learning solution may be an inherently more privacy-friendly technology than more traditional machine learning approaches, because the method allows participants in the federated learning system to learn from each other's data without sharing the underlying data. It is precisely this built-in restriction on the further sharing of local data that makes the technology more privacy friendly.
Nevertheless, Finterai must meet the requirements in Article 25 of the GDPR to be relevant for customers who are obligated to choose solutions with Privacy by Design. It would be very useful to explore other technical and organisational measures which could effectively build in data protection during the development of the solution.
The Norwegian Data Protection Authority considers that the interface between the data protection regulations and the anti-money laundering regulations should be subject to further examination. At present, it is uncertain how the relationship between the data protection and anti-money laundering regulations affects which data enterprises may collect and use in their anti-money laundering endeavours.
Federated learning in other fields
Going forward, it will be relevant to monitor new areas of application for federated learning. The method is generally useful when:
- There are few examples of at least one class of data.
- Opportunities for data sharing are limited.
- Cooperation is necessary.
- There is little relevant data.
The battle waged by insurance companies against insurance fraud has many similarities with the banks’ anti-money laundering endeavours and is therefore an obvious field for the application of federated learning.
Solutions that learn from official register data may also be relevant for the method. In Norway, we gather vast amounts of information about private individuals in a variety of official registers. For example, Norway’s health data is considered to be among the best in the world (www.ehelse.no). This provides unique opportunities to develop accurate and effective solutions, as well as study connections in areas as varied as the high school drop-out rate, pension schemes and public health. This wealth of information also comes with privacy dilemmas, because those who process register-related data could potentially re-identify individual people. However, with federated learning, different entities can train the same algorithm on their internal datasets, while data from the original source are not shared.
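The idea that several entities can train the same algorithm on their internal datasets without sharing the source data can be illustrated with a small sketch. The example below uses federated averaging, one common federated learning scheme, with a simple linear model and synthetic data. It is not a description of Finterai's solution: the function names, the model, the learning rate and the datasets are all assumptions made for illustration. Note that only model weights move between the participants and the coordinator; the local data records never leave the function that trains on them.

```python
import random


def local_update(weights, data, lr=0.1, epochs=20):
    """One participant refines the shared model on its own data.

    Only the resulting weights are returned; `data` never leaves this function.
    (Hypothetical helper: a plain linear model y = w * x + b trained by
    gradient descent on the mean squared error.)
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum((w * x + b - y) * x for x, y in data) / n
        grad_b = sum((w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(updates, sizes):
    """Coordinator step: average the weights, weighted by local dataset size."""
    total = sum(sizes)
    w = sum(u[0] * s for u, s in zip(updates, sizes)) / total
    b = sum(u[1] * s for u, s in zip(updates, sizes)) / total
    return (w, b)


def make_dataset(rng, n):
    """Synthetic local dataset following y = 2x + 1 plus a little noise."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.05)) for x in xs]


def train(datasets, rounds=30):
    """Run several rounds of local training followed by averaging."""
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        # Each participant starts from the current shared model.
        updates = [local_update(global_weights, d) for d in datasets]
        # The coordinator only ever sees weights and dataset sizes.
        global_weights = federated_average(updates, [len(d) for d in datasets])
    return global_weights


# Three hypothetical participants with datasets of different sizes.
rng = random.Random(0)
datasets = [make_dataset(rng, n) for n in (30, 50, 80)]
w, b = train(datasets)
print(f"shared model: y = {w:.2f} * x + {b:.2f}")
```

In this sketch the shared model should end up close to the relationship underlying all three datasets, even though no participant has seen another's records. Whether the exchanged weights themselves can reveal personal data is a separate question that a sketch like this does not address.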
In Norway, we need a better understanding of privacy-friendly technology. That companies like Finterai wish to take the lead, and openly explore their solution in the sandbox, helps to lower the risk threshold associated with developing new AI-based solutions and also provides experience of how such technology works in practice. We hope the sandbox project’s assessments will contribute to innovation through the secure sharing of data and make it easier for developers to comply with the GDPR's requirements.