General Data Protection Regulation reaches its first birthday
This blogpost arrives as the General Data Protection Regulation (GDPR) reaches its first birthday, and a week after a report from the Washington-based Center for Data Innovation (CDI) suggested amendments to the GDPR.
The report argues that regulatory relaxations would help foster Europe’s ‘Algorithmic Economy,’ contending that the GDPR’s restrictions on data sharing will set back European competitiveness in AI technology.
Citing the European Commission’s ambition “for Europe to become the world-leading region for developing and deploying cutting-edge, ethical and secure AI,” the report then proceeds to its central claim that “the GDPR, in its current form, puts Europe’s future competitiveness at risk.”
That being said, the report notes with approval France’s pro-AI strategy within the GDPR framework, in particular the country’s use of the clause that “grants them the authority to repurpose and share personal data in sectors that are strategic to the public interest—including health care, defense, the environment, and transport.”
Research is still being conducted into the legal and ethical dimensions of AI and the potential ramifications of automated decision-making for data subjects. In the UK, the ICO and the government’s recently established advisory board, the Centre for Data Ethics and Innovation (CDEI, not to be confused with the aforementioned CDI), are opening discussions and issuing calls for evidence on individuals’ and organisations’ experiences with AI. There are of course responsible ways of using AI, and organisations hoping to make the best of this new technology have the opportunity to shape the future of Europe’s innovative but ethical use of data.
Information Commissioner’s Office (ICO) research fellows and technology policy advisors release a short brief on combatting Artificial Intelligence (AI) security risks
The ICO’s relatively young AI auditing framework blog (set up in March this year) discusses security risks in its latest post, comparing data protection threats in traditional technology with those in AI systems. The post focuses “on the way AI can adversely affect security by making known risks worse and more challenging to control.”
From a data protection perspective, the main issue with AI is its complexity and the volume not only of data but also of the ‘external dependencies’ that AI requires to function, particularly in the AI subfield of Machine Learning (ML). These dependencies take the form of third-party or open-source software used for building ML systems, or third-party consultants or suppliers who use their own ML system or one that is itself partly dependent on external components. An ML system may have over a hundred external dependencies, including code libraries, software or even hardware, and its effectiveness will be determined by its interaction with multiple data sets from a huge variety of sources.
Organisations will often have no contract with these third parties, making data flows throughout the supply chain difficult to track and data security hard to maintain. AI developers come from a wide array of backgrounds, and there is no unified or coherent policy for data protection within the AI engineering community.
The ICO’s AI auditing blog uses the example of an organisation hiring a recruitment firm that uses Machine Learning to match candidate CVs to job vacancies. Even with manual methods, a certain amount of personal data would have been transferred between the organisation and the recruitment firm. However, the additional steps in an ML system mean that data will be stored and transferred in different formats across different servers and systems. The authors conclude, “for both the recruitment firm and employers, this will increase the risk of a data breach, including unauthorised processing, loss, destruction and damage.”
For example, they write:
- The employer may need to copy HR and recruitment data into a separate database system to interrogate and select the data relevant to the vacancies the recruitment firm is working on.
- The selected data subsets will need to be saved and exported into files, and then transferred to the recruitment firm in compressed form.
- Upon receipt the recruitment firm could upload the files to a remote location, eg the cloud.
- Once in the cloud, the files may be loaded into a programming environment to be cleaned and used in building the AI system.
- Once ready, the data is likely to be saved into a new file to be used at a later time.
This example will be relevant to all organisations contracting external ML services, which is the predominant method for UK businesses hoping to harness the benefits of AI. The blog provides three main pieces of advice based on ongoing research into this new, wide (and widening) area of data security. It suggests that organisations should:
- Record and document the movement and storing of personal data, noting when the transfer took place, the sender and recipient, and the respective locations and formats. This will help monitor risks to security and data breaches;
- Delete intermediate files, such as compressed versions, once they are no longer required, as per best-practice data protection guidelines; and
- Use de-identification and anonymisation techniques and technologies on personal data before it is taken from the source and shared either internally or externally.
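
As a rough illustration of the first and third points, the Python sketch below records a transfer and replaces direct identifiers with keyed hashes before the data leaves its source. It is a minimal sketch rather than anything taken from the ICO's post: the names `TransferRecord`, `pseudonymise` and `prepare_for_sharing`, the field names, and the example parties and data are all assumptions made for illustration. Note that keyed hashing is pseudonymisation rather than full anonymisation, so the output should still be treated as personal data.

```python
import hashlib
import hmac
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class TransferRecord:
    """One movement of personal data (hypothetical structure)."""
    sender: str
    recipient: str
    source_location: str
    destination_location: str
    data_format: str
    # Timestamp is set automatically, in UTC, when the record is created.
    transferred_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def pseudonymise(value: str, secret_key: bytes) -> str:
    """Replace a direct identifier with an HMAC-SHA256 digest.

    Anyone holding the key can recompute the mapping, so the key
    should stay with the organisation that holds the source data.
    """
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def prepare_for_sharing(rows, identifier_fields, secret_key):
    """Return copies of the rows with the named string fields pseudonymised."""
    return [
        {
            name: pseudonymise(value, secret_key) if name in identifier_fields else value
            for name, value in row.items()
        }
        for row in rows
    ]


# Hypothetical data and parties, for illustration only.
transfer_log = []
candidates = [{"name": "A. Candidate", "email": "a@example.com", "skills": "python"}]

# De-identify before the data leaves the source system.
shared = prepare_for_sharing(candidates, {"name", "email"}, secret_key=b"demo-key-only")

# Record who sent what to whom, where, and in which format.
transfer_log.append(
    TransferRecord(
        sender="employer-hr",
        recipient="recruitment-firm",
        source_location="hr-database",
        destination_location="cloud-storage",
        data_format="csv.gz",
    )
)
```

In practice the log would be written to durable, access-controlled storage rather than an in-memory list, and the key would come from a secrets manager rather than appearing in code.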
Harry Smithson, May 2019