Exploring the use of data-analytics technologies in the fight against child exploitation is for the brave. Tackling such a complex and multifaceted issue may appear overwhelming when all the legal, regulatory, social, and economic aspects are considered. Children are vulnerable by default, and those who unfortunately end up in contexts of sexual and non-sexual exploitation are among the most vulnerable individuals in society. In addition, children who live in certain societal groups are disproportionately at risk of exploitation. Moreover, the economic and social context surrounding these children, including their network of acquaintances, is peculiar and delicate, and requires extra care when algorithms are designed, to ensure that they are accurate, efficient, and not prone to biases of any kind.
These aspects make it particularly difficult to design a fit-for-purpose, data-driven solution to support police forces in the fight against child exploitation, and are among the many reasons why the number of players remains limited.
The regulatory rollercoaster for data-analytics organisations starts, however, well before the above issues are even considered, and revolves around a key legal subject: personal data protection.
The European Union – and the United Kingdom by virtue of its former EU membership – are equipped with one of the most advanced, if not the most advanced, data-protection regimes on the planet. Individuals have a fundamental right to the protection of all information about them, and the processing of personal data is prohibited unless stringent requirements are fulfilled.
To develop a data-driven solution to counter child exploitation, all stakeholders, ranging from social services and police forces to the very data scientists working on the algorithm, must assess whether they are allowed to collect, share, use, and store data about minors and their contacts.
In the context of the CESIUM project, Trilateral Research has leveraged a decade of data-protection and research-ethics experience to ensure that societal impact can be achieved without compromising the rights and freedoms of individuals.
Step 1: Synthetic data
The first step in tackling data protection in such a project is to assess whether personal data is required in the first place.
Should the objective be achievable without the use of personal data, i.e., data relating to an identifiable living individual, the project would not be subject to data-protection laws.
After careful consideration, our data scientists and our ethics experts determined that synthetic data would be useful, but not sufficient to develop a fit-for-purpose solution, since synthetic data is created with the use of algorithms and could inherit the algorithms’ biases.
Step 2: Identification of the relevant personal data
Once the exclusive use of synthetic data had been ruled out, the Trilateral Research team worked with our law-enforcement partner to identify what personal data was necessary and relevant to achieve the purpose.
The relevant information streams were identified, and an inventory was created listing each data item required for the research, together with the databases and/or documents where these data items are stored.
Step 3: Data Protection Impact Assessment
Once the relevant data had been identified, a Data Protection Impact Assessment (DPIA) was initiated to help design the project work packages with a view to minimising risks by design.
The DPIA for CESIUM was maintained and updated throughout the project, and served as a key document for guiding both data-protection and non-data-protection actions in the project.
Step 4: Data sharing agreements
Data sharing agreements do not mitigate personal-data risks by themselves. However, they are a critical piece of data-protection compliance and ensure that the data controller can exercise their role to the fullest extent, ultimately protecting their own data subjects from harm.
The organisation providing the data science is therefore bound by high security and processing standards, contributing to a fair and legitimate use of personal data for this important purpose.
Step 5: Tackling emerging problems – the example of data silos and masking
One of the challenges of data analytics is that, typically, more data means better opportunities for analysis and, hopefully, a more accurate assessment. This means that, at least at the development stage, data scientists hope to obtain data from multiple sources. This comes with additional challenges, especially when combining datasets would create an extensive profile of the individual. It may also create issues for a future deployment of the solution, since data may become visible to other organisations. To avoid this, the Trilateral Research team has studied ways to silo and mask the data so that it cannot be accessed by other organisations.
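To illustrate the general idea of masking within silos, here is a minimal sketch in Python. It is not the CESIUM implementation, which this article does not describe. It assumes one common approach, keyed hashing (HMAC-SHA256) with a separate secret key per organisation, so that the same identifier produces different tokens in different silos. The function names, field names, keys, and sample record are all hypothetical.

```python
import hashlib
import hmac
import secrets


def mask_identifier(identifier: str, silo_key: bytes) -> str:
    """Return a keyed, one-way token for an identifier (HMAC-SHA256, hex-encoded)."""
    digest = hmac.new(silo_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def mask_record(record: dict, fields_to_mask: set, silo_key: bytes) -> dict:
    """Return a copy of the record with the selected fields replaced by tokens."""
    return {
        field: mask_identifier(str(value), silo_key) if field in fields_to_mask else value
        for field, value in record.items()
    }


# Hypothetical: each organisation (silo) holds its own secret key.
police_key = secrets.token_bytes(32)
social_services_key = secrets.token_bytes(32)

# Hypothetical record; no real data.
record = {"person_id": "P-0001", "name": "Example Name", "risk_flag": True}
sensitive_fields = {"person_id", "name"}

masked_police = mask_record(record, sensitive_fields, police_key)
masked_social = mask_record(record, sensitive_fields, social_services_key)
```

Because each silo uses a different key, the token for `person_id` in `masked_police` does not match the one in `masked_social`, so the two masked datasets cannot be joined on that field without access to the keys. Masked data of this kind is pseudonymised rather than anonymous, so it would still be treated as personal data.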
Data management and erasure
Finally, the Trilateral Research teams have followed existing good practice and the data controller's policies in the day-to-day management of the personal data shared for the purposes of the project. This includes the use of specialised, dedicated hardware for data processing, and compliance with rules and arrangements for data erasure at the end of the project.
The CESIUM application is unique in that it takes into account the ethical and data-protection issues associated with data-driven decision making in policing. By facilitating the early identification of child exploitation, CESIUM will be an invaluable data-driven, ethically sound, privacy-conscious support to the work of law enforcement in the UK and overseas to protect the most vulnerable in our societies. For more information, please feel free to contact our staff, who would be more than happy to discuss your needs and collaboration opportunities.