Finished a project on sensitive data detection with the UN

We’ve collaborated with the UN on contextually sensitive data detection in tabular data!


Together with a great team at the UN Humanitarian Data Centre, Liang Telkamp (MSc) and Madelon Hulsebos (PI) developed new mechanisms for detecting contextually sensitive data in tabular datasets using information retrieval, LLMs, and data semantics. Early evaluations show promising performance: our techniques largely outperform existing commercial tools.

Liang has completed her master's thesis (read a report here) with us and will now work at the UN to deploy the methods we developed in the Humanitarian Data Exchange (HDX). Great to see the societal impact of our research materialize! Madelon will present and discuss the findings at the UN’s expert meeting on statistical data disclosure in Barcelona in October. While our research focused on contextually sensitive data detection in humanitarian datasets, the mechanisms are also applicable in other contexts such as healthcare, enterprises, and governments.

This project was also covered by several scientific news outlets: CWI, Amsterdam AI, and Computable.