Fairness Implications of Data Minimization in Deep Collaborative Filtering
DOI:
https://doi.org/10.32473/flairs.39.1.141772
Keywords:
Fairness in ML, Recommender Systems, Active Learning, Data Minimization
Abstract
Data minimization, a core principle of the General Data Protection Regulation (GDPR), requires limiting personal data "[...] to the purpose for which they are processed." However, data minimization still lacks a clear definition, and its algorithmic implications for machine learning remain insufficiently understood. This gap is particularly notable in Recommender Systems (RS) research: RS rely on large-scale data collection and processing, and it remains unclear how data minimization should be implemented in such models. The question is important because any limitation on data may affect both accuracy and system fairness, due to disproportionate data processing across different user groups. In this paper, we study the practical implications of data minimization in RS. We analyze the performance of RS when data minimization is operationalized via Active Learning (AL). We implement a set of commonly used AL strategies and thoroughly evaluate them empirically with respect to accuracy and fairness. To generate recommendations, we use a popular type of RS, namely deep Collaborative Filtering, which uses state-of-the-art deep learning methods to learn from user data. Our results demonstrate that, depending on the type of RS, certain AL strategies improve model performance to a greater extent than others. Nonetheless, all the AL strategies negatively affect fairness, leading to trade-offs in implementing data minimization for RS.
License
Copyright (c) 2026 Sipei Li, Nasim Sonboli, Mehdi Elahi, Alva Couch

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.