Research Challenges in Designing Differentially Private Text Generation Mechanisms
DOI:
https://doi.org/10.32473/flairs.v34i1.128461Resumen
Accurately learning from user data while ensuring quantifiable privacy guarantees provides an opportunity to build better ML models while maintaining user trust. Recent literature has demonstrated the applicability of a generalized form of Differential Privacy to provide guarantees over text queries. Such mechanisms add privacy preserving noise to vectorial representations of text in high dimension and return a text based projection of the noisy vectors. However, these mechanisms are sub-optimal in their trade-off between privacy and utility.
In this proposal paper, we describe some challenges in balancing this trade-off. At a high level, we provide two proposals: (1) a framework called LAC which defers some of the noise to a privacy amplification step and (2), an additional suite of three different techniques for calibrating the noise based on the local region around a word. Our objective in this paper is not to evaluate a single solution but to further the conversation on these challenges and chart pathways for building better mechanisms.