Project

Private Data Release using Sanitizer

We propose Sanitizer, a framework that protects against leakage of sensitive information in order to facilitate task-independent data release to untrusted parties. It works in two steps. First, we develop a framework that encodes unstructured data into a structured representation split into sensitive and non-sensitive parts. Second, we design mechanisms that transform the sensitive features so that leakage of sensitive information is minimal. Instead of removing sensitive information from the unstructured data, we replace the sensitive features with synthetic ones sampled from the joint distribution of the sensitive features in the structured representation. The resulting sanitized dataset preserves the distribution of the original dataset, which yields a good utility-privacy trade-off. We compare our technique against state-of-the-art baselines and demonstrate competitive empirical results, both quantitatively and qualitatively.
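To make the second step concrete, the following is a minimal sketch of the replacement idea, not the Sanitizer implementation. It assumes the encoder has already produced one structured record per data point, and it stands in for the sampling mechanism with a draw from the empirical joint distribution of the sensitive features. The function name `sanitize` and the feature keys `s1`, `s2`, `u1`, `u2` are hypothetical and chosen only for illustration.

```python
import random

def sanitize(records, sensitive_keys, rng):
    """Replace each record's sensitive features with a joint draw.

    Assumption: the joint distribution is approximated by the empirical
    pool of sensitive tuples; a learned sampler could be swapped in.
    """
    # Pool sensitive features as whole tuples so each draw keeps the
    # dependencies between the sensitive features.
    pool = [tuple(r[k] for k in sensitive_keys) for r in records]
    released = []
    for r in records:
        draw = rng.choice(pool)                # one joint sample
        new = dict(r)                          # non-sensitive part is copied as-is
        new.update(zip(sensitive_keys, draw))  # sensitive part is replaced
        released.append(new)
    return released

# Hypothetical structured representation: s* are sensitive, u* are not.
records = [
    {"s1": 0.1, "s2": 1.0, "u1": 5.0, "u2": -2.0},
    {"s1": 0.7, "s2": 0.0, "u1": 3.5, "u2": 0.4},
    {"s1": 0.3, "s2": 1.0, "u1": 4.2, "u2": 1.1},
]
released = sanitize(records, ["s1", "s2"], random.Random(0))
```

Drawing the sensitive features as a tuple, rather than feature by feature, is what keeps their joint distribution intact in the released records while breaking the link between each record and its own sensitive values.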
