Whose Data Is It Anyway? Towards a Formal Treatment of Differential Privacy for Surveys
Abstract
"This paper develops theory for understanding and implementing differential privacy in the context of survey statistics. By recognizing the major phases in the survey-data pipeline, we identified ten different settings of DP. These settings correspond to different choices for 1) where the DP data-release mechanism starts in this pipeline; and for 2) which of the previous phases are taken as invariant. Section 3 formalized these ten settings into ten different conditions on the DP flavor. Sections 4 and 5 show that the choice of the setting has significant impacts in terms of both privacy and utility. Therefore, while DP is invariant to post-processing, pre-processing steps matter. Moreover, the data custodian must necessarily choose a setting – they cannot implement DP without first deciding (perhaps implicitly) where the DP mechanism starts and which pre-processing steps are taken as invariant. Hence, contrary to commonly-held beliefs, DP does make important assumptions on the data and on the attacker, because the data custodian’s decision impacts both the utility and privacy semantics of the DP-outputted data." (Text excerpt, IAB-Doku)
Cite article
Bailie, J. & Drechsler, J. (2024): Whose Data Is It Anyway? Towards a Formal Treatment of Differential Privacy for Surveys. In: National Bureau of Economic Research (2024): Data Privacy Protection and the Conduct of Applied Research: Methods, Approaches and their Consequences, Spring 2024, Washington, p. 1-33.
Further information
The individual conference contributions are available here free of charge.