Some clarifications regarding fully synthetic data
Abstract
"There has been some confusion in recent years in which circumstances datasets generated using the synthetic data approach should be considered fully synthetic and which estimator to use for obtaining valid variance estimates based on the synthetic data. This paper aims at providing some guidance to overcome this confusion. It offers a review of the different approaches for generating synthetic datasets and discusses their similarities and differences. It also presents the different variance estimators that have been proposed for analyzing the synthetic data. Based on two simulation studies the advantages and limitations of the different estimators are discussed. The paper concludes with some general recommendations how to judge which synthesis strategy and which variance estimator is most suitable in which situation." (Publisher information, © Springer)
Cite article
Drechsler, J. (2018): Some clarifications regarding fully synthetic data. In: J. Domingo-Ferrer & F. Montes (Eds.) (2018): Privacy in statistical databases: UNESCO Chair in Data Privacy International Conference, PSD 2018, Valencia, Spain, September 26-28, 2018, Proceedings (Lecture Notes in Computer Science, 11126), pp. 109-121. DOI:10.1007/978-3-319-99771-1_8