Publication

Working with Synthetic Data: The Dos, Dangers, and Don’ts

Abstract

"In this How-to Guide, we want to help you understand what synthetic data are. When are such data useful? And what does it mean to protect the privacy of individuals in the original data set? Accounting for the tradeoff between utility and privacy, we outline why generating and using synthetic data require great care. Synthetic data can be valuable when stringent data privacy requirements are not a primary issue. In other cases, privacy is a major - and often underappreciated - concern. Generating synthetic data then requires considerable thought to protect the privacy of observations in the data. Typically, data synthesizers would add statistical noise when generating synthetic data, which deteriorates the utility of the synthetic data. Under no circumstances must analysts generate and rely on synthetic data without giving clear thought to utility and privacy. This guide offers clear guidelines on working with synthetic data and navigating these sometimes treacherous waters." (Author's abstract, IAB-Doku, © Sage) ((en))

Cite article

Arnold, C.; Neunhoeffer, M. (other contributor) (2025): Working with Synthetic Data: The Dos, Dangers, and Don’ts. In: Sage (ed.) (2025): Sage Research Methods: Data and Research Literacy, n. pag. DOI:10.4135/9781036222734