A Consensus Privacy Metrics Framework for Synthetic Data
Abstract
"Synthetic data generation is a promising approach for sharing data for secondary purposes in sensitive sectors. However, to meet ethical standards and legislative requirements, it is necessary to demonstrate that the privacy of the individuals upon which the synthetic records are based is adequately protected. Through an expert consensus process, we developed a framework for privacy evaluation in synthetic data. The most commonly used metrics measure similarity between real and synthetic data and are assumed to capture identity disclosure. Our findings indicate that they lack precise interpretation and should be avoided. There was consensus on the importance of membership and attribute disclosure, both of which involve inferring personal information. The framework provides recommendations to effectively measure these types of disclosures, which also apply to differentially private synthetic data if the privacy budget is not close to zero. We further present future research opportunities to support widespread adoption of synthetic data." (Author's abstract, IAB-Doku, © Elsevier)
Cite article
Pilgram, L., Dankar, F., Drechsler, J., Elliot, M., Domingo-Ferrer, J., Francis, P., Kantarcioglu, M., Kong, L., Malin, B., Muralidhar, K., Myles, P., Prasser, F., Raisaro, J., Yan, C. & El Emam, K. (2025): A Consensus Privacy Metrics Framework for Synthetic Data. In: Patterns. DOI:10.1016/j.patter.2025.101320