Gaussian Mixture Models for Classification and Hypothesis Tests Under Differential Privacy

Xiaosu Tong; Bowei Xi; Murat Kantarcioglu; Ali Inan

doi:10.1007/978-3-319-61176-1_7

Conference Papers Year : 2017


Xiaosu Tong

Function : Author
PersonId : 1026625

Amazon

Bowei Xi

Function : Author
PersonId : 1026626

Purdue University [West Lafayette]

Murat Kantarcioglu

Function : Author
PersonId : 1010022

Department of Computer Science [Dallas]

Ali Inan

Function : Author
PersonId : 1026627

Adana Science and Technology University

Abstract

Many statistical models are constructed from very basic statistics: mean vectors, variances, and covariances. Gaussian mixture models are one such class of models. When a data set contains sensitive information and cannot be released directly to users, such models can easily be constructed from noise-added query responses, and they still provide preliminary results to users. Although the queried basic statistics meet the differential privacy guarantee, the complex models constructed from these statistics may not. However, it is up to the users to decide how to query a database and how to further use the queried results. In this article, our goal is to understand the impact of the differential privacy mechanism on Gaussian mixture models. Our approach involves querying basic statistics from a database under differential privacy protection, and using the noise-added responses to build classifiers and perform hypothesis tests. We find that adding Laplace noise may have a non-negligible effect on model outputs. For example, the variance-covariance matrix after noise addition may no longer be positive definite. We propose a heuristic algorithm to repair the noise-added variance-covariance matrix. We then examine the classification error when using the noise-added responses, through experiments with both simulated data and real-life data, and demonstrate under which conditions the impact of the added noise can be reduced. We compute the exact type I and type II errors under differential privacy for the one-sample z test, the one-sample t test, and the two-sample t test with equal variances. We then show under which conditions a hypothesis test returns a reliable result given differentially private means, variances, and covariances.
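To make the hypothesis-testing point concrete, the following is a minimal sketch, not the authors' exact computation: it releases a sample mean through the Laplace mechanism and feeds it to a one-sample z test, then estimates the type I error by simulation. All function names are hypothetical, and the setup rests on assumptions not stated in the abstract: values are clipped to a known range [lower, upper], the record count n is public (so the sensitivity of the mean is (upper - lower) / n), and the population standard deviation is known.

```python
import math
import random
from statistics import NormalDist


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_mean(values, lower, upper, epsilon, rng):
    """Release the mean of values clipped to [lower, upper] with Laplace noise.

    Assumption: n is public, so the mean has sensitivity (upper - lower) / n.
    """
    n = len(values)
    clipped = [min(max(v, lower), upper) for v in values]
    sensitivity = (upper - lower) / n
    return sum(clipped) / n + laplace_noise(sensitivity / epsilon, rng)


def z_test_rejects(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-sided one-sample z test with known sigma; True means reject H0."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return abs(z) > critical


def simulated_type_one_error(n, epsilon, trials, rng, mu0=0.5, sigma=0.1):
    """Estimate how often the test rejects a true H0 when given the noisy mean."""
    rejections = 0
    for _ in range(trials):
        data = [rng.gauss(mu0, sigma) for _ in range(n)]
        noisy = private_mean(data, 0.0, 1.0, epsilon, rng)
        rejections += z_test_rejects(noisy, mu0, sigma, n)
    return rejections / trials


if __name__ == "__main__":
    rng = random.Random(0)
    # Illustrative parameter values only; they are not taken from the paper.
    for epsilon in (0.1, 1.0, 10.0):
        rate = simulated_type_one_error(n=50, epsilon=epsilon, trials=2000, rng=rng)
        print(f"epsilon={epsilon}: estimated type I error = {rate:.3f}")
```

In this sketch the test statistic ignores the extra variance contributed by the Laplace noise, so the rejection rate under a true null hypothesis grows as epsilon shrinks or as n decreases. That is one way to see why a test on differentially private statistics is only reliable under certain conditions.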

Keywords

Differential privacy; Statistical database; Mixture model; Classification; Hypothesis test

Domains

Computer Science [cs]

Main file

453481_1_En_7_Chapter.pdf (271)

Origin : Files produced by the author(s)


https://inria.hal.science/hal-01684357

Submitted on : Monday, January 15, 2018, 2:07:21 PM

Last modification on : Monday, January 15, 2018, 2:11:12 PM

Long-term archiving on : Tuesday, May 8, 2018, 1:42:46 AM

Dates and versions

hal-01684357 , version 1 (15-01-2018)

Licence

Attribution

Identifiers

HAL Id : hal-01684357 , version 1
DOI : 10.1007/978-3-319-61176-1_7

Cite

Xiaosu Tong, Bowei Xi, Murat Kantarcioglu, Ali Inan. Gaussian Mixture Models for Classification and Hypothesis Tests Under Differential Privacy. 31st IFIP Annual Conference on Data and Applications Security and Privacy (DBSEC), Jul 2017, Philadelphia, PA, United States. pp.123-141, ⟨10.1007/978-3-319-61176-1_7⟩. ⟨hal-01684357⟩


Collections

IFIP-LNCS IFIP IFIP-TC IFIP-WG IFIP-TC11 IFIP-WG11-3 IFIP-DBSEC IFIP-LNCS-10359
