Differentially Private and Fair Classification via Calibrated Functional Mechanism

2020·Arxiv

Abstract

Abstract

Machine learning is increasingly becoming a powerful tool to make decisions in a wide variety of applications, such as medical diagnosis and autonomous driving. Privacy concerns related to the training data and unfair behaviors of some decisions with regard to certain attributes (e.g., sex, race) are becoming more critical. Thus, constructing a fair machine learning model while simultaneously providing privacy protection becomes a challenging problem. In this paper, we focus on the design of classification model with fairness and differential privacy guarantees by jointly combining functional mechanism and decision boundary fairness. In order to enforce differential privacy and fairness, we leverage the functional mechanism to add different amounts of Laplace noise regarding different attributes to the polynomial coefficients of the objective function in consideration of fairness constraint. We further propose an utility-enhancement scheme, called relaxed functional mechanism by adding Gaussian noise instead of Laplace noise, hence achieving -differential privacy. Based on the relaxed functional mechanism, we can design -differentially private and fair classification model. Moreover, our theoretical analysis and empirical results demonstrate that our two approaches achieve both fairness and differential privacy while preserving good utility and outperform the state-of-the-art algorithms.

Introduction

In this big data era, machine learning has been becoming a powerful technique for automated and data-driven decision making processes in various domains, such as spam filtering, credit ratings, housing allocation, and so on. However, as the success of machine learning mainly rely on a vast amount of individual data (e.g., financial transactions, tax payments), there are growing concerns about the potential for privacy leakage and unfairness in training and deploying machine learning algorithms (Fredrikson, Jha, and Ristenpart 2015; Datta, Tschantz, and Datta 2015). Thus, the problem of fairness and privacy in machine learning has attracted considerable attention.

Fairness-aware learning has received growing attentions in the machine learning field due to the social inequities and unfair behaviors observed in classification models. For example, a classification model of automated job hiring system is more likely to hire candidates from certain racial or gender groups (Giang 2018; Wachter-Boettcher 2018). Hence, substantial effort has centered on developing algorithmic methods for designing fair classification models and balancing the trade-off between accuracy and fairness, mainly including two groups: pre/post-processing methods (Dwork et al. 2012; Feldman et al. 2015; Hardt et al. 2016) and in-processing methods (Kamishima, Akaho, and Sakuma 2011; Zafar et al. 2017b). Pre/post-processing methods achieve fairness by directly changing values of the sensitive attributes or class labels in the training data. As pointed out in (Zafar et al. 2017b), pre/post-processing methods treat the learning algorithm as a black box, which can result in unpredictable loss of the classification utility. Thus, in-processing methods, which introduce fairness constraints or regularization terms to the objective function to remove the discriminatory effect of classifiers, have been shown a great success.

At the same time, differential privacy (Dwork and Roth 2014) has emerged as the de facto standard for measuring the privacy leakage associated with algorithms on sensitive databases, which has recently received considerable attentions by large-scale corporations such as Google (Erlings- son, Pihur, and Korolova 2014) and Microsoft (Ding, Kulka- rni, and Yekhanin 2017), etc. Generally speaking, differential privacy ensures that there is no statistical difference to the output of a randomized algorithm whether a single individual opts in to, or out of its input. A large class of mechanisms has been proposed to ensure differential privacy. For instance, the Laplace mechanism is employed by introducing random noise drawn from the Laplace distribution to the output of queries such that the adversary will not be able to confirm a single individual is in the input with high confidence (Dwork et al. 2006b). To design private machine learning models, more complicated perturbation mechanisms have been proposed like objective perturbation (Chaudhuri, Monteleoni, and Sarwate 2011) and functional mechanism (Zhang et al. 2012), which inject random noise into the objective function rather than model parameters.

Thus, in this paper, we mainly focus on achieving classi-fication models that simultaneously provide differential privacy and fairness. As pointed out in recent study (Xu, Yuan, and Wu 2019), achieving both requirements efficiently is quite challenging, due to the different aims of differential privacy and fairness. Differential privacy in a classification model focuses on the individual level, i.e., differential privacy guarantees that the model output is independent of whether any individual record presents or absents in the dataset, while fairness in a classification model focuses on the group level, i.e., fairness guarantees that the model predictions of the protected group (such as female group) are same to those of the unprotected group (such as male group). Lots of researches have emerged in achieving both privacy protection and fairness. Specifically, in (Dwork et al. 2012), Dwork et al. gave a new definition of fairness that is an extended definition of differential privacy. In (Hajian et al. 2015), Hajian et al. imposed fairness and k-anonymity via a pattern sanitization method. Moreover, Ekstrand et al. in (Ekstrand, Joshaghani, and Mehrpouyan 2018) put forward a set of questions about whether fairness are compatible with privacy. However, only Xu et al. in (Xu, Yuan, and Wu 2019) studied how to meet the requirements of both differential privacy and fairness in classification models by combining functional mechanism and decision boundary fairness together. Therefore, how to simultaneously meet the requirements of differential privacy and fairness in machine learning algorithms is under exploited.

In this paper, we propose Purely and Approximately Differential private and Fair Classification algorithms, called PDFC and ADFC, respectively, by incorporating functional mechanism and decision boundary covariance, a novel measure of decision boundary fairness. As shown in (Kamiran and Calders 2012), due to the correlation between input features (attributes), the discrimination of classifica-tion still exists even if removing the protected attribute from the dataset before training. Hence, different from (Xu, Yuan, and Wu 2019), which adds same scale of noise in each attribute, in PDFC, we consider a calibrated functional mechanism, i.e., injecting different amounts of Laplace noise regarding different attributes to the polynomial coefficients of the constrained objective function to ensure -differential privacy and reduce effects of discrimination. To further improve the model accuracy, in ADFC, we propose a relaxed functional mechanism by inserting Gaussian noise instead of Laplace noise and leverage it to perturb coefficients of the polynomial representation of the constrained objective function to enforce -differential privacy and fairness. Our salient contributions are listed as follows.

• We propose two approaches PDFC and ADFC to learn a logistic regression model with differential privacy and fairness guarantees by applying functional mechanism to a constrained objective function of logistic regression that decision boundary fairness constraint is treated as a penalty term and added to the original objective function.

• For PDFC, different magnitudes of Laplace noise regarding different attributes are added to the polynomial coef-ficients of the constrained objective function to enforce -differential privacy and fairness.

• For ADFC, we further improve the model accuracy by

proposing the relaxed functional mechanism based on Extended Gaussian mechanism, and leverage it to introduce Gaussian noise with different scales to perturb objective function.

• Using real-world datasets, we show that the performance of PDFC and ADFC significantly outperforms the baseline algorithms while jointly providing differential privacy and fairness.

The rest of paper is organized as follows. We first give the problem statement and background in differential privacy and fairness. Next, we present our two approaches PDFC and ADFC to achieve DP and fair classification. Finally, we give the numerical experiments based on real-world datasets and draw conclusion remarks. Due to the space limit, we leave all the proofs in the supplemental materials.

Problem Statement

This paper considers a training dataset D that includes n tuples . We also denote each tuple where the feature vector contains d attributes, i.e., , and is the corresponding label. Without loss of generality, we assumewhere , and for binary classification tasks. The objective is to construct a binary classification model with model parameters that taken x as input, can output the prediction , by minimizing the empirical loss on the training dataset D over the parameter space w of .

In general, we have the following optimization problem.

where f is the loss function. In this paper, we consider logistic regression as the loss function, i.e., f(D, w) = . Thus, the classification model has the form .

Although there is no need to share the dataset during the training procedure, the risk of information leakage still exists when we release the classification model parameter . For example, the adversary may perform model inversion attack (Fredrikson, Jha, and Ristenpart 2015) over the release model together with some background knowledge about the training dataset to infer sensitive information in the dataset.

Furthermore, if labels in the training dataset are associated with a protected attribute (note that we denote as unprotected attributes), like gender, the classifier may be biased, i.e., , where we assume the protected attribute . According to (Pedreshi, Ruggieri, and Turini 2008), even if the protected attribute is not used to build the classification model, this unfair behavior may happen when the protected attribute is correlated with other unprotected attributes.

Therefore, in this paper, our objective is to learn a binary classification model, which is able to guarantee differential privacy and fairness while preserving good model utility.

Background

In this section, we first introduce some background knowledge of differential privacy, which helps us to build private classification models. Then we present fairness definition, which helps us to enforce classification fairness.

Differential Privacy

Differential privacy is introduced to guarantee that the ability of an adversary to obtain additional information about any individual is independent of whether any individual record presents or absents in the dataset.

Definition 1 (-Differential Privacy). A randomized Mechanism A is enforced by -differential privacy, if for any two neighboring datasets , i.e., differing at most one single data sample, and for any possible output s in the output space of A, it holds that

The privacy parameter controls the strength of the privacy guarantee. A smaller value indicates a stronger privacy protection. Though differential privacy provides very strong guarantee, in some cases it may be too strong to have a good data utility. We then introduce a relaxation, ()-differential privacy, that has been proposed in (Dwork et al. 2006a).

Definition 2 (()-Differential Privacy). A randomized Mechanism A is enforced by ()-differential privacy, if for any two neighboring datasets differing at most one single data item, and for any possible output s in the output space of A, it holds that .

Laplace mechanism (Dwork and Roth 2014) and Extended Gaussian mechanism (Phan et al. 2019) are common techniques for achieving differential privacy, both of which add random noise calibrated to the sensitivity of the query function q.

Theorem 1 (Laplace Mechanism). Given any function q : , the Laplace mechanism defined by

preserves -differential privacy, where are i.i.d. random variables drawn from and -sensitivity of the query q is taken over all neighboring datasets D and .

Theorem 2 (Extended Gaussian Mechanism). Given any function and for any , the Extended Gaussian mechanism defined by

preserves -differential privacy, where are i.i.d drawn from a Gaussian distribution with

of the query q is taken over all neighboring datasets D and .

Functional Mechanism. Functional mechanism, introduced by (Zhang et al. 2012), as an extension of the Laplace mechanism is designed for regression analysis. To preserve -differential privacy, functional mechanism injects differentially private noise into the objective function f(D, w) and then publishs a noisy model parameter derived from minimizing the perturbed objective function rather than the original one. As a result of the objective function being a complex function of w, in functional mechanism, f(D, w) is represented in polynomial forms trough Taylor Expansion. The model parameter w is a vector consisting of several values . We denote as a product of , namely, for some . We also denote as the set of all products of with degree j, i.e., .According to the Stone-Weierstrass Theorem (Rudin and others 1964), any continuous and differentiable function can always be expressed as a polynomial form. Therefore, the objective function f(D, w) can be written as follows

where represents the coefficient of in polynomial.

where or t is an arbitrary tuple.

Classification Fairness The goal of classification fairness is to find a classifier that minimizes the empirical loss while guaranteeing certain fairness requirements. Many fairness definitions have been proposed for in the literature including mistreatment parity (Za- far et al. 2017a), demographic parity (Pedreshi, Ruggieri, and Turini 2008), etc.

Demographic parity, the most widely-used fairness defini-tion in the classification fairness domain, requires the decision made by the classifier is not dependent on the protected attribute z, for instance, sex or race.

Definition 3. (Demographic Parity in a Classifier) Given a classification model and a labeled dataset D, the property of demographic parity in a classifier is defined by where is the protected attribute.

Moreover, demographic parity is quantified in terms of the risk difference (RD) (Pedreschi, Ruggieri, and Turini 2012), i.e., the difference of the positive decision made in between the protected group and unprotected group. Thus, the risk difference produced by a classifier is defined as

One of the in-processing methods, called decision boundary fairness (Zafar et al. 2017b), to ensure classification fairness is to find a model parameter w that minimizes the loss function f(D, w) under a fairness constraint. Thus, the fair classification problem is formulated as follows,

where g(D, w) is a constraint term, and is the threshold. For instance, Zafar et al. (Zafar et al. 2017b) have proposed to adopt the decision boundary covariance to define the fairness constraint, i.e.,

where is decision boundary, is the average of the protected attribute and . For logistic regression classification models, the decision boundary is de-fined by . The decision boundary covariance (4) then reduces to .

Differentially Private and Fair Classiﬁcation

In this section, we first present our approach PDFC to achieve fair logistic regression with -differentially private guarantee. Then we propose a relaxed functional mechanism by injecting Gaussian noise instead of Laplace noise to provide -differential privacy. By leveraging the relaxed functional mechanism, we will show that our second approach ADFC can jointly provide -differential privacy and fairness.

Purely DP and Fair Classification In order to meet the requirements of -differential privacy and fairness, motivated by (Xu, Yuan, and Wu 2019), we consider to combine the functional mechanism and decision boundary fairness. We first consider to transform the constrained optimization problem (3) into unconstrained problem by treating the fairness constraint as a penalty term, where the fairness constraints are shifted to the original objective function f(D, w). Then, we have the new objective function defined as , where we consider as a hyperparameter to optimize the trade-off between model utility and fairness. For convenience of discussion, we set and choose

suitable values to make . Note that our theoretical results still hold if we choose other values of and . By equation (4), we have

To apply functional mechanism, we first write the approximate objective function based on (2) as follows.

where denotes the coefficient of in the polynomial of and .

The attributes involving in the dataset may not be independent from each other, which means some unprotected attributes in x are quite correlated with the protected attribute z. For instance, the protected attribute, like gender, may be correlated with the attribute, marital status. Thus, to reduce the discrimination between the protected attribute z and the labels y, it is important to weaken the correlation between these most correlated attributes and protected attribute z. However, it is often impossible to determine the degree of relation between an unprotected attribute and the protected attribute. Therefore, we randomly select an unprotected attribute and leverage functional mechanism to add noise with large scale to the corresponding polynomial co-efficients of the monomials involving . Interestingly, this approach not only helps to reduce the correlation between attributes and z, but also improve the privacy on attribute to prevent model inversion attacks, as shown in (Wang, Si, and Wu 2015).

The key steps of PDFC are outlined in Algorithm 1. We first set two different privacy budgets, and , for attribute and the rest of attributes . Before injecting noise to the coefficients, all coefficients should be separated into two groups and by considering whether involves in the corresponding monomials (i.e., whether their the coefficients contain attribute ). We then add Laplace noises drawn from and to the co-efficients of and respectively to reconstruct the differentially private objective function , where can be found in Lemma 6. Finally, the differentially private model parameter is obtained by minimizing . Note that also ensures classification fairness due to the objective function involving fairness constraint. Lemma 2. Let D and be any two neighboring datasets differing in at most one tuple. Let and be

the approximate objective function on D and , then we have the following inequality,

The following theorem shows the privacy guarantee of PDFC.

Theorem 3. The output model parameter in PDFC (Algorithm 1) preserves -differential privacy, where .

Approximately DP and Fair Classification

We now focus on using the relaxed version of -differential privacy, i.e., -differential privacy to further improve the utility of differentially private and fair logistic regression. Hence, in order to satisfy -differential privacy, we propose the relaxed functional mechanism by making use of Extended Gaussian mechanism. As shown in Theorem 2, before applying Extended Gaussian mechanism, we first calculate the sensitivity of a query function, i.e., the objective function of logistic regression , given in the following lemma.

Lemma 3 (-Sensitivity of Logistic Regression). For polynomial representations of logistic regression, two f(D, w) and given in Lemma 1, we have the following inequality

where we denote and as the set of polynomial coefficients of f(D, w) and . And we denote or as an ar- bitrary tuple.

We then perturb f(D, w) by injecting Gaussian noise

drawn from ) with log() +

tain the differentially private model parameter by minimizing the noisy function , as shown in Algorithm 2. Finally, we provide a privacy guarantee of proposed relaxed functional mechanism by the following theorem. Theorem 4. The relaxed functional mechanism in Algorithm 2 guarantees -differential privacy.

Our second approach called, ADFC, applies the relaxed functional mechanism into the objective function with decision boundary fairness constraint to enforce -differential privacy and fairness. As shown in Algorithm 3, we first derive the polynomial representation according to (6), and employ random Gaussian noise to perturb the objective function , i.e., injecting Gaussian noise into its polynomial coefficients. Furthermore, we also allocate differential privacy parameters, and for a particular unprotected attribute and the rest of unprotected attributes to improve the privacy on attribute and reduce the correlation between attributes and z. Hence, we add random noise drawn from to polynomial coefficients of . For polynomial coef-ficients in , we inject noise drawn from .

Figure 1: Compare accuracy under different privacy budgets on US)

Figure 2: Compare accuracy under different values of on US.

Lemma 4. Let D and be any two neighboring datasets differing in at most one tuple. Let and be the approximate objective function on D and , then we have the following inequality,

where we denote and as the set of polynomial coefficients of and . And we denote or as an ar- bitrary tuple.

Finally, by minimizing the differentially private objective function , we derive the model parameter , which achieves differential privacy and fairness at the same time. We now show that ADFC satisfies -differential privacy in the following theroem. Theorem 5. The output model parameter in ADFC (Algorithm 3) guarantees -differential privacy, where

Performance Evaluation

Simulation Setup

Data preprocessing We evaluate the performance on two datasets, Adult dataset and US dataset. The Adult dataset from UCI Machine Learning Repository (Dheeru and Karra Taniskidou 2017) contains information about 13 different features (e.g., work-class, education, race, age, sex, and so on) of 48,842 individuals. The label is to predict Algorithm 3 Approximately DP and Fair Classification (ADFC)

1: Input: Dataset D; The objective function f(D, w); The fairness constraint g(D, w); The privacy parameters for unprotected attribute ; The privacy parameters for other unprotected attributes .

10: Put into .

12: end for

22: end for

27: return: and .

whether the annual income of those individuals is above 50K or not. The US dataset is from Integrated Public Use Microdata Series (Center 2018) and consists of 370,000 records of census microdata, which includes features like age, sex, education, family size, etc. The goal is to predict whether the income is over 25K a year. In both datasets, we consider sex as a binary protected attribute.

Baseline algorithms In our experiments, we compare our approaches, PDFC, and ADFC against several baseline algorithms, namely, LR and PFLR*. LR is a logistic regression model. PFLR* (Xu, Yuan, and Wu 2019) is a differentially

Figure 3: Compare accuracy under different privacy budgets on Adult ().

private and fair logistic regression model that injects Laplace noise with shifted mean to the objective function of logistic regression with fairness constraint. Moreover, we compare our relaxed functional mechanism against the original functional mechanism proposed in (Zhang et al. 2012) and NoPrivacy, which is the original functional mechanism without injecting any noise to the polynomial coefficients.

Evaluation The utility of algorithms is measured by Accuracy, defined as follows,

Accuracy = Number of correct predictions Total number of predictions made,

which demonstrates the quality of a classifier. The fairness of classification models is qualified by risk difference (RD)

where z is the protected attribute. We consider a random 80-20 training-testing split and conduct 10 independent runs of algorithms. We then record the mean values and standard deviation values of Accuracy and RD on the testing dataset. For the parameters of differential privacy, we consider , and .

Results and Analysis

In Figure 1, we show the accuracy of each algorithm, functional mechanism, relaxed functional mechanism and NoPrivacy, as a function of the privacy budget with fixed . We can see that the accuracy of No-Privacy remains unchanged for all values of , as it does not provide any differential privacy guarantee. Our relaxed functional mechanism exhibits quite higher accuracy than functional mechanism in high privacy regime, and the accuracy of relaxed functional mechanism is the same as No-Privacy baseline when . Figure 2 studies the accuracy of each algorithm under different values of with fixed . Relaxed functional mechanism incurs lower accuracy when decreases, as a smaller requires a larger scale of noise to be injected in the objective function. But the accuracy of functional mechanism remains considerably lower than relaxed functional mechanism in all cases.

Figure 3(a) studies the accuracy comparison among PFLR*, LR, PDFC and ADFC on Adult dataset with the particular unprotected attribute denoted by marital status. We can observe that ADFC continuously achieves better accuracy than PFLR* in all privacy regime, and PDFC only outperforms PFLR* when is small. We also evaluate the effect of choosing different attributes as by performing experiments on Adult dataset. As shown in Figure 3(b) and Figure 3(c), choosing different attributes, marital status, age, relation and race, has different effects on the accuracy of PDFC and ADFC. However, PDFC and ADFC still outperform PFLR* under varying values of . As expected, as the value of increases, the accuracy of each algorithm becomes higher in above three figures.

Table 2 shows how different privacy budgets affect the risk difference of LR, PFLR*, PDFC and ADFC on two datasets. Note that we consider the attribute as race on Adult dataset, and work on US dataset. It is clear that PDFC and ADFC produce less risk difference compared to PFLR* in most cases of . The key reason is that adding different amounts of noise regarding different attributes indeed reduces the correlation between unprotected attributes and protected attributes.

Table 1: Risk difference with different privacy budgets on US dataset ().

Conclusion

In this paper, we have introduced two approaches, PDFC and ADFC, to address the discrimination and privacy concerns in logistic regression classification. Different from existing techniques, in both approaches, we consider leveraging functional mechanism to the objective function with decision boundary fairness constraints, and adding noise with different magnitudes into the coefficients of different attributes to further reduce the discrimination and improve the

Table 2: Risk difference with different privacy budgets on two datasets ().

privacy protection. Moreover, for ADFC, we utilize the proposed relaxed functional mechanism that is built upon Extended Gaussian mechanism, to further improve the model accuracy. By performing extensive empirical comparisons with state-of-the-art methods for differentially private and fair classification, we demonstrated the effectiveness of proposed approaches.

Acknowledgments

The work of J. Ding, X. Zhang, and M. Pan was supported in part by the U.S. National Science Foundation under grants US CNS-1350230 (CAREER), CNS-1646607, CNS-1702850, and CNS-1801925. The work of X. Li was supported in part by the Programs of NSFC under Grant 61762030, in part by the Guangxi Natural Science Foundation under Grant 2018GXNSFDA281013, and in part by the Key Science and Technology Project of Guangxi under Grant AA18242021.

References

[Center 2018] Center, M. P. 2018. Integrated public use mi- crodata series, international: Version 7.0.

[Chaudhuri, Monteleoni, and Sarwate 2011] Chaudhuri, K.; Monteleoni, C.; and Sarwate, A. D. 2011. Differentially private empirical risk minimization. Journal of Machine Learning Research 12:1069–1109.

[Datta, Tschantz, and Datta 2015] Datta, A.; Tschantz, M. C.; and Datta, A. 2015. Automated experiments on ad privacy settings. Proceedings on privacy enhancing technologies 2015(1):92–112.

[Dheeru and Karra Taniskidou 2017] Dheeru, D., and Karra Taniskidou, E. 2017. UCI machine learning repository.

[Ding, Kulkarni, and Yekhanin 2017] Ding, B.; Kulkarni, J.; and Yekhanin, S. 2017. Collecting telemetry data privately. In Advances in Neural Information Processing Systems.

[Dwork and Roth 2014] Dwork, C., and Roth, A. 2014. The algorithmic foundations of differential privacy. Foundations and Trends in Theoretical Computer Science 9(3–4):211– 407.

[Dwork et al. 2006a] Dwork, C.; Kenthapadi, K.; McSherry, F.; Mironov, I.; and Naor, M. 2006a. Our data, ourselves: Privacy via distributed noise generation. In Annual International Conference on the Theory and Applications of Cryptographic Techniques.

[Dwork et al. 2006b] Dwork, C.; McSherry, F.; Nissim, K.; and Smith, A. 2006b. Calibrating noise to sensitivity in private data analysis. In Theory of cryptography conference.

[Dwork et al. 2012] Dwork, C.; Hardt, M.; Pitassi, T.; Rein- gold, O.; and Zemel, R. 2012. Fairness through awareness. In Proceedings of the 3rd innovations in theoretical computer science conference.

[Ekstrand, Joshaghani, and Mehrpouyan 2018] Ekstrand, M. D.; Joshaghani, R.; and Mehrpouyan, H. 2018. Privacy for all: Ensuring fair and equitable privacy protections. In Conference on Fairness, Accountability and Transparency.

[Erlingsson, Pihur, and Korolova 2014] Erlingsson, ´U.; Pihur, V.; and Korolova, A. 2014. Rappor: Randomized aggregatable privacy-preserving ordinal response. In 21st ACM Conference on Computer and Communications Security.

[Feldman et al. 2015] Feldman, M.; Friedler, S. A.; Moeller, J.; Scheidegger, C.; and Venkatasubramanian, S. 2015. Certifying and removing disparate impact. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.

[Fredrikson, Jha, and Ristenpart 2015] Fredrikson, M.; Jha, S.; and Ristenpart, T. 2015. Model inversion attacks that exploit confidence information and basic countermeasures. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security.

[Giang 2018] Giang, V. 2018. The potential hidden bias in automated hiring systems.

[Hajian et al. 2015] Hajian, S.; Domingo-Ferrer, J.; Mon- reale, A.; Pedreschi, D.; and Giannotti, F. 2015. Discrimination-and privacy-aware patterns. Data Mining and Knowledge Discovery 29(6):1733–1782.

[Hardt et al. 2016] Hardt, M.; Price, E.; Srebro, N.; et al. 2016. Equality of opportunity in supervised learning. In Advances in neural information processing systems.

[Kamiran and Calders 2012] Kamiran, F., and Calders, T. 2012. Data preprocessing techniques for classification without discrimination. Knowledge and Information Systems 33(1):1–33.

[Kamishima, Akaho, and Sakuma 2011] Kamishima, T.; Akaho, S.; and Sakuma, J. 2011. Fairness-aware learning through regularization approach. In 2011 IEEE 11th International Conference on Data Mining Workshops.

[Pedreschi, Ruggieri, and Turini 2012] Pedreschi, D.; Rug- gieri, S.; and Turini, F. 2012. A study of top-k measures for discrimination discovery. In The 27th Annual ACM Symposium on Applied Computing.

[Pedreshi, Ruggieri, and Turini 2008] Pedreshi, D.; Ruggieri, S.; and Turini, F. 2008. Discrimination-aware data mining. In 14th ACM SIGKDD international conference on Knowledge discovery and data mining.

[Phan et al. 2019] Phan, N.; Vu, M. N.; Liu, Y.; Jin, R.; Dou, D.; Wu, X.; and Thai, M. T. 2019. Heterogeneous gaussian mechanism: Preserving differential privacy in deep learning with provable robustness. arXiv preprint arXiv:1906.01444.

[Rudin and others 1964] Rudin, W., et al. 1964. Principles of mathematical analysis. McGraw-hill New York.

[Wachter-Boettcher 2018] Wachter-Boettcher, S. 2018. Ai recruiting tools do not eliminate bias.

[Wang, Si, and Wu 2015] Wang, Y.; Si, C.; and Wu, X. 2015. Regression model fitting under differential privacy and model inversion attack. In 24th International Joint Conference on Artificial Intelligence.

[Xu, Yuan, and Wu 2019] Xu, D.; Yuan, S.; and Wu, X. 2019. Achieving differential privacy and fairness in logistic regression. In Companion Proceedings of The 2019 World Wide Web Conference.

[Zafar et al. 2017a] Zafar, M. B.; Valera, I.; Gomez Ro- driguez, M.; and Gummadi, K. P. 2017a. Fairness beyond disparate treatment & disparate impact: Learning classifica-tion without disparate mistreatment. In The 26th International Conference on World Wide Web.

[Zafar et al. 2017b] Zafar, M. B.; Valera, I.; Rodriguez, M. G.; and Gummadi, K. P. 2017b. Fairness constraints: Mechanisms for fair classification. In The 20th International Conference on Artificial Intelligence and Statistics.

[Zhang et al. 2012] Zhang, J.; Zhang, Z.; Xiao, X.; Yang, Y.; and Winslett, M. 2012. Functional mechanism: regression analysis under differential privacy. Proceedings of the VLDB Endowment 5(11):1364–1375. Lemma 6. Let D and be any two neighboring datasets differing in at most one tuple. Let and be the approximate objective function on D and , then we have the following inequality,

Proof. Assume that D and differ in the last tuple and . We have that ∆

4 + 3d where represents e-th element in feature vector x. Theorem 8. The output model parameter in PDFC (Algorithm 1) guarantees -differential privacy, where d. Proof. We assume there are two neighboring datasets D and that differ in the last tuple and . As shown in the Al- gorithm 1, all polynomial coefficients are divided into two

subsets and in view of whether they include sensitive attribute or not. After adding Laplace noise, we have

Then, the following inequality holds Pr

1)/d) In the last second equality, we directly adopt the result in (Wang, Si, and Wu 2015). Lemma 7 (-Sensitivity of Logistic Regression). For polynomial representations of logistic regression, two f(D, w) and given in Lemma 1, we have the following inequality

where we denote and as the set of polynomial coefficients of f(D, w) and . And we denote or as an ar- bitrary tuple. Proof. Assume that D and differ in the last tuple and . For logistic regression, we have

Denote and as the set of polynomial coefficients of f(D, w) and , and

where t is an arbitrary tuple. Theorem 9. The relaxed functional mechanism in Algorithm 2 guarantees -differential privacy. Proof. Assume that the neighboring datasets D and differ in the last tuple and .

where and .

We know the fact that the distribution of a spherically symmetric normal is not dependent of the orthogonal basis where its constituent normals are drawn. Thus, we work in a basis aligned with B. Fix such a basis and draw A by first drawing signed lengths for , then let and . Without loss of generality, we assume that is parallel to B. Based on the triangle with the base and the edge orthogonal to B, apparently, we have . Since , we have

. When , the privacy loss is bounded by , i.e., . Next, we need to prove that the privacy loss is bounded by with probability at least , which requires . Now we use the tail bound of , we have

Based on the proof above, we know that the privacy lossis bounded by with probability at least , which represents the the computation of satisfies -differential privacy. Therefore, Algorithm 2 satisfies -differential privacy. Lemma 8. Let D and be any two neighboring datasets differing in at most one tuple. Let and be the approximate objective function on D and , then we have the following inequality,

where we denote and as the set of polynomial coefficients of and . And we denote or as an ar- bitrary tuple. Proof. Assume that D and differ in the last tuple and . For objective , we have

Denote and as the set of polynomial coefficients of and , and

where t is an arbitrary tuple. Theorem 10. The output model parameter in ADFC (Algorithm 3) guarantees -differential privacy, where and .

Proof. Assume that the neighboring datasets D and differ in the last tuple and .

where we let , , and .Now we will use the fact that the distribution of a spherically symmetric normal is not dependent of the orthogonal basis where its constituent normals are drawn. Thus, we work in two basis aligned with and separately. Fix the basis of and draw by first drawing signed lengths for , then let and . Fix the basis of and draw by first drawing signed lengths for , then let and .

Without loss of generality, we assume that is parallel to and is parallel to . Based on the triangle with the base and the edge orthogonal to , we have . Since , we have

gle with the base and the edge or- thogonal to , we have . Since , we have

When set || ≤ − 1) and || ≤ , we have

d

. Thus, we have the pri- vacy loss

≥ Now we will give the upper bound of by using the tail bound of . Hence, we have . By letting in the above inequality, we have

is very small, we have . Thus, we can prove that . In the same way, we can prove . Therefore, if we let , we have Pr

which proves that the computation of satisfies -differential privacy. Apparently, the final result also satisfies -differential privacy.