Including fairness in recommendation systems by decorrelation with sensitive variables: analysis of its impact on the factorization error

(2022)

Files

Bodehou_08911501_2021.pdf
  • Open access
  • Adobe PDF
  • 1.95 MB

Details

Supervisors
Faculty
Degree label
Abstract
Recommendation systems are filtering algorithms developed to predict the preference of a user for a given item. These systems can be divided into two groups. The first, collaborative filtering, assumes that people who agreed in the past will also agree in the future. The second, content-based filtering, relies on describing each item by a set of characteristics. This thesis focuses on the first approach, collaborative filtering, because this class of methods has the advantage of not requiring an understanding of the items' characteristics. However, although collaborative filtering has proven able to provide good performance in terms of prediction error, these algorithms tend to reproduce or amplify in their predictions any correlations or discrimination (with respect to some sensitive variables) present in the training data set. In certain circumstances, this characteristic may raise ethical issues. It is therefore preferable in some cases to make the predictions independent of some sensitive variables so as to introduce more fairness into the recommendations. This additional constraint may or may not come at the expense of a worse prediction error compared with the solution that retains the discrimination. The goals of this thesis are twofold. First, two main collaborative filtering algorithms (a matrix factorization based method and the K-nearest neighbors algorithm) are reviewed and compared with each other. Second, a fairness constraint is introduced, and its impact on the prediction error is closely analyzed. The ability of the developed algorithm to reduce possible discrimination is demonstrated through discrimination measures, namely the covariance and the correlation.
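
To make the two discrimination measures named in the abstract concrete, the following is a minimal illustrative sketch, not the thesis's algorithm. It assumes a dense matrix of predicted ratings (users by items) and one numeric sensitive value per user, with at least two distinct values. The function names `discrimination_measures` and `decorrelate` are hypothetical, and the decorrelation step is a plain linear projection swapped in for illustration, whereas the thesis introduces the constraint into the factorization itself.

```python
import numpy as np

def discrimination_measures(predictions, sensitive):
    """Covariance and Pearson correlation between each user's mean
    predicted rating and that user's sensitive variable."""
    predictions = np.asarray(predictions, dtype=float)
    sensitive = np.asarray(sensitive, dtype=float)
    user_scores = predictions.mean(axis=1)          # one score per user
    p_centered = user_scores - user_scores.mean()
    s_centered = sensitive - sensitive.mean()
    covariance = float(np.mean(p_centered * s_centered))
    denom = user_scores.std() * sensitive.std()
    correlation = covariance / denom if denom > 0 else 0.0
    return covariance, correlation

def decorrelate(predictions, sensitive):
    """Assumed post-processing step: remove from every item column the
    component that is linearly explained by the sensitive variable.
    Requires the sensitive variable to take at least two distinct values."""
    predictions = np.asarray(predictions, dtype=float)
    sensitive = np.asarray(sensitive, dtype=float)
    s_centered = sensitive - sensitive.mean()
    # Least-squares slope of each item column on the centered variable.
    coeffs = s_centered @ predictions / (s_centered @ s_centered)
    return predictions - np.outer(s_centered, coeffs)

# Toy data (invented for illustration): group 1 receives higher predictions.
P = np.array([[5.0, 4.0], [4.0, 5.0], [2.0, 1.0], [1.0, 2.0]])
s = np.array([1.0, 1.0, 0.0, 0.0])

cov_before, corr_before = discrimination_measures(P, s)
P_fair = decorrelate(P, s)
cov_after, _ = discrimination_measures(P_fair, s)
rmse_shift = float(np.sqrt(np.mean((P_fair - P) ** 2)))
```

In this toy case the covariance drops to zero after the projection, while `rmse_shift` gives a rough sense of how far the predictions moved, which is the kind of trade-off between fairness and prediction error that the abstract describes.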