
Practical perspectives of consumer fairness in recommendation

Mitigating consumer unfairness aims to make recommendations equitably effective across the different demographic groups of users. Mitigation approaches can be analyzed from multiple technical perspectives, and state-of-the-art mitigation strategies offer different properties.

In a study published in the journal Information Processing and Management (Elsevier) and conducted with Gianni Fenu, Mirko Marras, and Giacomo Medda, we investigated the properties against which a mitigation procedure for consumer unfairness should be evaluated, to provide a more holistic view of its effectiveness. We first identified eight technical properties and evaluated the extent to which existing mitigation procedures against consumer unfairness met them, qualitatively and (when possible) quantitatively, on two public data sets. We then outlined the main trends and open issues that emerged from our multi-dimensional analysis and provided key practical recommendations for future research.

The source code accompanying this paper is available at https://github.com/jackmedda/Perspective-C-Fairness-RecSys.

This study extends our reproducibility effort in this area, originally published in our ECIR 2022 paper. We consider the same set of approaches and define properties that allow us to analyze these mitigation strategies.

Mitigation procedures benchmark

We designed these properties based on the practices adopted in the current academic literature on mitigation procedures for recommender systems (when possible) or, for completeness, on those adopted for general machine-learning models.

Property 1. Applicability indicates the extent to which a mitigation procedure can be technically run on a wide range of different recommendation models without requiring any substantial change to the fundamental steps it is based on.

Property 2. Coherence indicates the extent to which a mitigation procedure tends to reduce the biased outcomes for the originally disadvantaged group, without reversing the disparate outcome towards the other group(s).

Property 3. Consistency indicates the ability of a mitigation procedure to substantially reduce the model’s unfairness according to the pursued fairness notion, given any data set and any consumer grouping method.

Property 4. Data robustness indicates the ability of a mitigation procedure to reduce unfairness also in challenging cases related to data distribution (e.g., imbalances) and relationships between unfairness and other features.

Property 5. Reproducibility indicates whether the original source code that implements a mitigation procedure can be executed, either under the same evaluation protocol used in the original paper or under a different one.

Property 6. Scalability indicates the ability of a mitigation procedure to scale well when the number of interactions, users, items, sensitive attributes, and other relevant features increases considerably.

Property 7. Trade-off management indicates the ability of a mitigation procedure to preserve the performance estimate achieved by the target recommendation model originally (before the mitigation was applied).

Property 8. Transferability indicates the ability of a mitigation procedure to be effective (and not only applicable) on a wide range of recommendation models, even those it was not originally designed for or tested on.
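Two of these properties lend themselves to a small numerical illustration. The Python sketch below is not taken from the paper or its repository. It assumes a per-user utility score (for instance, NDCG) measured before and after mitigation, and exactly two demographic groups; the function names group_gap, is_coherent, and utility_retained, as well as the toy numbers, are hypothetical. It shows one possible way to check coherence (Property 2) as a gap that shrinks without changing sign, and trade-off management (Property 7) as the share of overall utility that is retained.

```python
from statistics import mean


def group_gap(scores, groups, advantaged, disadvantaged):
    """Signed gap: mean utility of the advantaged group minus the disadvantaged one."""
    adv = [s for s, g in zip(scores, groups) if g == advantaged]
    dis = [s for s, g in zip(scores, groups) if g == disadvantaged]
    return mean(adv) - mean(dis)


def is_coherent(gap_before, gap_after):
    """Illustrative coherence check (an assumption, not the paper's definition):
    the gap must shrink in magnitude and must not flip towards the other group."""
    shrinks = abs(gap_after) < abs(gap_before)
    same_direction = gap_before * gap_after >= 0
    return shrinks and same_direction


def utility_retained(scores_before, scores_after):
    """Ratio of overall mean utility after mitigation to the one before it."""
    return mean(scores_after) / mean(scores_before)


# Toy data: four users, two groups, utility before and after mitigation.
groups = ["A", "A", "B", "B"]
before = [0.40, 0.36, 0.30, 0.26]
after = [0.37, 0.35, 0.34, 0.32]

gap_before = group_gap(before, groups, advantaged="A", disadvantaged="B")
gap_after = group_gap(after, groups, advantaged="A", disadvantaged="B")

print("gap before:", round(gap_before, 3))
print("gap after:", round(gap_after, 3))
print("coherent:", is_coherent(gap_before, gap_after))
print("utility retained:", round(utility_retained(before, after), 3))
```

A signed gap is used on purpose: an absolute difference would hide the case in which the mitigation overshoots and the originally advantaged group becomes the disadvantaged one.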

While our paper contains a detailed analysis of each consumer-fairness mitigation approach from these perspectives, here we provide a summary of our results.

For each property and mitigation procedure, we assigned one of the two following labels: Higher when the corresponding work was better than the others on average for the selected property, Lower otherwise. A blank entry was left for studies that could not be evaluated in terms of the corresponding property. For papers whose source code was not available, the corresponding mitigation procedure was only analyzed in terms of applicability and scalability.
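The labeling rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each procedure is assumed to have a single numeric score per property where higher means better, which does not cover the properties we assessed qualitatively, and the function name label_property and the procedure names and scores are hypothetical.

```python
from statistics import mean


def label_property(scores):
    """Assign 'Higher' / 'Lower' relative to the average of the evaluated procedures.

    scores maps a procedure name to a numeric score, or to None when the
    procedure could not be evaluated on this property (left blank).
    """
    evaluated = [v for v in scores.values() if v is not None]
    if not evaluated:
        # Nothing could be evaluated: every entry stays blank.
        return {name: "" for name in scores}
    average = mean(evaluated)
    labels = {}
    for name, value in scores.items():
        if value is None:
            labels[name] = ""
        elif value > average:
            labels[name] = "Higher"
        else:
            labels[name] = "Lower"
    return labels


# Hypothetical scores for one property; procedure_C could not be evaluated.
example = {
    "procedure_A": 0.8,
    "procedure_B": 0.5,
    "procedure_C": None,
    "procedure_D": 0.6,
}
print(label_property(example))
```

Procedures that could not be evaluated are excluded from the average, so a blank entry does not shift the threshold for the others.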

The first two rows refer to the works that reported the highest number of above-average properties. These two studies performed better on most of the properties, e.g., on trade-off management and data robustness. Burke et al.’s mitigation procedure did not perform well on several properties, according to our taxonomy and framework.

Our findings in this work are expected to serve as a guideline for researchers working on mitigating consumer unfairness. However, some properties could not always be evaluated. For instance, evaluating transferability would require extensive changes to the original recommender system for some mitigation procedures. Similarly, coherence could be affected by the data split performed or by the fairness metric considered. Furthermore, data robustness would be better suited to mitigation procedures that consider causal relations with unfairness and use sensitive attributes only for assessment. Future work should therefore acknowledge these circumstances.