Algorithmic fairness · Recommender systems

Robustness in Fairness against Edge-level Perturbations in GNN-based Recommendation

Edge-level perturbations impact the robustness and fairness of graph-based recommender systems, revealing significant vulnerabilities and the need for more resilient design approaches.

In our paper, which will be presented at the ECIR 2024 conference, we delve into the robustness of graph-based recommendation systems against edge-level perturbations. This work is a collaborative effort with Francesco Fabbri, Gianni Fenu, Mirko Marras, and Giacomo Medda.

Methodology

We learn user preferences from past interactions, modeled as an undirected bipartite user-item graph. The adjacency matrix of this graph is fed to a Graph Neural Network (GNN), which predicts missing links, i.e., the items to recommend to each user.
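
As a minimal illustration (not the paper's code), the bipartite interaction graph can be encoded as a sparse symmetric adjacency matrix; the function name and the (user, item) pair format below are assumptions:

```python
import numpy as np
import scipy.sparse as sp

def build_bipartite_adjacency(interactions, n_users, n_items):
    """Sketch: symmetric adjacency of an undirected user-item bipartite graph.

    `interactions` is assumed to be an iterable of (user_idx, item_idx) pairs;
    users occupy node indices [0, n_users), items [n_users, n_users + n_items).
    """
    users, items = zip(*interactions)
    # User-item interaction matrix R of shape (n_users, n_items).
    R = sp.coo_matrix((np.ones(len(users)), (users, items)),
                      shape=(n_users, n_items))
    # Bipartite block structure A = [[0, R], [R^T, 0]].
    return sp.bmat([[None, R], [R.T, None]], format="csr")
```

GNN recommenders such as LightGCN and NGCF propagate messages over an adjacency of this block form, typically after degree normalization.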

The perturbation task alters the adjacency matrix to probe the robustness of the system. We define a model as (γ, ε)-robust if the disparity between its performance on the original and on the perturbed data remains small, where performance is measured by a fairness metric.
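
One plausible way to write this down (the precise formalization is given in the paper; here γ is read as a bound on the number of perturbed edges and ε as the tolerated gap in the fairness metric M):

```latex
% Hedged sketch of (gamma, epsilon)-robustness: f is the GNN recommender,
% A the original adjacency, \tilde{A} a perturbed one, M(.) a fairness metric.
\[
  f \text{ is } (\gamma, \varepsilon)\text{-robust} \iff
  \bigl| M(f, A) - M(f, \tilde{A}) \bigr| \le \varepsilon
  \quad \text{for every } \tilde{A} \text{ with } \lVert \tilde{A} - A \rVert_0 \le \gamma.
\]
```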

Graph perturbation mechanism. We employ a sparsification method that produces binary perturbation tensors modifying the adjacency matrix, with the goal of subtly disrupting the predicted recommendation lists and observing how the fairness metrics change. The perturbed edges are optimized iteratively under an objective function that maximizes the fairness disparity.
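
A condensed, PyTorch-flavored sketch of such an iterative loop, assuming a differentiable disparity loss and a sigmoid relaxation of the binary perturbation vector (the helper `apply_perturbation` and all other names are illustrative, not the paper's implementation):

```python
import torch

def perturb_for_disparity(model, adj, candidate_edges, disparity_fn,
                          epochs=200, lr=0.1):
    """Illustrative gradient-driven edge-perturbation loop.

    `candidate_edges` indexes the edges that may be flipped; `disparity_fn`
    is assumed to return a differentiable estimate of the fairness gap Δ.
    """
    # Continuous relaxation of the binary perturbation vector.
    p_hat = torch.zeros(len(candidate_edges), requires_grad=True)
    optimizer = torch.optim.Adam([p_hat], lr=lr)

    for _ in range(epochs):
        mask = torch.sigmoid(p_hat)                                 # values in (0, 1)
        adj_pert = apply_perturbation(adj, candidate_edges, mask)   # hypothetical helper
        scores = model(adj_pert)                                    # recommendation scores
        loss = -disparity_fn(scores)                                # maximize the disparity Δ
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    # Sparsification: threshold the relaxation back to a binary perturbation.
    return (torch.sigmoid(p_hat) > 0.5).detach()
```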

Fairness notion and operationalization. Our study adopts the concept of demographic parity, which requires equal recommendation utility across different consumer and provider groups. We operationalize demographic parity in the following ways (a computational sketch of one operationalization follows the list):

  • Consumer Preference (CP): Measured by the disparity in rank-aware top-k recommendation utility, approximated using a differentiable function.
  • Consumer Satisfaction (CS): A rank-agnostic approach measuring consumer fairness, optimized through binary classification techniques.
  • Provider Exposure (PE): Estimates provider fairness based on the disparity in exposure across provider groups.
  • Provider Visibility (PV): Similar to PE but focuses on the disparity in visibility across provider groups.

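As an example, a rough sketch of the provider exposure (PE) gap using a log-discounted exposure model is shown below; the exact estimators used in the paper may differ:

```python
import numpy as np

def provider_exposure_disparity(top_k_items, item_group, k):
    """Hedged sketch of a demographic-parity gap on the provider side (PE).

    `top_k_items` has shape (n_users, k) and holds recommended item indices;
    `item_group` is an array mapping each item index to a provider group (0/1).
    Exposure is approximated with a logarithmic position discount.
    """
    discounts = 1.0 / np.log2(np.arange(2, k + 2))   # positions 1..k
    groups = item_group[top_k_items]                 # (n_users, k) group labels
    exposure_g1 = (discounts * (groups == 1)).sum()
    exposure_g0 = (discounts * (groups == 0)).sum()
    # Demographic parity asks the two exposure shares to be equal; Δ is their gap.
    return abs(exposure_g1 - exposure_g0) / (exposure_g1 + exposure_g0)
```

The consumer-side counterparts (CP, CS) follow the same pattern, but aggregate a utility score per consumer group instead of an exposure score per provider group.
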
Experimental Evaluation

The evaluation protocol runs the perturbation process for up to 200 epochs, focusing on the impact of edge perturbations on robustness in fairness. The process stops early if the fairness disparity metric (Δ) does not change meaningfully over 15 epochs. The tested models are GCMC, LightGCN (LGCN), and NGCF, as implemented in RecBole, selected for their distinct recommendation mechanisms and responses to perturbation. The datasets are MovieLens-1M, LFM-1K, and Insurance, which differ in domain and in the number of user-item interactions, providing diverse experimental conditions.
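
A schematic view of that early-stopping loop, assuming `perturbation_step` and `compute_delta` are provided by the surrounding pipeline (the function names and the tolerance value are illustrative):

```python
def run_perturbation(perturbation_step, compute_delta,
                     max_epochs=200, patience=15, tol=1e-4):
    """Sketch of the evaluation protocol with early stopping on Δ."""
    history, prev_delta, stale = [], None, 0
    for epoch in range(max_epochs):
        perturbation_step()       # one epoch of edge perturbation
        delta = compute_delta()   # current fairness disparity Δ
        history.append(delta)
        # Stop once Δ has not changed meaningfully for `patience` epochs.
        if prev_delta is not None and abs(delta - prev_delta) < tol:
            stale += 1
            if stale >= patience:
                break
        else:
            stale = 0
        prev_delta = delta
    return history
```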

Impact on Robustness in Fairness (RQ1). The experiments measure how edge perturbations affect the robustness in fairness under the different fairness operationalizations. The findings indicate that the fairness disparity can be significantly increased or decreased, depending on the perturbation type and model. In general, edge deletions (Del) are more effective at disrupting fairness, especially on the consumer side. The results show a nuanced response across models and datasets, reflecting how difficult it is to predict and control the impact of perturbations in these systems.

Robustness in Fairness under Incremental Perturbations (RQ2). We delve deeper into how gradual increments in edge perturbations influence the robustness in fairness over the entire perturbation process. The study presents a dynamic view of fairness disruption, showing that the impact of perturbations can vary significantly across different iterations and settings. The robustness varies across models, with some showing more resilience to incremental perturbations than others, and the extent of the impact also depends on the specific fairness operationalization employed.

Edge Perturbations Influence on Groups (RQ3). We explore how the different consumer and provider groups are affected by edge perturbations, focusing on whether the advantaged or the disadvantaged group is more impacted. Our findings indicate that successful attacks often target the advantaged group to maximize unfairness. However, the disparity in edge perturbations does not always translate into a significant impact on fairness levels, especially on the provider side, because of the degree of unfairness already present before the perturbation.

Concluding remarks

We emphasized the critical importance of addressing robustness in fairness within graph-based recommender systems. Our findings highlight the significant impact of edge-level perturbations on the fairness and overall stability of these systems, underscoring the urgent need for more robust and fair recommendations. We believe that future research should include exploring a wider array of models, testing in various settings, and considering multi-labeled attributes. These efforts are essential to enhance the robustness and fairness of recommender systems, ensuring they serve all users equitably and maintain integrity even in the face of adversarial manipulations.