The effect of homophily on disparate visibility of minorities in people recommender systems

Demographics and homophily (the tendency of users to connect with others who are similar to them) are the main drivers behind people recommendation in social networks, and they affect the visibility users receive when they are recommended. The impact falls mainly on users who belong to minority groups, who are less likely to be recommended unless their group is highly homophilic.

In a recent ICWSM 2020 paper with Francesco Fabbri, Francesco Bonchi, and Carlos Castillo, we assessed whether disparate visibility occurs in people recommender systems for social networks. We split users into groups based on their sensitive attributes (gender and age). A group experiences disparate visibility if its users are recommended more or less often than their representation in the data would suggest. In other words, we expect the users in a group to be recommended in proportion to their share of the data.
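
The definition above can be made concrete with a small sketch. This post does not spell out the exact formula behind Δ(V), so the relative gap used here (a group's share of recommendations minus its share of users, divided by its share of users) is an assumption for illustration, and the function name and toy data are hypothetical.

```python
from collections import Counter


def disparate_visibility(recommended, group_of):
    """Relative gap between a group's share of recommendations and its share of users.

    recommended: iterable of user ids, one entry per time a user is recommended.
    group_of: dict mapping every user id to a group label.
    NOTE: this formula is an illustrative assumption, not necessarily the paper's.
    """
    population = Counter(group_of.values())
    visibility = Counter(group_of[u] for u in recommended)
    n_users = sum(population.values())
    n_recs = sum(visibility.values())
    result = {}
    for group, size in population.items():
        pop_share = size / n_users
        vis_share = visibility[group] / n_recs
        # Zero means the group is recommended exactly in proportion to its size.
        result[group] = (vis_share - pop_share) / pop_share
    return result


# Made-up toy data: one minority user out of four receives half of the recommendations.
group_of = {"a": "minority", "b": "majority", "c": "majority", "d": "majority"}
recommended = ["a", "a", "b", "c"]
dv = disparate_visibility(recommended, group_of)
```

Under this assumed definition, a positive value means the group is over-exposed relative to its size and a negative value means it is under-exposed.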

To assess these phenomena, we considered data from two real-world social networks (TUENTI and POKEC) and analyzed three state-of-the-art people recommender systems (Adamic-Adar, SALSA, and ALS), plus a random baseline. We measured the visibility given to the different groups and examined whether, and to what extent, disparities occurred for users who belonged to different demographic groups.
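
To give a feel for how such a pipeline fits together, here is a minimal sketch of one of the three recommenders, Adamic-Adar, in its textbook undirected form, followed by a count of how often each node is recommended. The exact variant and settings used in the paper are not described in this post, so the function names, the top-k cut-off, and the toy graph are assumptions for illustration.

```python
import math
from collections import Counter


def adamic_adar_top_k(adj, k):
    """Recommend, for every node, the k non-neighbours with the highest Adamic-Adar score.

    adj: dict mapping each node to the set of its neighbours (undirected graph).
    The score of a pair (u, v) sums 1 / log(degree(z)) over their common neighbours z.
    """
    recs = {}
    for u in adj:
        scores = Counter()
        for z in adj[u]:
            if len(adj[z]) < 2:
                continue  # z only connects to u, so it cannot suggest anyone else
            weight = 1.0 / math.log(len(adj[z]))
            for v in adj[z]:
                if v != u and v not in adj[u]:
                    scores[v] += weight
        recs[u] = [v for v, _ in scores.most_common(k)]
    return recs


def visibility(recs):
    """Count how many times each node appears in somebody's recommendation list."""
    return Counter(v for rec_list in recs.values() for v in rec_list)


# Made-up toy graph: a 4-cycle a - b - d - c - a.
adj = {"a": {"b", "c"}, "b": {"a", "d"}, "c": {"a", "d"}, "d": {"b", "c"}}
recs = adamic_adar_top_k(adj, k=1)
vis = visibility(recs)
```

Summing these per-node counts over the members of each demographic group gives the group-level visibility that is then compared with the group's size.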

Remark. In our work, we also performed an extensive analysis on synthetic data, not reported in this blog post, but available in the original paper.

Disparate visibility assessment

Table 1 presents the disparate visibility produced by each recommender on each graph, where s_m denotes the relative size of the minority group and h_m the homophily of the same group. In graphs with homophilic minorities (the first two rows), the disparate visibility is in favor of the minority class (positive Δ(V)). When the minority is not homophilic (the last two rows), the disparate visibility is in favor of the majority class (negative Δ(V)).

Network	s_m	h_m	ALS	SLS	ADA	RND
TUENTI-A16	0.3	0.42	0.517	0.264	0.134	0.149
POKEC-A21	0.46	0.34	0.900	0.571	0.328	0.310
TUENTI-A30	0.04	0.08	-0.276	-0.350	-0.359	-0.333
TUENTI-G	0.39	0.02	-0.264	-0.291	-0.212	-0.149

Table 1. Disparate visibility, Δ(V), introduced by the different recommendation models (ALS; SLS = SALSA; ADA = Adamic-Adar; RND = random baseline).

Rich-get-richer effect

Lorenz Curves depicting inequality. Dashed lines represent recommendations, solid lines represent in-degree. The minority is in red, the majority in blue. Recommendations introduce more inequality than the degree distribution, and this inequality is stronger in the minority class.

To study the impact of recommenders on individual nodes, we contrast each node's in-degree in the original data with the visibility that the node receives from a recommendation model. We depict this comparison with Lorenz curves, a popular graphical tool that shows the cumulative distribution of a variable across a population and emphasizes how far it departs from a hypothetical perfectly equal distribution. Specifically, the figure reports the results of the SALSA algorithm on the TUENTI dataset with the gender attribute (see the original paper for more algorithms, datasets, and demographic groups). We observe that recommenders amplify the rich-get-richer phenomenon already present in the in-degree distribution, thus introducing more inequality in terms of visibility. This inequality is stronger within the minority class than within the majority class, especially when the minority is homophilic.
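
As a rough illustration of this comparison, the sketch below builds the points of a Lorenz curve and summarizes the inequality with the Gini coefficient (one minus twice the area under the curve). The helper names and the toy numbers are made up for the example and are not taken from the paper's data.

```python
def lorenz_curve(values):
    """Return the (population share, cumulative value share) points of a Lorenz curve."""
    ordered = sorted(values)
    total = sum(ordered)
    n = len(ordered)
    points = [(0.0, 0.0)]
    running = 0
    for i, value in enumerate(ordered, start=1):
        running += value
        points.append((i / n, running / total))
    return points


def gini(values):
    """Gini coefficient: 1 minus twice the area under the Lorenz curve (trapezoid rule)."""
    pts = lorenz_curve(values)
    area = sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(pts, pts[1:]))
    return 1 - 2 * area


# Made-up toy numbers for four nodes: in-degree versus times recommended.
in_degree = [1, 2, 3, 4]
rec_count = [0, 0, 1, 9]
gini_degree = gini(in_degree)
gini_recs = gini(rec_count)
```

In this toy case the recommendation counts are more concentrated than the in-degrees, so their Lorenz curve bends further away from the diagonal and their Gini coefficient is higher. Computing the curves separately for minority and majority nodes gives the four lines described in the figure caption.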

Individual fairness

Fraction of minority-class nodes among the top nodes, sorted by number of recommendations divided by in-degree.

Our final analysis on real-world data considers individual fairness, i.e., the principle according to which similar individuals should receive similar treatment. In our setting, being similar means having a similar in-degree (e.g., a similar number of “followers” in a social networking site). We sort nodes by the number of times a node is recommended divided by its in-degree, and observe how the different models behave. Specifically, the figure reports the results on the TUENTI dataset with demographic groups based on the gender attribute. It shows that, contrary to what is seen in the previous figure, once we normalize by in-degree the nodes in the minority group are under-represented among the top nodes, regardless of the level of homophily of the minority. For instance, if we take the top 40% of nodes, the fraction of nodes belonging to the minority class is always below the dotted line, which represents the relative size of the minority in the network. Hence, when in-degree is taken into account, nodes in the minority class are disadvantaged in terms of visibility. In other words, among nodes with similar in-degree, the ones that belong to the majority class are likely to be recommended more.
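
The ranking described above can be sketched as follows. The function name, the handling of nodes with zero in-degree, and the toy values are assumptions made for this illustration.

```python
def minority_share_in_top(rec_count, in_degree, is_minority, top_fraction):
    """Fraction of minority nodes among the top nodes ranked by recommendations / in-degree."""
    # Nodes with zero in-degree are skipped because the ratio is undefined for them
    # (a choice made for this sketch).
    nodes = [n for n in in_degree if in_degree[n] > 0]
    ranked = sorted(nodes, key=lambda n: rec_count.get(n, 0) / in_degree[n], reverse=True)
    k = max(1, round(top_fraction * len(ranked)))
    top = ranked[:k]
    return sum(1 for n in top if is_minority[n]) / k


# Made-up toy values: three of the five nodes belong to the minority.
in_degree = {"a": 10, "b": 10, "c": 5, "d": 5, "e": 2}
rec_count = {"a": 20, "b": 5, "c": 10, "d": 1, "e": 1}
is_minority = {"a": False, "b": True, "c": False, "d": True, "e": True}

share_top_40 = minority_share_in_top(rec_count, in_degree, is_minority, top_fraction=0.4)
minority_size = sum(is_minority.values()) / len(is_minority)
```

Plotting the returned share for increasing values of `top_fraction` and comparing it with `minority_size` (the dotted line in the figure) shows whether the minority is under-represented among the nodes that gain the most visibility relative to their in-degree.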

Take-home messages

The main take-home message of this work is that homophily plays a key role in the visibility that is given to a group, sometimes even when that group is a minority in the network.

We highlight algorithmic biases expressed in terms of visibility, in a static “single round” of recommendations. In our current work, we are studying the long-term effect of the algorithms, where the graph evolves dynamically and recommendations are generated repeatedly. This opens up scenarios in which both homophily and the minority-majority partition may change over time.

Finally, this work sheds light on some key ethical aspects to consider in the design of social networking products. Embracing these insights could lead to new mitigation strategies that control disparate visibility. We plan to develop in-processing and post-processing algorithms in this direction, evaluating them under various homophily and group fairness definitions.