
Enhancing recommender systems with provider fairness through preference distribution awareness

In multi-stakeholder recommender systems, provider-fairness interventions that regulate only overall exposure often overlook how different user groups have historically preferred different provider groups. The result is recommendation distributions that misallocate audiences across provider groups and can introduce new forms of disparity. As this study shows, preference distribution-aware re-ranking can deliver provider-fair visibility while preserving this cross-group preference structure, by aligning recommendation shares with the preference distribution observed in the data.

Context and motivation

Recommender systems shape not only what users consume, but also which providers receive attention and economic opportunity. When item catalogs and user activity are unevenly distributed across demographic groups, standard learning-to-rank or collaborative filtering pipelines can amplify majority signals, making already-dominant provider groups even more visible.

Provider fairness is often framed as a visibility or exposure allocation problem: ensuring that provider groups receive a “reasonable” share of recommendation opportunities. Conceptually, this is necessary but not sufficient in settings where preferences are structured. User groups may systematically differ in which providers they engage with (e.g., due to geography, language, or cultural familiarity). If fairness mechanisms ignore this structure, they may satisfy global exposure targets while reallocating attention in ways that distort who is being matched to whom—potentially harming users (through less preference-aligned lists) and providers (through mismatched audiences rather than meaningful reach).

The central motivation here is that fairness in recommendations is inherently relational: it concerns not only how much attention a provider group receives, but also from which parts of the user population that attention originates.

In a study conducted with Elizabeth Gómez, David Contreras, and Maria Salamó, published in the International Journal of Information Management Data Insights, we introduce preference distribution-aware provider fairness and a re-ranking approach (DAP-fair) that operationalizes it.

The gap we address is the mismatch between (i) provider-fairness methods that control aggregate provider exposure and (ii) the empirical reality that user preferences over providers are not uniform across user segments. We ask: if we want provider fairness, how can we achieve it while still respecting the observed distribution of which user groups tend to engage with which provider groups?

High-level solution overview

We propose to treat the observed user-group → provider-group interaction pattern as a target distribution that fairness-aware recommendations should preserve. Instead of only ensuring that provider groups appear in recommendation lists according to a global quota, we aim to ensure that each user group receives recommendations across provider groups in proportions that reflect how that user group historically expressed preferences, while still improving visibility for underrepresented provider groups.

To make this actionable, we implement a post-processing strategy: starting from any base recommender’s candidate list, we re-rank items so that the final top-k recommendations collectively match the desired preference distribution across user–provider group pairs, subject to a constraint that limits how much relevance we are willing to trade for distributional alignment.

How it works

At a high level, the approach follows a two-step logic: we first organize candidate recommendations into user–provider “buckets” reflecting group pairs, and then we select items under distributional constraints, relaxing those constraints in stages to ensure feasibility.

Preference distribution as a two-sided fairness target

A core methodological choice is to define what “distribution-aware” means using the training data itself. We estimate how user demographic groups distribute their expressed preferences across provider demographic groups, producing a target allocation over user–provider group pairs.

This choice addresses a key problem: global provider quotas alone cannot distinguish between useful reach (being shown to user segments that historically engage with the provider group) and mis-targeted reach (being shown broadly without regard to preference patterns). The abstraction here is that past interactions encode a meaningful, group-level preference structure that fairness interventions should preserve unless there is a principled reason to change it.
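To make this concrete, here is a minimal Python sketch of how such a target allocation could be estimated from training interactions. The function and variable names are illustrative and not taken from the paper's implementation.

```python
from collections import Counter, defaultdict

def preference_distribution(interactions):
    """Estimate the target allocation from training data.

    interactions: iterable of (user_group, provider_group) pairs, one per
    observed interaction. Returns {user_group: {provider_group: share}},
    where the shares within each user group sum to 1.
    """
    counts = defaultdict(Counter)
    for user_group, provider_group in interactions:
        counts[user_group][provider_group] += 1

    return {
        user_group: {g: c / sum(pc.values()) for g, c in pc.items()}
        for user_group, pc in counts.items()
    }

# Hypothetical continent-style groups: EU users mostly engage EU providers.
interactions = [("EU", "EU"), ("EU", "EU"), ("EU", "NA"),
                ("NA", "NA"), ("NA", "EU")]
target = preference_distribution(interactions)
# target["EU"] -> {"EU": ~0.67, "NA": ~0.33}
# target["NA"] -> {"NA": 0.5, "EU": 0.5}
```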

Bucketizing candidate recommendations by user–provider group pairs

To enforce a two-sided target, we need a controllable “inventory” of candidate items associated with each user–provider group pair. We therefore partition the base recommender’s candidate recommendations into buckets indexed by (user group, provider group).

This mechanism matters because it converts a global re-ranking problem into a structured allocation problem: we can reason about how many items to draw from each bucket when constructing final recommendation lists. Without bucketization, distributional constraints across pairs are difficult to enforce consistently, especially when multiple user groups and provider groups interact.
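Continuing the sketch above (again with illustrative names), bucketization is a simple partition of each user's ranked candidate list:

```python
from collections import defaultdict

def bucketize(candidates, user_group, provider_group_of):
    """Partition one user's candidate list into (user group, provider group)
    buckets, preserving the base recommender's ranking within each bucket.

    candidates: [(item, score), ...] sorted by descending relevance score.
    provider_group_of: mapping from item to its provider's group.
    """
    buckets = defaultdict(list)
    for item, score in candidates:
        buckets[(user_group, provider_group_of[item])].append((item, score))
    return buckets
```

Because each bucket stays internally sorted by relevance, the allocation step can always inspect the most relevant remaining item of each group at constant cost.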

Constrained selection that balances distributional alignment and relevance loss

Re-ranking inevitably introduces a trade-off: we may need to pick an item that is slightly less relevant for a user in order to satisfy a distributional requirement. DAP-fair handles this by associating each potential swap or inclusion with a notion of acceptable loss and enforcing a tunable tolerance.

Conceptually, this introduces a control knob that makes the fairness–utility tension explicit and adjustable. The key idea is not to maximize fairness at any cost, but to align recommendations with the target preference distribution while maintaining relevance within an acceptable degradation envelope.
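One way to realize this control knob, offered as a hedged sketch rather than the paper's exact algorithm: fill each slot from a provider group that still has quota under the target distribution, but only if the relevance sacrificed relative to the best available candidate stays within a tolerance.

```python
def select_top_k(buckets, target, user_group, k, tolerance=0.1):
    """Strict selection phase (illustrative). Fills slots only from provider
    groups that still have quota, and only when the relevance loss versus the
    best available head item stays within `tolerance`. Consumes items from
    `buckets` in place and may stop early, leaving completion to the
    relaxation phases sketched in the next section.
    """
    # Note: rounding may make the quotas sum to k +/- 1; fine for a sketch.
    quotas = {g: round(share * k) for g, share in target[user_group].items()}
    selected = []
    while len(selected) < k:
        # Best remaining item of each non-empty bucket for this user group.
        heads = {g: items[0] for (ug, g), items in buckets.items()
                 if ug == user_group and items}
        needed = [g for g in heads if quotas.get(g, 0) > 0]
        if not needed:
            break  # quotas met or buckets exhausted
        best_score = max(score for _, score in heads.values())
        fair_g = max(needed, key=lambda g: heads[g][1])
        if best_score - heads[fair_g][1] > tolerance:
            break  # honoring the distribution would cost too much relevance
        item, _ = buckets[(user_group, fair_g)].pop(0)
        selected.append(item)
        quotas[fair_g] -= 1
    return selected
```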

Relaxation to ensure feasible top-k lists

In practice, distributional targets may be infeasible for some users if the candidate pool lacks enough items in certain buckets (for example, when minority provider groups are scarce or not surfaced by the base recommender). DAP-fair therefore relaxes constraints in phases: it prioritizes satisfying the most distribution-sensitive constraints first and gradually broadens the eligible selection space to complete the top-k list.

This matters because it makes the approach robust: instead of failing when strict constraints cannot be met, the method degrades gracefully toward a relevance-first completion strategy while still capturing as much of the desired distributional structure as the candidate set allows.
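Under the same illustrative assumptions, staged relaxation can be sketched as successive passes over the remaining buckets: the strict pass above, then a pass that drops the loss tolerance, and finally a relevance-first completion that guarantees a full list. The exact phases in the paper may differ.

```python
def dap_rerank(buckets, target, user_group, k, tolerance=0.1):
    """Illustrative three-phase relaxation. Each phase consumes items from
    `buckets` in place, so later phases only see what earlier phases left.
    """
    # Phase 1: strict distributional constraints with bounded relevance loss.
    selected = select_top_k(buckets, target, user_group, k, tolerance)
    # Phase 2: keep chasing the target shares, accepting any relevance loss.
    if len(selected) < k:
        selected += select_top_k(buckets, target, user_group,
                                 k - len(selected), tolerance=float("inf"))
    # Phase 3: relevance-first completion, ignoring group quotas entirely.
    if len(selected) < k:
        leftovers = sorted((pair for (ug, _), items in buckets.items()
                            if ug == user_group for pair in items),
                           key=lambda pair: -pair[1])
        selected += [item for item, _ in leftovers[:k - len(selected)]]
    return selected
```

Putting the pieces together for a single user, with hypothetical data throughout:

```python
candidates = [("b1", 0.95), ("b2", 0.90), ("b3", 0.70),
              ("b4", 0.65), ("b5", 0.40)]
provider_group_of = {"b1": "EU", "b2": "EU", "b3": "NA",
                     "b4": "EU", "b5": "NA"}
buckets = bucketize(candidates, "EU", provider_group_of)
print(dap_rerank(buckets, target, "EU", k=3, tolerance=0.3))
# -> ['b1', 'b2', 'b3']: two EU items and one NA item, matching the
#    ~2/3 vs ~1/3 target shares estimated earlier for EU users.
```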

Findings and insights

Empirically, the study focuses on geographic demographic groups, using continent-of-origin groupings for both users and providers, and evaluates on two real-world domains (books and online courses). The analysis first shows that standard recommendation models can substantially deviate from the observed preference distribution, often reinforcing the dominant provider group and reshaping who gets recommended what across user segments.

A key insight is that provider-fairness methods that only correct provider-side exposure can still distort the user-to-provider allocation: they may “fix” aggregate visibility but do so by reallocating recommendations across user groups in ways that do not reflect the historical preference distribution. This supports the paper’s central claim that fairness should be assessed not only at the provider margin, but also at the user–provider interaction level.

When applying DAP-fair, disparities with respect to the target preference distribution are qualitatively reduced across underlying recommenders and across both datasets, while effectiveness remains largely stable. The results also highlight an important practical condition: the ability to meet distributional targets depends on the diversity of the candidate set. When the base recommender does not surface enough items in certain user–provider buckets, residual bucket-level disparities remain even after re-ranking—indicating that post-processing can be limited by upstream retrieval and modeling choices.
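For readers who want to reproduce this kind of analysis, a simple disparity measure (illustrative; the paper may use a different formulation) compares the shares actually recommended against the target shares, per user–provider group pair:

```python
def distribution_disparity(recommended, target):
    """Mean absolute deviation between recommended and target shares.

    Both arguments have the shape {user_group: {provider_group: share}}.
    Zero means the recommendations match the target preference
    distribution exactly; larger values mean larger distortion.
    """
    diffs = [abs(recommended.get(u, {}).get(g, 0.0) - share)
             for u, provider_shares in target.items()
             for g, share in provider_shares.items()]
    return sum(diffs) / len(diffs)
```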

Finally, the loss-tolerance parameter behaves as a meaningful control: modest tolerance can already yield substantial distributional corrections, while higher tolerance pushes closer to the target distribution but can introduce a larger (still generally limited) relevance trade-off. Conceptually, this reinforces the value of making the fairness–utility balance an explicit design decision rather than an implicit side effect.

Conclusions

This work’s main contribution is a shift in how we operationalize provider fairness: from a one-sided exposure allocation to a two-sided, distribution-aware objective that preserves how different user segments relate to different provider groups. By framing the target as an empirically grounded preference distribution and enforcing it through bucket-based re-ranking with controlled relevance loss, we obtain a practical mechanism for fairer exposure that remains aligned with observed preference structure.

Several research directions follow naturally from this framing. First, group-level distribution awareness can be extended toward finer-grained, user-level calibration, where we aim to preserve individual preference profiles over provider attributes while still meeting provider-fairness constraints. Second, the approach can be generalized beyond geography to other sensitive attributes (or to multi-attribute intersections), raising questions about which distributions should be preserved and how to handle conflicting targets. Third, the dependence on candidate diversity suggests a deeper integration between retrieval, ranking, and fairness constraints, where upstream modeling is encouraged to surface a feasible set for downstream distribution-aware re-ranking.