Abstract
Modern society makes extensive use of automated algorithmic decisions, fueled by advances in artificial intelligence. However, since these systems are not perfect, questions about fairness are increasingly investigated in the literature. In particular, many authors take a Rawlsian approach to algorithmic fairness. Based on complications with this approach identified in the literature, this article discusses how Rawls’s theory in general, and especially the difference principle, should reasonably be applied to algorithmic fairness decisions. It is observed that proposals to achieve Rawlsian algorithmic fairness often aim to uphold the difference principle in the individual situations where automated decision-making occurs. However, the Rawlsian difference principle applies to society at large and does not aggregate in such a way that upholding it in constituent situations also upholds it in the aggregate. But such aggregation is a hidden premise of many proposals in the literature and its falsity explains many complications encountered.
1 Introduction
In the past few decades, increased use of automated decision-making in general and AI in particular has drawn attention to many philosophical problems, such as the implications of opaque ‘black-box’ like decisions (see, e.g. Fleischmann & Wallace, 2005; Castelvecchi, 2016; Holm, 2019; Zerilli et al., 2019), the possibility of a responsibility gap when machines make decisions autonomously (see, e.g. Matthias, 2004; Johnson, 2015; de Laat, 2018), and the risks of AI—ranging from the mundane to the existential (see, e.g. Müller & Bostrom, 2014; Ord, 2020; Müller & Cannon, 2022).
Among the mundane—but far from unimportant—risks of automated decision-making is unfairness. Over and over again, examples of biased, prejudiced, and discriminatory behaviors have turned up in automated systems. For example, the literature review by Köchling & Wehner (2020) shows that many recruitment systems exhibit bias with respect to gender and ethnicity. Obermeyer et al. (2019) show that (US) systems for prediction of medical needs exhibit large racial biases, since the algorithm uses health care costs as a proxy for illness, but the costs in the training data do not accurately reflect the actual medical needs. Similarly, a literature review by Cavazos et al. (2020) shows that nearly all face recognition algorithms tested are racially biased in their performance. Interestingly, some of these algorithms are better at recognizing faces from their own geographic origins, illustrating the point made by Mittelstadt et al. (2016) that algorithms reflect the values of their designers. Koenecke et al. (2020) find similar results in their analysis of speech recognition systems, and Hankerson et al. (2016) catalog many other similar cases, including a soap dispenser which would dispense soap onto hands of light but not dark color.
As a result, technical AI experts and philosophers alike have devoted considerable effort to understanding and mitigating these problems. Interestingly, the more automated decision-support systems are used, and the greater their level of automation, the more can be gained—in terms of better outcomes—from improving such automated procedures (see, e.g., Lee et al., 2019). It is this intimate connection between procedure and outcome which makes the field known as algorithmic fairness concerned with both procedural justice (the process of developing and deploying decision support systems must not be biased against some groups or individuals) and substantive justice (the outcomes of decisions made by or with the help of those systems must not unjustly disadvantage some groups or individuals). A technically oriented review article of the field is Chouldechova & Roth (2020); a more philosophically oriented one is Fazelpour & Danks (2021).
One strand of thought which has received much attention is the prospects for Rawlsian algorithmic fairness, i.e., the application of Rawls’s seminal 1971 work A Theory of Justice to these questions.Footnote 1 Procaccia (2019) calls Rawls “AI’s favorite philosopher”, and it has been argued that the goal to get rid of bias or discrimination “is rooted in Rawlsian ethics” (Procaccia, 2020). This Rawlsian approach to algorithmic fairness is further introduced in Section 2. But while popular, it has been criticized for allowing “loopholes” (Jørgensen & Søgaard, 2023). In previous work (Franke, 2021) we identified a number of complications with Rawlsian algorithmic fairness, but were unable to provide any unified explanation for why these complications occur. The purpose of this article is to offer such an explanation. We delimit ourselves to the assumption that Rawls is broadly right and ask the questions: What does this mean for algorithmic fairness? How should Rawlsian thought be applied in this area? In particular, we focus on the difference principle.
More precisely, we identify what seems to be a root-cause of many of the complications identified: Proposals to achieve Rawlsian algorithmic fairness in the literature (see, e.g., Heidari et al., 2018) often aim to uphold the difference principle in the particular decision-situations where automated decision-making occurs. However, the Rawlsian difference principle applies to society at large—an aggregation of many such situations—and as argued in Section 3, the difference principle does not aggregate in such a way that upholding it in constituent situations also upholds it in the aggregate. But such aggregation is a hidden premise of many proposals for Rawlsian algorithmic fairness in the literature. Having made this point about the missing aggregation property, we briefly discuss in Section 4 how the difference principle could instead be upheld. Finally, Section 5 offers some concluding remarks.
2 Algorithmic Fairness and the Rawlsian Approach
The Rawlsian approach to algorithmic fairness is based on Rawls’s two principles of justice for institutions (Rawls, 1999, p. 266):
First principle: Each person is to have an equal right to the most extensive total system of equal basic liberties compatible with a similar system of liberty for all.

Second principle: Social and economic inequalities are to be arranged so that they are both:

(a) to the greatest benefit of the least advantaged, consistent with the just savings principle, and

(b) attached to offices and positions open to all under conditions of fair equality of opportunity.
The principles of justice are lexically ordered, so that the first principle takes precedence. Thus, for example, if a distribution of resources in accordance with part (a) of the second principle, i.e., the difference principle, would somehow lead to citizens no longer being able to see themselves as free and equal, then according to the first principle this distribution is precluded.
Rawls derives these principles through the thought experiment of the original position (Rawls, 1999, pp. 102–168). Here, people have convened to deliberate how to organize the basic structure of society. The distinguishing feature of the original position is that the parties deliberate behind a veil of ignorance (Rawls, 1999, pp. 118–123), designed to ensure impartiality. Thus, the parties do not know their personal characteristics, their social status, their generation, nor any probabilities for belonging to particular groups, and thus they cannot tailor the organization of society to their own narrow self-interest, only to a broader common interest. The two principles, Rawls argues, are what would emerge from such a hypothetical deliberation. As mentioned above, Rawlsian thought—especially fair equality of opportunity, but also the difference principle—is commonly applied to algorithmic fairness:
For example, the individual notions of algorithmic fairness proposed in work by Dwork et al. (2012) and Joseph et al. (2016)Footnote 2 have been described by Lepri et al. (2018) “as a mathematical formalization of the Rawlsian principle of ‘fair equality of opportunity’” from part (b) of the second principle.
Furthermore, Lee et al. (2021, Table 4) identify the same concept—fair equality of opportunity—as the philosophical origin of no fewer than six different statistical notions of algorithmic fairness, i.e., notions which typically require parity between algorithmic performance measures for different groups. For binary classification problems, these are (1) False Negative Rate (FNR) parity (Hardt et al., 2016), (2) False Positive Rate (FPR) parity (Chouldechova, 2017), (3) equal odds, i.e., simultaneous True Positive Rate (TPR) and True Negative Rate (TNR) parities (Hardt et al., 2016), and (4) Positive Predictive Value (PPV) parity (Chouldechova, 2017). To these are added (5) positive and (6) negative class balance (Kleinberg et al., 2017), two notions applicable when a binary classification is accomplished through some intermediate scoring mechanism: positive (negative) class balance requires that the average score assigned to members of one group who belong to the positive (negative) class be the same as the average score assigned to members of another group who belong to the positive (negative) class. For a more thorough discussion of Rawlsian equality of opportunity in machine learning, see Heidari et al. (2019).
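Several of these parity notions can be made concrete in a few lines of code. The following is an illustrative sketch of our own (the function names and toy data are invented, not taken from the cited works); it covers FNR, FPR, TPR, and PPV parity, while class balance would additionally require intermediate scores:

```python
def group_rates(y_true, y_pred):
    """Confusion-matrix rates for one group (binary labels and predictions)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return {
        "FNR": fn / (fn + tp),  # false negative rate
        "FPR": fp / (fp + tn),  # false positive rate
        "TPR": tp / (tp + fn),  # true positive rate
        "PPV": tp / (tp + fp),  # positive predictive value
    }

def parity_gaps(group_a, group_b):
    """Absolute between-group differences; a gap of 0 means exact parity.
    Each group is a (y_true, y_pred) pair."""
    ra, rb = group_rates(*group_a), group_rates(*group_b)
    return {k: abs(ra[k] - rb[k]) for k in ra}

# Hypothetical toy data for two groups
a = ([1, 1, 0, 0], [1, 0, 1, 0])
b = ([1, 1, 1, 0, 0, 0], [1, 1, 0, 1, 1, 0])
print(parity_gaps(a, b))
```

The various statistical notions then correspond to demanding that one or more of these gaps be (close to) zero.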
A few more concrete examples of how Rawlsian principles are applied in practice follow. Leben (2017) develops an algorithm for trolley-style situations based on the Rawlsian original position and the maximin rule: the idea is to have a (self-driving) vehicle first estimate survival probabilities for different courses of action, and then find the action which satisfies the maximin rule (for a critical discussion of this proposal, see Keeling, 2017).
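The maximin rule itself is straightforward to sketch. The following is a hypothetical illustration of our own (the action names and survival probabilities are invented, not taken from Leben's paper):

```python
def maximin_choice(actions):
    """Return the action whose worst-affected individual fares best,
    as the maximin rule prescribes."""
    return max(actions, key=lambda name: min(actions[name]))

# Hypothetical estimated survival probabilities for each affected person
options = {
    "swerve_left": [0.9, 0.2, 0.8],
    "brake":       [0.6, 0.5, 0.7],
    "continue":    [0.4, 0.9, 0.9],
}
print(maximin_choice(options))  # "brake": its worst-off (0.5) beats 0.2 and 0.4
```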
Peng (2020) develops a process to de-bias machine learning algorithms so that they favor those who have been historically disadvantaged. The argument is that much of what is distributed by algorithms are not positions, where part (b) of the second principle, fair equality of opportunity, applies, but rather primary goods, where part (a), the difference principle, applies.Footnote 3 Thus, under this proposal, algorithms would be constructed so that they compensate for historical disadvantages by applying the difference principle whenever their outcomes have distributional effects.
Shah et al. (2021) propose and implement a Rawls classifier which can be applied to any black-box deep learning model to minimize the error rate on the worst-off sensitive sub-population. Technically, this allows existing models to be modified in the spirit of the difference principle.
Zhu et al. (2021) develop a way to prevent unfairness in recommendation systems, where new items, lacking history, may be recommended less often than they deserve.
Heidari et al. (2018) propose a technical mechanism to include fairness criteria inspired by Rawls as constraints in the optimization problems solved in machine learning training, thus guaranteeing certain corresponding properties of the resulting trained models.
What these concrete proposals have in common is that they involve technical mechanisms to include fairness criteria as constraints on how programmers should construct algorithms, or as constraints on optimization problems solved in machine learning training, thus guaranteeing or promoting certain corresponding properties of the resulting systems. In the following section, we critically discuss a hidden premise of such proposals which aim to uphold the difference principle.
3 The Difference Principle and Aggregation of Situations
The difference principle mandates that social and economic inequalities are to be arranged so that they are to the greatest benefit of the least advantaged. As we saw in the previous section, within the field of Rawlsian algorithmic fairness, this is typically interpreted as imposing a constraint on the workings of individual decision support systems: they should respect something like the risk-averse maximin rule (Leben, 2017; Peng, 2020; Shah et al., 2021; Zhu et al., 2021).
As a concrete example, consider Heidari et al. (2018), who develop fairness constraints on machine learning in a paper titled “Fairness behind a veil of ignorance: A welfare analysis for automated decision making”:
John Rawls proposes the concept of veil of ignorance as the ideal condition/mental state under which a policy maker can select the fairest among a number of political alternatives. He suggests that the policy maker performs the following thought experiment: imagine him/herself as an individual who knows nothing about the particular position they will be born in within the society, and is tasked with selecting the most just among a set of alternatives. In this hypothetical original/ex-ante position, if the individual is rational, they would aim to minimize risk and insure against unlucky events in which they turn out to assume the position of a low-benefit individual. [...] Our main conceptual contribution is to characterize fairness in the context of algorithmic decision making through the Rawlsian theory of justice: our proposal is for the ML expert wishing to train a fair decision making model (e.g. to decide whether salary predictions are to be made using a neural network or a decision tree) to perform the aforementioned thought experiment [...] To formalize the above, our core idea consists of comparing the expected utility a randomly chosen, risk-averse subject of algorithmic decision making receives under different predictive models. [...] Furthermore and from a computational perspective, our welfare-based measures of fairness are more convenient to work with due to their convex formulation. This allows us to integrate them as a constraint into any convex loss minimization pipeline, and solve the resulting problem efficiently and exactly. (Heidari et al., 2018, emphasis in original. The passage quoted is from the version published by ACM—the version at proceedings.neurips.cc is slightly differently worded.)
In summary, Heidari et al. (2018) thus argue the following: to achieve Rawlsian algorithmic fairness, the difference principle should be upheld in individual decision support systems. To facilitate this, their technical contribution makes it easy for ML experts to integrate convex fairness constraints into whatever systems they develop (such as for salary prediction), so that these systems will uphold the difference principleFootnote 4 with respect to their scope of operation. As more automated decision-making is used in society, if these systems each uphold the difference principle in each particular situation, the difference principle will thus, presumably, be upheld at the aggregate level. However, thus articulated, we see that this position depends on a hidden premise concerning aggregation properties of the difference principle. We now proceed to investigate this premise in greater detail. (For clarity, it should be noted that we do not claim that all theories of algorithmic fairness depend on such aggregation properties, nor that Rawls’s original application of the difference principle does—quite the contrary, as discussed in Section 3.3—only that some applications of the difference principle in the algorithmic fairness literature depend on such aggregation properties.)
3.1 Strong Aggregation
In a strong version, the hidden premise can be articulated as follows:
Strong aggregation: The difference principle is upheld at the aggregated level if the difference principle is upheld in the constituent situations.
We first observe that there are indeed circumstances where such aggregation properties hold. For example, egalitarian distributions work like that—if everyone first gets an equal share of something, and then gets another equal share of something else, their aggregated shares will also be equal. Similarly, entitlement theories of distributions, such as the one Nozick (1974, pp. 150–153) famously developed as a contrast to Rawls, also exhibit such an aggregation property—if someone first has acquired something justly and then someone else acquires something else justly, then the resulting aggregate is also just. Importantly, it also seems that Rawls’s first principle—the liberty principle—exhibits such aggregation. If the basic liberties of each person are respected in each situation, then they are also respected in the aggregate of these situations. Since “basic liberties can be restricted only for the sake of liberty” (Rawls, 1999, p. 266), it is for example not the case that in the aggregate of many situations, in each of which liberty is respected, we could observe some failure in another property, such as social and economic inequalities, which would be a legitimate reason to restrict liberty in any of the constituent situations. The first principle has lexical precedence over the second one.
Observe that the aggregation property we are investigating is about sufficiency: if some principle is upheld in the constituent situations, it is also upheld in the aggregate. It is a claim of ‘if’, not ‘only if’. The examples above differ in this respect. In the egalitarian case, egalitarianism in the constituent situations is a sufficient but not a necessary condition for the aggregate to be egalitarian, since inequalities may even out so that aggregate equality is achieved without equality in each constituent situation. In the entitlement case, and in the case of Rawls’s first principle, however, justice in each constituent situation is both sufficient and necessary for the aggregate to be just.
However, the difference principle does not have this aggregation property. Strong aggregation is false, as can be proven by a simple counterexample:
Example 1
Let the goods allocated to individual i be denoted by the ith component of a distribution vector (so that (4,3) means that individual 1 gets 4 and individual 2 gets 3), let \(x \prec _{\textrm{DP}} y\) denote that y is preferred to x by the difference principle, and let aggregation of situations be component-wise addition (so that if an individual gets 1 in one situation and 2 in another situation, this individual gets \(1+2=3\) in the aggregation of these situations). Then we have the following counterexample to Strong aggregation:

\[
\begin{array}{ccc}
(1,2) & \prec_{\textrm{DP}} & (1,1)\\
(3,1) & \prec_{\textrm{DP}} & (1,1)
\end{array}
\]
In the first situation—the first row—we compare two possible distributions (which can be seen as the foreseeable results of two different systems being designed). To the left, there is a distribution where individual 1 gets 1, and individual 2 gets 2. To the right, there is a distribution where both individuals get 1. Now, the additional goods to individual 2 in the left-hand distribution do not benefit individual 1, who is the least advantaged under the left-hand distribution. More precisely, the greater inequality under the left-hand distribution does not make the least advantaged better off than under the egalitarian right-hand distribution—individual 1 still gets only 1. Thus, the right-hand distribution is preferred by the difference principle.
Proceeding to the second situation—the second row—we again compare two possible distributions between the two individuals (or groups). Again, the left-hand distribution is an unequal one, this time benefiting individual 1. But as before, the additional goods to individual 1 do not benefit individual 2, who is the least advantaged under the left-hand distribution, so the right-hand distribution is preferred here as well.
However, it is also possible to take a step back and consider the result of these preferences from the aggregated perspective. What is the overall distribution resulting from the situations? Summing the goods obtained by individual 1 in the distributions preferred by the difference principle in the first and second situations, we get 2, and the goods obtained by individual 2 also sum to 2. But summing the goods attained by the two individuals in the distributions not preferred by the difference principle, it turns out that both of them get more; individual 1 gets 4 and individual 2 gets 3. Moreover, considered as such an aggregate, this unequal left-hand distribution benefits individual 2, who is the least advantaged under this distribution. For individual 2 gets 3 under the left-hand distribution compared to only 2 under the right-hand distribution. Thus, the aggregated left-hand distribution is preferred by the difference principle, even though if considered one situation at a time, the right-hand distributions are preferred.
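The bookkeeping in Example 1 can be checked mechanically. Here is a minimal sketch of our own (the function names, and the simple maximin-with-tie-breaking operationalization of the difference principle, are illustrative choices, not a full formalization of Rawls):

```python
def dp_prefers(x, y):
    """A simple maximin reading of the difference principle: prefer the
    distribution under which the least advantaged is better off; on a tie,
    the extra inequality benefits no one who is worst-off, so prefer the
    more equal distribution."""
    if min(x) != min(y):
        return x if min(x) > min(y) else y
    return x if max(x) - min(x) < max(y) - min(y) else y

def aggregate(*distributions):
    """Component-wise addition of distribution vectors."""
    return tuple(map(sum, zip(*distributions)))

# Situation 1: the egalitarian (1, 1) beats (1, 2)
assert dp_prefers((1, 2), (1, 1)) == (1, 1)
# Situation 2: the egalitarian (1, 1) beats (3, 1)
assert dp_prefers((3, 1), (1, 1)) == (1, 1)
# Aggregate: (4, 3) now beats (2, 2), since the least advantaged gets 3, not 2
left = aggregate((1, 2), (3, 1))    # (4, 3)
right = aggregate((1, 1), (1, 1))   # (2, 2)
assert dp_prefers(left, right) == (4, 3)
```

The assertions all pass: the preferences in the constituent situations and the preference over the aggregate point in opposite directions, which is exactly the failure of Strong aggregation.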
3.2 Moderate Aggregation
However, even though Strong aggregation is false, proposals for Rawlsian algorithmic fairness that aim to uphold the difference principle in each situation do not need Strong aggregation to be plausible. A weaker version may be sufficient:
Moderate aggregation: The difference principle is usually upheld at the aggregated level if the difference principle is upheld in the constituent situations.
Moderate aggregation is an empirical claim, and we cannot conclusively prove or disprove it without empirical investigation. However, it is possible to reason a bit about its plausibility. Consider another example, a probabilistic one:
Example 2
Let the goods allocated to an individual in a situation i be determined by a lottery \(L_i(x_1, p_1; \ldots ; x_n, p_n)\), such that \(\sum _{j=1}^n p_j=1\), where the individual receives each \(x_j\) with probability \(p_j\). When applied to a population of many individuals, such a lottery generates a distribution of goods over the population. Two such distributions (which can again be seen as the foreseeable results of two different systems being designed) can be compared under the difference principle.Footnote 5 In such a situation, comparing population level distributions generated by two different lotteries, we have for example:

\[
L_1\!\left(1, \tfrac{1}{5};\; 2, \tfrac{3}{5};\; 5, \tfrac{1}{5}\right) \prec_{\textrm{DP}} L_2\!\left(1, \tfrac{1}{5};\; 2, \tfrac{3}{5};\; 3, \tfrac{1}{5}\right)
\]
To make the numbers concrete, in a population of 1 000, on average one fifth—200 people—would receive 1, three fifths—600 people—would receive 2, and one fifth—200 people—would receive 5 under the left-hand distribution. By contrast, under the right-hand distribution, on average one fifth—200 people—would receive 1, three fifths—600 people—would receive 2, and one fifth—200 people—would receive 3. Now, the least advantaged fifth (receiving 1) are not better off in the left-hand lottery where the most advantaged fifth receives more (5) than in the right-hand lottery where the most advantaged fifth receives less (3), so the greater inequality to the left cannot be motivated by benefiting the least advantaged. Thus, the right-hand lottery is preferred by the difference principle. The fact that the arithmetic mean of the left-hand lottery (2.4) is greater than that of the right-hand lottery (2.0) does not matter from the point of view of the least advantaged. Now, instead consider an aggregate of ten independent such lotteries. It is plausible to hold the following:

\[
\sum_{i=1}^{10} L_{2,i} \prec_{\textrm{DP}} \sum_{i=1}^{10} L_{1,i}
\]
Just as in Example 1, the aggregated left-hand distribution is preferred by the difference principle, even though if considered one situation at a time, the right-hand distributions are preferred. For the intuition behind this claim, consider the following simulated distributions over outcomes in Fig. 1. Not only is the average outcome for the \(L_1\)s (close to \(10 \cdot 2.4\)) better than for the \(L_2\)s (close to \(10 \cdot 2.0\)), but—and this is what matters to the difference principle—the least advantaged are better off. The number of people in a population of 1 000 ending up with the lowermost outcomes (14, 15, 16, 17) is smaller for the aggregation of \(L_1\)s than for the aggregation of \(L_2\)s. The difference may not be great, but the aggregation of \(L_1\)s seems better for the least advantaged.
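The tendency shown in Fig. 1 can be reproduced with a short Monte Carlo simulation. The following sketch uses the lottery definitions from Example 2; the population size, number of situations, seed, and the cutoff for ‘low’ totals are arbitrary illustrative choices:

```python
import random

# The two lotteries from Example 2: (value, probability) pairs
L1 = [(1, 0.2), (2, 0.6), (5, 0.2)]  # mean 2.4
L2 = [(1, 0.2), (2, 0.6), (3, 0.2)]  # mean 2.0

def draw(lottery, rng):
    """Sample one outcome from a lottery."""
    r, acc = rng.random(), 0.0
    for value, p in lottery:
        acc += p
        if r < acc:
            return value
    return lottery[-1][0]  # guard against floating-point rounding

def aggregate_totals(lottery, n_people=1000, n_situations=10, seed=0):
    """Each person's total across n_situations independent draws."""
    rng = random.Random(seed)
    return [sum(draw(lottery, rng) for _ in range(n_situations))
            for _ in range(n_people)]

totals_1 = aggregate_totals(L1)
totals_2 = aggregate_totals(L2)
# What matters to the difference principle is how the worst-off fare:
low_1 = sum(1 for t in totals_1 if t <= 17)
low_2 = sum(1 for t in totals_2 if t <= 17)
print(f"totals <= 17: {low_1} under aggregated L1s, {low_2} under aggregated L2s")
```

A fixed seed makes the run reproducible; the exact counts vary with the seed, but the direction of the comparison—fewer people stuck with low totals under the aggregated \(L_1\)s—is stable.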
Though the numbers chosen in Example 2 are arbitrary, and the exact outcomes of the simulations depicted are subject to chance (another run would yield somewhat different distributions), the overall tendency is, of course, not a coincidence. It is well known that for sums of independent stochastic variables \(X_i\) with the same expected value \(\mu \) and standard deviation \(\sigma \), the expected value E and standard deviation D are as follows:

\[
E\left(\sum_{i=1}^{n} X_i\right) = n\mu, \qquad \textrm{Var}\left(\sum_{i=1}^{n} X_i\right) = n\sigma^2, \qquad D\left(\sum_{i=1}^{n} X_i\right) = \sqrt{n}\,\sigma \tag{1}
\]
The important insight here is that whereas the expected value grows with the number of variables summed, the standard deviation grows only with the square root of the number of variables summed, i.e., slower. Since the standard deviation is a measure of statistical dispersion, this means that as more terms are summed, the distribution becomes relatively more concentrated around the expected value for each additional term.
Using, as is common, the standard deviation as a measure of statistical dispersion, the two aggregated lotteries can be roughly characterized by their expected values and standard deviations. A ‘typical’ outcome of an aggregated lottery is its expected value plus/minus a few standard deviations. Such a characterization yields an explanation of why \(L_2\) is preferred to \(L_1\) as a single instance, but the aggregate of \(L_2\)s is not preferred to the aggregate of \(L_1\)s:

\[
\begin{array}{lcc}
 & \text{expected value} & \text{standard deviation}\\
L_1 \text{ (single instance)} & 2.4 & \approx 1.36\\
L_2 \text{ (single instance)} & 2.0 & \approx 0.63\\
\textstyle\sum_{i=1}^{10} L_{1,i} & 24 & \approx 4.29\\
\textstyle\sum_{i=1}^{10} L_{2,i} & 20 & \approx 2.00
\end{array}
\]
In particular, if we consider a typical bad outcome to be the expected value minus one standard deviation, then in the single instance \(L_2\) fares better than \(L_1\) (\(2-0.63=1.37 > 2.4-1.36 = 1.04\)), but in the aggregate, the \(L_2\)s fare worse than the \(L_1\)s (\(20 - 2 = 18 < 24 - 4.29 = 19.71 \)).
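These figures can be recomputed directly from the lottery definitions. A small sketch (the function name is our own):

```python
import math

def moments(lottery):
    """Exact mean and standard deviation of a lottery given as
    (value, probability) pairs."""
    mu = sum(v * p for v, p in lottery)
    var = sum((v - mu) ** 2 * p for v, p in lottery)
    return mu, math.sqrt(var)

L1 = [(1, 0.2), (2, 0.6), (5, 0.2)]
L2 = [(1, 0.2), (2, 0.6), (3, 0.2)]

n = 10  # number of independent situations aggregated
for name, lottery in (("L1", L1), ("L2", L2)):
    mu, sigma = moments(lottery)
    # For a sum of n independent copies: mean n*mu, sd sqrt(n)*sigma
    print(f"{name}: single {mu:.2f} - {sigma:.2f} = {mu - sigma:.2f}; "
          f"aggregate {n * mu:.1f} - {math.sqrt(n) * sigma:.2f} = "
          f"{n * mu - math.sqrt(n) * sigma:.2f}")
```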
Using expected value and standard deviation (i.e., the two first moments of the probability distribution) is a common and often useful simplification, even though it cannot formally reflect all possible utility functions (see, e.g., Varian, 1992, p. 371). To understand why these two go a long way towards capturing risk-aversion such as the Rawlsian maximin principle, consider the Chebyshev inequality:

\[
P\left(|X - \mu| \ge k\sigma\right) \le \frac{1}{k^2}
\]

In words, the Chebyshev inequality offers an upper bound on how much probability mass can fall outside an interval of 2k standard deviations (k on either side), centered at the expected value \(\mu\). An interval of four (\(k=2\)) standard deviations thus includes at least three quarters of all outcomes; one of six (\(k=3\)) includes at least eight ninths, etc. Furthermore, for most distributions, this is a gross underestimation of how much probability mass actually falls within the interval—for a given \(k\), equality is attained only by specially constructed discrete distributions.
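A quick check of the bound on the lottery \(L_1\) from Example 2 (an illustrative sketch of our own; the tail probability is computed exactly from the probability mass function):

```python
import math

def chebyshev_check(lottery, k):
    """Exact tail probability P(|X - mu| >= k*sigma) and the 1/k^2 bound,
    for a lottery given as (value, probability) pairs."""
    mu = sum(v * p for v, p in lottery)
    sigma = math.sqrt(sum((v - mu) ** 2 * p for v, p in lottery))
    tail = sum(p for v, p in lottery if abs(v - mu) >= k * sigma)
    return tail, 1 / k ** 2

L1 = [(1, 0.2), (2, 0.6), (5, 0.2)]  # from Example 2
for k in (1.5, 2, 3):
    tail, bound = chebyshev_check(L1, k)
    print(f"k={k}: tail {tail:.2f} <= bound {bound:.2f}")
```

As expected, the actual tail mass sits well below the Chebyshev bound for this lottery.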
At this stage, however, it is reasonable to ask whether it is appropriate to use the standard deviation as our measure of dispersion, thus implicitly defining the least advantaged as those who fall at one, two, or some other multiple k standard deviations from the arithmetic mean. In particular, why not use the minimum instead, as the term ‘maximin’ seems to suggest? For example, revisiting Example 2, this worst possible case is 1 in a single instance of \(L_1\) and \(L_2\) alike, and it is \(10 \cdot 1 = 10\) in ten aggregated instances of \(L_1\) and \(L_2\) alike.
But Rawls’s least advantaged is not “a low-benefit individual” (in the phrase of Heidari et al., 2018) but rather a group, viz. those who are the least fortunate with respect to (i) family and class, (ii) natural endowments, and (iii) fortune and luck, but “all within the normal range” (Rawls, 1999, p. 83), i.e., removing the most extreme cases from consideration. Why is this so? An important part of the answer is that Rawls (1999, p. 84) does not want to “distract our moral perception by leading us to think of persons distant from us whose fate arouses pity and anxiety”. Exactly how large this Rawlsian ‘normal range’ should be is of course debatable, but to construe it as some multiple k standard deviations from the arithmetic mean seems natural and in line with Rawls’s theory.
The probability of anyone receiving the worst possible outcome (1) in a single instance of \(L_1\) or \(L_2\) is \(\frac{1}{5}\), i.e., a considerable chance. Thus, it seems reasonable to assess this outcome as being within the Rawlsian ‘normal range’. Since this outcome is the same in the two lotteries, including it in the assessment underpins the judgment that \(L_2\) is preferable under the difference principle—the greater inequality of \(L_1\) is not to the benefit of the least advantaged.
However, the probability of anyone receiving the worst possible outcome (10) when aggregating 10 independent instances of \(L_1\) or \(L_2\) is \(\left( \frac{1}{5} \right) ^{10} \approx 10^{-7}\), i.e., one in ten million.Footnote 6 Thus, it seems reasonable to assess this outcome as not being within the Rawlsian ‘normal range’ and its persistence in both lotteries is not an argument against the difference principle preferring the aggregate of ten \(L_1\)s to the aggregate of ten \(L_2\)s.
Note that in Example 2, the lotteries are independent, so that the expressions for variance and standard deviation given in Eq. (1) hold. With many independent lotteries, each individual outcome becomes less important, because outcomes tend to even out in the long run. However, it is instructive to also consider a situation with correlations. Recall that when adding stochastic variables in the general case, not only variances but also covariances must be added:

\[
\textrm{Var}\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} \textrm{Var}(X_i) + 2 \sum_{1 \le i < j \le n} \textrm{Cov}(X_i, X_j)
\]
The standard deviation, as usual, is found by taking the square root of the variance.
Example 3
Let individual instances of \(L_1\) and \(L_2\) be as before, but drop the assumption of independence between situations, and instead let lotteries \(L_{1,i}\) and \(L_{1,j}\) have non-zero covariance \(C_1\) and lotteries \(L_{2,i}\) and \(L_{2,j}\) have non-zero covariance \(C_2\). Then, as before, \(L_1 \prec _{\textrm{DP}} L_2\) as individual instances, but what happens in the aggregate depends on \(C_1\) and \(C_2\). For simplicity, consider aggregating just two lotteries. If, as before, we use the expected value minus one standard deviation as our guide, then we must compare \(2.4 + 2.4 - \sqrt{1.84 + 1.84 + 2C_1}\) with \(2 + 2 - \sqrt{0.4 + 0.4 + 2C_2}\) and find the greater of the two. The outcome hinges on the particular values of \(C_1\) and \(C_2\). Note that a negative covariance makes the variance of the aggregate smaller (similar to Example 1), whereas a positive covariance makes the variance of the aggregate greater.
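The comparison in Example 3 can be tabulated for a few covariance values. A small sketch using the expected-value-minus-one-standard-deviation guide (the covariance values chosen are illustrative; \(\pm 1.84\) and \(\pm 0.4\) are the extremes permitted by the single-instance variances):

```python
import math

def typical_bad(total_mean, total_var):
    """The rough 'typical bad outcome' used above: mean minus one sd."""
    return total_mean - math.sqrt(total_var)

VAR1, VAR2 = 1.84, 0.4  # single-instance variances of L1 and L2 (Example 2)

# Aggregate of two instances: variance is VAR + VAR + 2*Cov
for c1, c2 in [(0.0, 0.0), (-1.0, 0.0), (1.84, 0.4)]:
    agg1 = typical_bad(2 * 2.4, 2 * VAR1 + 2 * c1)
    agg2 = typical_bad(2 * 2.0, 2 * VAR2 + 2 * c2)
    better = "L1-aggregate" if agg1 > agg2 else "L2-aggregate"
    print(f"C1={c1:+.2f}, C2={c2:+.2f}: {agg1:.2f} vs {agg2:.2f} -> {better}")
```

With only two instances and zero covariances the \(L_2\)-aggregate still looks better by this guide (the expected value has not yet outgrown the standard deviation), but a sufficiently negative \(C_1\) flips the comparison in favor of the \(L_1\)-aggregate, illustrating how the outcome hinges on the covariances.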
Example 3 illustrates the importance of covariance. Though its exact impact clearly depends on how the problem is formalized and mathematically modeled, the fact that combining situations with negative covariance decreases statistical dispersion (to zero, in the limit) shows its importance.
Recall that we are assessing the claim made in Moderate aggregation, that upholding the difference principle in constituent situations is usually sufficient to uphold it in the aggregate. Though this is an empirical claim, the evidence from the examples given weighs against it. More precisely: in cases with many independent (or at least uncorrelated) situations, applying the difference principle in each situation and selecting a lottery with lower expected value because of its smaller statistical dispersion may not uphold the difference principle in the aggregate, because the aggregate expected value grows faster than the aggregate standard deviation. In cases with correlated situations, applying the difference principle in each situation and selecting a lottery with lower expected value because of its smaller statistical dispersion forfeits not only the greater expected value, but also the possibility to use a greater statistical dispersion as a counterbalance to another great statistical dispersion in another situation. Thus, we reject Moderate aggregation.
3.3 Additional Perspectives on Aggregation
The insight that the difference principle lacks an aggregation property should not be new. Going back to Rawls, he emphasizes that the difference principle should not be applied on a case-by-case basis, but on the basic structure of society:
The situation where someone is considering how to allocate certain commodities to needy persons who are known to him is not within the scope of the principles. They are meant to regulate basic institutional arrangements. We must not assume that there is much similarity from the standpoint of justice between an administrative allotment of goods to specific persons and the appropriate design of society. Our common sense intuitions for the former may be a poor guide to the latter. (Rawls, 1999, p. 56)
Rawls clearly envisions upholding the difference principle in other ways than by upholding it in particular situations and aggregating from there.
That the difference principle lacks aggregation properties is also at least implicit in the observation made by Nozick (1974, pp. 160–164) about the difficulty of upholding what he calls patterned principles of distributive justice. Patterns such as a distribution to the benefit of the least advantaged always risk being upset by voluntary actions undertaken in subsequent situations, and may thus require constant redistribution. The normative implications of this are out of scope here—as mentioned in the introduction, we are investigating algorithmic fairness on the assumption that Rawls is broadly right—but we still note Nozick’s descriptive observation: patterned principles such as the difference principle are not in general self-sustaining in the sense that they are upheld in the aggregate if upheld in every individual instance.
An observation explicitly about the difference principle’s lack of aggregation properties is made by Schmidtz:
Second, although the principle may apply to many “abstract” possibilities, it does not apply to case-by-case redistribution. The difference principle applies only to a choice of society’s basic structure. Is this restriction of scope ad hoc, as Rawls’s critics often say? No! Why not? Because applying the difference principle to every decision, as if Joe should never earn or spend a dollar unless he can prove that doing so is to the greatest benefit of the least advantaged, would cripple the economy, hurting everyone, including the least advantaged. The difference principle rules out institutions that work to the detriment of the least advantaged, including ones that overzealously apply the difference principle to the detriment of the least advantaged. There is nothing ad hoc about this constraint. It derives straightforwardly from the difference principle itself. (Schmidtz, 2006, pp. 189–190, emphasis in original).
Since the examples have led us to consider aggregates of stochastic outcomes, it is also instructive to draw a parallel to financial economics, where structurally similar problems have long been studied both theoretically and empirically. Though it is out of scope to review all the different theories thoroughly, it is highly relevant to consider the following short description from a classic textbook:
This analysis involves considerations of general equilibrium since the value of a risky asset inherently depends on the presence or absence of other risky assets which serve as complements or substitutes with the asset in question. Therefore, in most models of asset pricing, the value of an asset ends up depending on how it covaries with other assets. What is surprising is how generally this insight emerges in models that are seemingly very different. (Varian, 1992, p. 370, emphasis in original)
To spell out what is being said, note that general equilibrium means precisely that each situation/asset cannot be assessed on its own; the entire aggregate (investment portfolio) must be considered as a whole. The aggregation property does not hold. Thus, it is tempting to paraphrase Varian and say that determining how the difference principle is best served in the aggregate requires a holistic view of each particular risky situation, since the role of each risky situation depends on the presence or absence of the others.
Though we have rejected the aggregation property with respect to the difference principle, it does not follow that there is no such aggregation property in broader Rawlsian algorithmic fairness, which includes both principles of justice. Recall that the principles are lexically ordered and that the first principle—the liberty principle—takes precedence. As pointed out above, this principle exhibits a strong aggregation property: if the equal basic liberties of each person are not violated in any one situation, then the aggregate of these situations also does not violate these liberties. Indeed, just like in the entitlement theory mentioned above, upholding Rawls’s first principle in the constituent situations is both sufficient and necessary for upholding the first principle in the aggregate.
It is also important to bear in mind that the lack of an aggregation property matters only when situations can indeed be aggregated, so that a bad outcome in one situation can be (over-)compensated for by a good outcome in another. For example, not being interviewed for one job can be compensated for by being interviewed for another one, not being granted a loan by one bank can be compensated for by being granted one from a competitor, and getting a bad suggestion for a book to read can be compensated for by getting a good suggestion for another one. But if situations cannot be aggregated, perhaps because they only occur once or because they represent truly incommensurable goods (for an introduction to incommensurable values, see Hsieh & Andersson 2021; for a classic defense of ‘spheres of justice’ between which redistribution does not make sense, see Walzer 1983), then the lack of an aggregation property does not matter. If the difference principle is to be applied in such situations, it has to be applied to them separately, because there is no (reasonable) aggregate.
4 Upholding the Difference Principle in Other Ways
In the previous section, we investigated and rejected the aggregation property of the difference principle: it is not the case that if the difference principle is upheld in each particular situation, it is also upheld at the aggregate level. Upholding it in each particular situation is thus not sufficient to uphold it at the aggregate level. But we could also ask whether it is necessary. Given the evidence, it seems that a better way to uphold the difference principle would be to redistribute goods ex post, when the picture of the aggregate emerges, rather than second-guessing it ex ante in all the constituent situations.⁷
Hedden (2021), in his critique of statistical notions of algorithmic fairness, makes a similar remark. Though it does not explicitly pertain to Rawls or the difference principle, the general point seems equally valid in our context:
The conceptual point is this: When a predictive algorithm is used to make decisions with distributional consequences or other effects that we deem unfair or unjust, this does not mean that the algorithm itself is unfair or biased against individuals in virtue of their group membership. The unfairness or bias could instead lie elsewhere: with the background conditions of society, with the way decisions are made on the basis of its predictions, and/or with various side effects of the use of that algorithm, such as the exacerbation of harmful stereotypes. The practical point is that, as a result, the best response may sometimes be not to modify the predictive algorithm itself, but to instead intervene elsewhere, by changing the background conditions of society (e.g., through reparations, criminal justice reforms, or changes in the tax code), by modifying how we act on the basis of the algorithm’s predictions (e.g., by adopting different risk thresholds for different groups, above which we deny bail, or reject a loan application, and so on), or by attempting to mitigate the other negative side effects of the algorithm’s use. Hedden (2021)
‘Intervening elsewhere’ is indeed an important part of the Rawlsian toolbox—most often this is probably a much better way to uphold the difference principle than to modify each individual decision-making situation.
5 Conclusions
The Rawlsian approach to algorithmic fairness is a popular one. However, it also comes with complications. The argument presented in Section 3 suggests that the root cause of many of these difficulties is the hidden premise that the difference principle is upheld at the aggregated level if it is upheld in the constituent situations. The falsity of this premise, we propose, is a good explanation of the complications we identified in previous work (Franke, 2021).
First, attitudes to risk. Only in some situations—typically, irrevocable choices with very high stakes—are there good reasons to adopt the risk-averse maximin rule. In other—more mundane—situations, risk-neutrality or even risk-seeking may be appropriate. This observation suggests that proposals for Rawlsian algorithmic fairness that apply the difference principle in every situation are misguided (Franke, 2021, Section 2). The falsity of the aggregation premise neatly explains what goes wrong in such proposals. If Rawls is right, the difference principle applies to the basic structure of society. The fact that other attitudes to risk are appropriate in particular situations is not a problem, as long as this occurs within this basic structure. The difference principle can be upheld in the aggregate through other mechanisms, such as redistribution.
Second, the scope of stakeholders. Proposals for Rawlsian algorithmic fairness sometimes delimit very narrow sets of stakeholders to be considered (e.g., the people who are classified as false negatives, true positives, true negatives, and false positives, respectively), whereas the set of stakeholders in Rawls’s original position is in fact much broader—in a sense, everyone (Franke, 2021, Section 3). Again, the falsity of the aggregation premise explains where the tension comes from. For a particular constituent situation, the set of stakeholders may indeed be plausibly delimited. But since upholding the difference principle in these situations does not entail upholding it at the aggregated level, such delimited sets of stakeholders appear seriously inadequate from the aggregated point of view. The same explanation pertains to the related complication with defining the least advantaged (Franke, 2021, Section 4).
Third, knowledge about probabilities. Proposals for Rawlsian algorithmic fairness which aim to uphold the difference principle in particular situations face a dilemma with respect to knowledge about probabilities: either (i) disregard relevant statistical information of the kind often considered to be at the core of algorithmic fairness, or (ii) incorporate this information and abandon the Rawlsian veil of ignorance for a (thinner) veil of uncertainty (Franke, 2021, Section 5). As before, the falsity of the aggregation premise explains the origin of this dilemma—it occurs because the difference principle is being applied to particular situations, and the falsity of the aggregation premise suggests that this is misguided.
In this sense, the missing aggregation property sheds explanatory light on the promise and peril of Rawlsian algorithmic fairness.
Having made the observation that the difference principle is sometimes applied to what seems to be the wrong situations, we can also note that this is in line with other comments in the literature. Wong (2019) makes a similar observation, arguing that researchers have taken algorithmic bias seriously but primarily conceptualized it as a technical task, while it should rather, first and foremost, be conceptualized as a political question. If Wong’s observation is right, it is not surprising that there are many proposals in the literature where the difference principle is applied to particular situations, because this is most often where it is technically feasible to apply it. By contrast, applying it to the basic structure of society is beyond what is technically feasible, precisely because of the missing aggregation property.
The conclusion that Rawlsian algorithmic fairness does not require the difference principle to be upheld in each and every situation, but rather in the aggregate, has considerable practical relevance. Both vendors building AI solutions and organizations procuring them are currently making efforts to achieve algorithmic fairness inspired by Rawls. They would be well advised not to proceed on naïve assumptions about how the difference principle aggregates. Rawls’s first principle—the liberty principle—should be upheld in each and every situation, but the difference principle should not. If Rawls is broadly right and the difference principle should be upheld, it should be applied as intended, to the basic structure of society.
Availability of data and material
Not applicable.
Notes
The 1971 text was revised in 1975, in preparation for the first foreign translation. This revised edition was published in English in 1999 (Rawls, 1999) and is used in the following.
The first preprint version of the paper by Joseph et al. had Rawls in the title.
Though we may ask if Peng is too quick here, since Rawls (1999, p. 54) defines primary goods as “rights, liberties, and opportunities, and income and wealth”, to be jointly governed by the second principle.
The difference principle is not explicitly mentioned. However, it seems like the most charitable and reasonable interpretation of the aim to “minimize risk and insure against unlucky events in which they turn out to assume the position of a low-benefit individual”—this is, broadly speaking, what the difference principle does.
It may be held that the difference principle only implies a preference in situations where an unequal distribution is compared to an equal one—“unless there is a distribution that makes both persons better off (limiting ourselves to the two-person case for simplicity), an equal distribution is to be preferred” (Rawls, 1999, pp. 65–66)—and is indeterminate in situations such as the one in the example, where two unequal distributions are compared to each other. However, Rawls does in fact compare different non-equal distributions, distinguishing two cases: “The first case is that in which the expectations of the least advantaged are indeed maximized (subject, of course, to the mentioned constraints). No changes in the expectations of those better off can improve the situation of those worst off. The best arrangement obtains, what I shall call a perfectly just scheme. The second case is that in which the expectations of all those better off at least contribute to the welfare of the more unfortunate. That is, if their expectations were decreased, the prospects of the least advantaged would likewise fall. Yet the maximum is not yet achieved. Even higher expectations for the more advantaged would raise the expectations of those in the lowest position. Such a scheme is, I shall say, just throughout, but not the best just arrangement.” (Rawls, 1999, p. 68). Here, Rawls clearly compares two unequal distributions—a perfectly just scheme and a scheme which is just, but not perfectly so. Furthermore, there is also a clear criterion of preference here: higher expectations for those worst off are better. This warrants the preference expressed in the example.
We asked above why we should not use the minimum of a distribution when considering it under the difference principle. But the low probability of this outcome actually materializing suggests that perhaps the infimum is a better term—it is not materialized in all actual instances of the distribution, and these instances will thus have a greater actual minimum. The infimum, however, remains the greatest lower bound. A maxiinf principle, as opposed to the Rawlsian maximin principle, would be a more likely candidate to satisfy aggregation properties.
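The aggregation-friendliness of the infimum can be made concrete: for independent lotteries, the worst case of the aggregate is the sum of the individual worst cases. A minimal sketch (Python, with hypothetical toy lotteries; probabilities are omitted since only the supports matter for the infimum):

```python
from itertools import product

# Two toy lotteries, given only by their supports (hypothetical values).
lottery_a = [0, 3, 5]
lottery_b = [1, 2, 6]

# Every possible aggregate outcome of playing both lotteries independently.
aggregate = [a + b for a, b in product(lottery_a, lottery_b)]

# The infimum of the aggregate equals the sum of the individual infima,
# which is the aggregation property that realized minima lack.
assert min(aggregate) == min(lottery_a) + min(lottery_b)
```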
This is not to say that redistribution does not come with problems of its own. For example, whereas it is straightforward to imagine economic resources being redistributed from someone who is rich to someone who is poor, it is less straightforward to imagine what kind of redistribution would be appropriate in situations where an automated system fails to recognize voices or faces, or fails to give the most appropriate recommendations for reading or museum visits, or fails to correctly predict criminal recidivism. As Schmidtz points out in a somewhat similar context: “We are not trying to fix an improper distribution of cleft palates. We are trying to fix cleft palates.” (Schmidtz, 2006, p. 219, emphasis in original). These situations are complicated for at least two reasons: First, there is not any finite ‘cake’ to distribute even in the short run—removing goods from someone does not necessarily yield any goods to give to someone else. Second, an important recent finding in the literature on statistical notions of algorithmic fairness is that different prima facie plausible notions of fairness cannot be satisfied simultaneously (Chouldechova, 2017; Kleinberg et al., 2017). Thus, some unfairness will always remain—at least on some notions—whichever algorithm is used and whatever redistribution takes place. However, these complications point to much larger questions, beyond the present investigation, and we leave them unresolved here.
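The impossibility result mentioned above can be made concrete using the relation, due to Chouldechova (2017), between false positive rate, positive predictive value, false negative rate, and base rate. A minimal sketch (Python, with hypothetical figures): if two groups have different base rates, then equal predictive values and equal false negative rates force unequal false positive rates.

```python
def fpr(p, ppv, fnr):
    # Chouldechova's (2017) relation for a binary instrument:
    #   FPR = p/(1-p) * (1-PPV)/PPV * (1-FNR)
    # where p is the base rate of the outcome in the group.
    return p / (1 - p) * (1 - ppv) / ppv * (1 - fnr)

# Same PPV and FNR in both groups, but different base rates
# (all figures are hypothetical illustrations):
fpr_a = fpr(0.5, 0.7, 0.2)   # about 0.34
fpr_b = fpr(0.2, 0.7, 0.2)   # about 0.09
assert fpr_a != fpr_b        # the false positive rates cannot be equal
```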
References
Castelvecchi, D. (2016). Can we open the black box of AI? Nature News, 538(7623), 20. https://doi.org/10.1038/538020a
Cavazos, J. G., Phillips, P. J., Castillo, C. D., & O’Toole, A. J. (2020). Accuracy comparison across face recognition algorithms: Where are we on measuring race bias? IEEE Transactions on Biometrics, Behavior, and Identity Science. https://doi.org/10.1109/TBIOM.2020.3027269
Chouldechova, A. (2017). Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. Big data, 5(2), 153–163. https://doi.org/10.1089/big.2016.0047
Chouldechova, A., & Roth, A. (2020). A snapshot of the frontiers of fairness in machine learning. Communications of the ACM, 63(5), 82–89. https://doi.org/10.1145/3376898
Dwork, C., Hardt, M., Pitassi, T., Reingold, O., & Zemel, R. (2012). Fairness through awareness. In: Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, Association for Computing Machinery, New York, NY, USA, ITCS ’12, pp 214–22. https://doi.org/10.1145/2090236.2090255
Fazelpour, S., & Danks, D. (2021). Algorithmic bias: Senses, sources, solutions. Philosophy Compass, 16(8), e12760. https://doi.org/10.1111/phc3.12760
Fleischmann, K. R., & Wallace, W. A. (2005). A covenant with transparency: Opening the black box of models. Communications of the ACM, 48(5), 93–97. https://doi.org/10.1145/1060710.1060715
Franke, U. (2021). Rawls’s Original Position and Algorithmic Fairness. Philosophy & Technology, 34, 1803–1817. https://doi.org/10.1007/s13347-021-00488-x
Hankerson, D., Marshall, A. R., Booker, J., El Mimouni, H., Walker, I., & Rode, J. A. (2016). Does technology have race? In: Proceedings of the 2016 CHI Conference Extended Abstracts on Human Factors in Computing Systems, pp 473–486. https://doi.org/10.1145/2851581.2892578
Hardt, M., Price, E., & Srebro, N. (2016). Equality of opportunity in supervised learning. Advances in Neural Information Processing Systems, 29.
Hedden, B. (2021). On statistical criteria of algorithmic fairness. Philosophy and Public Affairs, 49(2). https://doi.org/10.1111/papa.12189
Heidari, H., Ferrari, C., Gummadi, K. P., & Krause, A. (2018). Fairness behind a veil of ignorance: A welfare analysis for automated decision making. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems, pp 1273–1283, https://dl.acm.org/doi/abs/10.5555/3326943.3327060
Heidari, H., Loi, M., Gummadi, K. P., & Krause, A. (2019). A moral framework for understanding fair ML through economic models of equality of opportunity. In: Proceedings of the conference on fairness, accountability, and transparency, pp 181–190. https://doi.org/10.1145/3287560.3287584
Holm, E. A. (2019). In defense of the black box. Science, 364(6435), 26–27. https://doi.org/10.1126/science.aax0162
Hsieh, N., & Andersson, H. (2021). Incommensurable Values. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy, Fall (2021st ed.). Metaphysics Research Lab: Stanford University.
Johnson, D. G. (2015). Technology with no human responsibility? Journal of Business Ethics, 127(4), 707–71. https://doi.org/10.1007/s10551-014-2180-1
Jørgensen, A. K., & Søgaard, A. (2023). Rawlsian AI fairness loopholes. AI and Ethics, 3(4), 1185–119. https://doi.org/10.1007/s43681-022-00226-9
Joseph, M., Kearns, M., Morgenstern, J., Neel, S., & Roth, A. (2016). Fair algorithms for infinite and contextual bandits. arXiv:1610.09559. https://arxiv.org/abs/1610.09559
Keeling, G. (2017). Against Leben’s Rawlsian collision algorithm for autonomous vehicles. In: 3rd Conference on Philosophy and Theory of Artificial Intelligence, Springer, pp 259–27. https://doi.org/10.1007/978-3-319-96448-5_29
Kleinberg, J., Mullainathan, S., & Raghavan, M. (2017). Inherent trade-offs in the fair determination of risk scores. In: 8th Innovations in Theoretical Computer Science Conference (ITCS 2017), Schloss Dagstuhl–Leibniz-Zentrum für Informatik, vol 67, p 43. https://doi.org/10.4230/LIPIcs.ITCS.2017.43
Köchling, A., & Wehner, M. C. (2020). Discriminated by an algorithm: A systematic review of discrimination and fairness by algorithmic decision-making in the context of HR recruitment and HR development. Business Research, 13(3), 795–848. https://doi.org/10.1007/s40685-020-00134-w
Koenecke, A., Nam, A., Lake, E., Nudell, J., Quartey, M., Mengesha, Z., Toups, C., Rickford, J. R., Jurafsky, D., & Goel, S. (2020). Racial disparities in automated speech recognition. Proceedings of the National Academy of Sciences, 117(14), 7684–768. https://doi.org/10.1073/pnas.1915768117
de Laat, P. B. (2018). Algorithmic decision-making based on machine learning from Big Data: Can transparency restore accountability? Philosophy & Technology, 31(4), 525–54. https://doi.org/10.1007/s13347-017-0293-z
Leben, D. (2017). A Rawlsian algorithm for autonomous vehicles. Ethics and Information Technology, 19(2), 107–11. https://doi.org/10.1007/s10676-017-9419-3
Lee, M. K., Jain, A., Cha, H. J., Ojha, S., & Kusbit, D. (2019). Procedural justice in algorithmic fairness: Leveraging transparency and outcome control for fair algorithmic mediation. Proceedings of the ACM on Human-Computer Interaction, 3(CSCW), 1–2. https://doi.org/10.1145/3359284
Lee, M. S. A., Floridi, L., & Singh, J. (2021). Formalising trade-offs beyond algorithmic fairness: lessons from ethical philosophy and welfare economics. AI and Ethics pp 1–16. https://doi.org/10.1007/s43681-021-00067-y
Lepri, B., Oliver, N., Letouzé, E., Pentland, A., & Vinck, P. (2018). Fair, transparent, and accountable algorithmic decision-making processes: The premise, the proposed solutions, and the open challenges. Philosophy & Technology, 31(4), 611–627. https://doi.org/10.1007/s13347-017-0279-x
Matthias, A. (2004). The responsibility gap: Ascribing responsibility for the actions of learning automata. Ethics and information technology, 6(3), 175–18. https://doi.org/10.1007/s10676-004-3422-1
Mittelstadt, B. D., Allo, P., Taddeo, M., Wachter, S., & Floridi, L. (2016). The ethics of algorithms: Mapping the debate. Big Data & Society, 3(2), 2053951716679679. https://doi.org/10.1177/2053951716679679
Müller, V. C., & Bostrom, N. (2014). Future progress in artificial intelligence: A poll among experts. AI Matters, 1(1), 9–1. https://doi.org/10.1145/2639475.2639478
Müller, V. C., & Cannon, M. (2022). Existential risk from AI and orthogonality: Can we have it both ways? Ratio, 35(1), 25–3. https://doi.org/10.1111/rati.12320
Nozick, R. (1974). Anarchy, State, and Utopia. Basic Books.
Obermeyer, Z., Powers, B., Vogeli, C., & Mullainathan, S. (2019). Dissecting racial bias in an algorithm used to manage the health of populations. Science, 366(6464), 447–45. https://doi.org/10.1126/science.aax2342
Ord, T. (2020). The precipice: Existential risk and the future of humanity. Hachette Books
Peng, K. (2020). Affirmative equality: A revised goal of de-bias for artificial intelligence based on difference principle. In: 2020 International Conference on Artificial Intelligence and Computer Engineering (ICAICE), pp 15–19. https://doi.org/10.1109/ICAICE51518.2020.00009
Procaccia, A. (2019). AI Researchers Are Pushing Bias Out of Algorithms. https://www.bloomberg.com/opinion/articles/2019-03-07/ai-researchers-are-pushing-bias-out-of-algorithms. Accessed June 30, 2021
Procaccia, A. D. (2020). Technical perspective: An answer to fair division’s most enigmatic question. Communications of the ACM, 63(4), 118. https://doi.org/10.1145/3382131
Rawls, J. (1999). A Theory of Justice (Revised ed.). Oxford University Press.
Schmidtz, D. (2006). Elements of Justice. Cambridge University Press.
Shah, K., Gupta, P., Deshpande, A., & Bhattacharyya, C. (2021). Rawlsian fair adaptation of deep learning classifiers. Association for Computing Machinery, New York, NY, USA, pp 936–94. https://doi.org/10.1145/3461702.3462592
Varian, H. R. (1992). Microeconomic Analysis (3rd ed.). WW Norton.
Walzer, M. (1983). Spheres of Justice: A Defense of Pluralism and Equality. Basic Books.
Wong, P. H. (2019). Democratizing algorithmic fairness. Philosophy & Technology, pp 1–2. https://doi.org/10.1007/s13347-019-00355-w
Zerilli, J., Knott, A., Maclaurin, J., & Gavaghan, C. (2019). Transparency in algorithmic and human decision-making: Is there a double standard? Philosophy & Technology, 32(4), 661–68. https://doi.org/10.1007/s13347-018-0330-6
Zhu, Z., Kim, J., Nguyen, T., Fenton, A., & Caverlee, J. (2021). Fairness among New Items in Cold Start Recommender Systems, Association for Computing Machinery, New York, NY, USA, p 767–776. https://doi.org/10.1145/3404835.3462948
Acknowledgements
The author is grateful to Niklas Möller, Katharina Berndt Rasmussen, and Krister Bykvist for insightful comments on earlier drafts.
Funding
Open access funding provided by RISE Research Institutes of Sweden. The author received no external funding for this work.
Author information
Contributions
Not applicable (single author).
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Yes.
Competing interests
The author declares no conflict of interest.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Cite this article
Franke, U. Rawlsian Algorithmic Fairness and a Missing Aggregation Property of the Difference Principle. Philos. Technol. 37, 87 (2024). https://doi.org/10.1007/s13347-024-00779-z