Monotonicity and IRV -- Why the Monotonicity Criterion is of Little Import

Opponents of instant runoff voting (IRV) claim that it fails the “monotonicity criterion.” What does this mean, and is it significant in real-world elections? Before explaining why this failure is of little consequence (certainly of far less consequence than failing the majority criterion or the “later-no-harm” criterion, both of which IRV meets and other proposed voting methods violate), a little background is useful.

What are voting method criteria?

Numerous sensible formal criteria have been proposed for evaluating voting methods. But no voting method satisfies all of them, because some are mutually exclusive. Since no method can satisfy every criterion (whether frequently cited or yet to be devised), it is important to have a sense of how much a given criterion matters and how often a particular voting method is likely to exhibit the problem.

Two caveats apply whenever a voting method is said to fail a criterion. First, selecting a set of criteria to examine can be arbitrary, in the sense that there is no definitive set of criteria to use. Any advocate of a particular voting method might believe that the criteria met by that method are more important than the criteria it fails. Second, the specific way in which a criterion is defined may cause some voting methods to flip from failing to meeting the criterion.

For example, one criterion that most people believe is of crucial importance is the “majority criterion.” This can be defined as: “If more than 50% of voters consider a particular candidate to be the absolute best choice, then that candidate should win.” Some proposed voting methods, such as range voting and approval voting, fail this criterion. Advocates of these voting methods generally take one of two approaches in confronting this reality. Either they argue that the majority criterion is not really that important, or they attempt to modify the definition of the majority criterion so that their preferred method doesn’t fail it. For example, the majority criterion could be re-defined so that approval voting passes, by saying that the re-defined criterion is met as long as the winner is any one of the candidates “approved” by a majority of voters (under approval rules, multiple candidates can exceed 50%). Approval voting would meet this re-defined majority criterion even if the candidate that an absolute majority (more than 50%) thinks is the best choice is defeated by a candidate that nobody (0%) thinks is the best choice, as can happen with approval voting.
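The approval voting outcome described above can be illustrated with a small tally. This is only a sketch: the candidates A, B and C and the voter counts are invented for illustration, and each voter is assumed to approve a favorite plus one acceptable compromise.

```python
from collections import Counter

# Hypothetical electorate: (number of voters, favorite, set of approved candidates).
# 55% think A is the best choice; nobody thinks C is the best choice.
ballots = [
    (55, "A", {"A", "C"}),
    (45, "B", {"B", "C"}),
]

approvals = Counter()  # approval totals per candidate
favorites = Counter()  # how many voters consider each candidate the best choice
for count, favorite, approved in ballots:
    favorites[favorite] += count
    for name in approved:
        approvals[name] += count

# The approval winner is the candidate with the most approvals.
approval_winner = max(approvals, key=approvals.get)

print(dict(approvals))              # approval totals: A 55, B 45, C 100
print(approval_winner)              # C
print(favorites[approval_winner])   # 0
```

Here A is the first choice of an absolute majority (55 of 100 voters) yet loses to C, who is approved on every ballot but is nobody's first choice.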

It is also important to be clear about the meaning of “meeting” or “failing” a criterion. Failing a criterion is not necessarily absolute. As used in election theory, a criterion is met only if a voting method satisfies it in 100% of conceivable elections. A voting method might comply with the criterion in 99.9999% of cases, but still be said to “fail.”

What exactly is the monotonicity criterion?

Now we turn to the monotonicity criterion. Monotonicity can be defined as follows: A candidate x should not be harmed (i.e., change from being a winner to a loser) if x is raised on some ballots without changing the relative orders of the other candidates.

Here is a standard explanation of how IRV can fail the monotonicity criterion, paraphrased from Wikipedia. Suppose there are 3 candidates and 100 votes cast, so that 51 votes are required to win. Suppose the votes are cast as follows in an IRV election.

Number of ballots    1st Preference    2nd Preference
39                   Andrea            Belinda
35                   Belinda           Cynthia
26                   Cynthia           Andrea

No candidate has a majority of the vote. Last-place candidate Cynthia is eliminated, and in the instant runoff her votes count for Andrea, who wins in the second round with a majority of 65 to 35.

Now suppose 10 Belinda voters drop their support for her and rank Andrea first instead.

Number of ballots    1st Preference    2nd Preference
49                   Andrea            Belinda
25                   Belinda           Cynthia
26                   Cynthia           Andrea

Andrea again is the plurality winner on the first count, but falls short of a majority. This time, however, Belinda is in last place. She is eliminated first, and in the second round all ballots cast for her are counted for Cynthia, who vaults to a victory of 51 to 49. In this case Andrea’s preferential ranking increased between elections (more voters put her first), but this increase in support appears to have caused her to lose, because it led to Belinda being eliminated instead of Cynthia.
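The two counts above can be reproduced with a short IRV tally. This is a minimal sketch, not an official counting procedure: the function name `irv_winner` is made up for illustration, each ballot's third preference is filled in as the only candidate left, and ties are not handled.

```python
def irv_winner(ballots):
    """Run an IRV count on (count, ranking) pairs.

    Returns the winner and the list of round-by-round tallies.
    Illustrative sketch only: ties are not resolved in any principled way.
    """
    remaining = sorted({name for _, ranking in ballots for name in ranking})
    rounds = []
    while True:
        tally = {name: 0 for name in remaining}
        for count, ranking in ballots:
            # Each ballot counts for its highest-ranked candidate still in the race.
            top = next((name for name in ranking if name in remaining), None)
            if top is not None:
                tally[top] += count
        rounds.append(dict(tally))
        leader = max(tally, key=tally.get)
        # Stop when the leader holds a majority of the ballots still counting.
        if 2 * tally[leader] > sum(tally.values()) or len(remaining) == 1:
            return leader, rounds
        # Otherwise eliminate the last-place candidate and count again.
        remaining.remove(min(tally, key=tally.get))

first_election = [
    (39, ("Andrea", "Belinda", "Cynthia")),
    (35, ("Belinda", "Cynthia", "Andrea")),
    (26, ("Cynthia", "Andrea", "Belinda")),
]
# Ten Belinda voters now rank Andrea first; everything else is unchanged.
second_election = [
    (49, ("Andrea", "Belinda", "Cynthia")),
    (25, ("Belinda", "Cynthia", "Andrea")),
    (26, ("Cynthia", "Andrea", "Belinda")),
]

winner_1, rounds_1 = irv_winner(first_election)
winner_2, rounds_2 = irv_winner(second_election)
print(winner_1, rounds_1[-1])  # Andrea {'Andrea': 65, 'Belinda': 35}
print(winner_2, rounds_2[-1])  # Cynthia {'Andrea': 49, 'Cynthia': 51}
```

The only difference between the two runs is which candidate finishes last in the first round, and therefore which opponent Andrea meets in the final count.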

In order to emphasize the appearance of a paradox, criticism of IRV based on non-monotonicity is frequently presented in a misleading way, along the following lines: “Having more voters rank candidate Andrea first can cause Andrea to switch from being a winner to being a loser.” This is not correct, however. It is not the fact that Andrea gets more votes that causes her to lose. In fact, getting more first preferences, by itself, can never cause a candidate to lose with IRV. With regard to additional voters casting votes that rank Andrea as the top choice, IRV is indeed monotonic.

The actual cause of a non-monotonic flip with IRV is the shift of support among other candidates (the decline in support for candidate Belinda in the Wikipedia example above), which changes which candidate Andrea faces in the final match-up. The fact that those ten voters shifted to Andrea was irrelevant, and did not cause Andrea to lose. The result would have been the same if those voters had shifted their votes to a fourth candidate (without their ballots later transferring back to Belinda) or had not voted at all.
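This claim can be checked by re-running the count with the ten ballots removed, or given to a fourth candidate. The sketch below rests on assumptions that are not in the original example: the fourth candidate "Dana" is hypothetical, the helper `irv_winner` is an illustrative name, and ties are not handled. It also includes one contrasting case in which the ten ballots list Belinda as their next choice, so that her support is effectively restored.

```python
def irv_winner(ballots):
    """Return the IRV winner for (count, ranking) ballots. Ties are not handled."""
    remaining = sorted({name for _, ranking in ballots for name in ranking})
    while True:
        tally = {name: 0 for name in remaining}
        for count, ranking in ballots:
            top = next((n for n in ranking if n in remaining), None)
            if top is not None:  # ballots with no remaining choice are exhausted
                tally[top] += count
        leader = max(tally, key=tally.get)
        if 2 * tally[leader] > sum(tally.values()) or len(remaining) == 1:
            return leader
        remaining.remove(min(tally, key=tally.get))

# The second election without the ten ballots that left Belinda.
base = [
    (39, ("Andrea", "Belinda", "Cynthia")),
    (25, ("Belinda", "Cynthia", "Andrea")),
    (26, ("Cynthia", "Andrea", "Belinda")),
]

abstain = base                                   # the ten voters stay home
dana_only = base + [(10, ("Dana",))]             # they rank only hypothetical Dana
dana_then_belinda = base + [
    (10, ("Dana", "Belinda", "Cynthia", "Andrea")),  # Belinda regains them on transfer
]

print(irv_winner(abstain))            # Cynthia
print(irv_winner(dana_only))          # Cynthia
print(irv_winner(dana_then_belinda))  # Andrea
```

In each case the outcome tracks Belinda's support, not Andrea's: whenever Belinda ends up with only 25 ballots she is eliminated first and Cynthia wins, and when the ten ballots transfer back to Belinda, Cynthia is eliminated first and Andrea wins.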

This difference is quite important. Stated accurately, the objection is less persuasive and certainly sounds a lot less paradoxical. The failure now becomes: “If support for other candidates shifts so that candidate Andrea faces a stronger opponent in the final runoff, Andrea could switch from being a winner to being a loser.” Indeed, it is this “paradox” that is often the basis for primary election campaigns in our system in which a candidate makes the claim of “electability.” Essentially that candidate is saying, “You might like my primary opponent better, but I am a stronger general election candidate.”

Note how this example illustrates an important point about hypothetical voting examples concocted to demonstrate pathologies. They are often extremely unrealistic, which can be lost in a blizzard of A’s, B’s and C’s. In this case, in order to switch from Belinda to Andrea, 10 voters have to skip over their original second choice, Cynthia, in favor of their original last choice. And this has to happen without any other changes taking place in the electorate. How often is this going to happen in real elections?  

What does this mean, and is monotonicity significant in real world elections?

As for the frequency of non-monotonicity in real-world elections, there is no evidence that it has ever played a role in any IRV election: not in the IRV presidential elections in Ireland, not in the literally thousands of hotly contested IRV federal elections that have taken place for generations in Australia, and not in any of the IRV elections in the United States.

True, in theory, in a close election, if enough supporters of candidate A knew enough about the likely rankings of other voters, they could, in some rare situations, vote strategically as follows: instead of ranking their true favorite as number one, they could give that first ranking to the weaker of the two likely opponents in the likely final match-up with A, in hopes of helping their favorite candidate win in the final runoff tally. Indeed, you can see this happen in traditional runoff systems or in “open primary” systems. Consider Rush Limbaugh’s “Operation Chaos” strategy in the 2008 Democratic presidential nomination contest, in which he urged his conservative radio listeners to vote in the Democratic primary for Hillary Clinton, secure in his knowledge that John McCain was already assured of receiving the Republican nomination.

But this scenario is far-fetched in IRV elections for a number of reasons. First, it is a tremendously risky venture: if too many voters follow the strategy, it could seriously backfire and cause the favored candidate to be eliminated before the final runoff is reached, or to lose in the final runoff. Unlike in the Limbaugh strategy in the 2008 Democratic primary, one’s true first choice isn’t guaranteed a spot in the final pairing without real support. With IRV, voters don’t get to switch their first choice between rounds, so lack of monotonicity is less significant than with two-round runoff elections, which also fail the criterion. Second, the strategy would require a substantial amount of reliable information about the likely first and alternate rankings of other voters, information that will not be easy to obtain, and certainly not in a form that would be likely to govern voting decisions. Third, the strategy is counter-intuitive. Taken together, these facts make its use extremely unlikely.

The “later-no-harm” criterion is far more important than monotonicity because, unlike the monotonicity failure, it has direct strategic consequences. In a nutshell, a voting method fails the later-no-harm criterion if there is a risk that by indicating a second choice in any way (a ranking, as in the Borda count or Bucklin system; another vote, as in approval voting; or points, as in range voting), a voter might help defeat his or her first choice. This criterion has serious real-world implications, as there is substantial evidence that failing it leads some voters to indicate only their favorite choice under such methods as approval voting, Bucklin, Borda, Condorcet and range voting. Even worse, perhaps, would be if many voters grasped the strategic value of such “bullet voting” and many others didn’t, thereby giving insincere tactical voters a big advantage over sincere voters casting ballots as the instructions suggest they should. Thus all of the mathematical niceties of these other methods go out the window when voters refuse to play along rather than risk hurting their first choice.
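A later-no-harm failure under approval voting can be sketched as follows. The candidates, voter counts and the helper name `approval_winner` are all invented for illustration. The only thing that changes between the two tallies is whether a group of ten voters who prefer A also indicates its second choice, B.

```python
from collections import Counter

def approval_winner(ballots):
    """Tally (count, approved set) ballots; return the winner and the totals."""
    totals = Counter()
    for count, approved in ballots:
        for name in approved:
            totals[name] += count
    return max(totals, key=totals.get), dict(totals)

# Hypothetical rest of the electorate: 40 approve only A, 45 approve only B.
others = [(40, {"A"}), (45, {"B"})]

# Ten voters whose favorite is A and whose second choice is B.
bullet_vote = others + [(10, {"A"})]         # they approve only their favorite
second_choice = others + [(10, {"A", "B"})]  # they also approve their second choice

print(approval_winner(bullet_vote))    # A wins, 50 approvals to 45
print(approval_winner(second_choice))  # B wins, 55 approvals to 50
```

By sincerely indicating a second choice, the ten voters defeat their own first choice, which is the incentive to “bullet vote” described above.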

Monotonicity has little, if any, real-world impact, and voting methods that satisfy that criterion tend to fail the majority and later-no-harm criteria, failures that can dramatically affect voting behavior and produce what most would consider undemocratic outcomes.