In parliamentary democracies, all voters are equal, but some carry more weight than others. On the one hand, sizeable numbers of votes are lost in winner-takes-all procedures, for example, in the British district system, under which the Liberal Democrats got over 23% of the popular vote but only 9% of the seats in the 2010 parliamentary elections. Electoral thresholds for other parliaments result in null weights for votes cast for small parties. On the other hand, even in an almost perfectly proportional parliament like the Dutch, coalition formation retrospectively weakens votes for the parties that end up in the opposition. Thus implicitly but effectively, votes are weighted.
The author presents a rationale (Section 2) and a general model (Section 3) for explicitly but less crudely weighting votes in parliamentary and related elections. Strict proportionality (equal weights) and winner-takes-all (dichotomous weights) appear to be special cases in the model, which allows for more flexible and elegant solutions than do the customary uneasy combinations of these two special cases. The author first applies the model to the single-vote procedure. However, the model appears to develop its full potential in a dual ballot structure, by which voters may nominate a coalition partner in addition to their party of choice.
2. A Rationale for Weighting Votes
2.1. Conceptions of Voters
Weighting votes, and in particular discarding them, on the way to majority or plurality formation is often contrasted with proportional representation, that is, unit weighting (see, e.g., Shugart, 2001). Invariably, however, the votes that are annihilated or weakened are statistical minority votes, which by definition are less representative of the aggregate vote. Therefore, differential weighting in fact enhances representativeness. Any objections to the practice or the principle of weighting are thus objections to enhancing representativeness. Without pretending to exhaust the issue, I analyze whether explicit weighting can be justified.
Null weights may be assigned a priori on a juridical basis, as to minors, recent immigrants, and other categories of people, like women and slaves in the past. However, this is now a marginal affair. In a constitutional conception of democracy, which emphasizes the maintenance of human rights, one might wish to discount the votes of angry masses, but there is no way to do so in the context of universal suffrage. In the extreme, a party may be outlawed, but its potential voters cannot. Conversely, it is not possible to overrepresent a right-minded elite (if such a thing exists) among the electorate. The juridical perspective cannot provide a justification for the practice of weighting votes a posteriori, once they have been cast.
A justification of weighting can be based on an intersubjective rather than an individual conception of voting. The formal roots of that perspective may be traced to psychometric theory (Spearman, 1910; Brown, 1910). In an intersubjective interpretation, the citizens’ task is to actively represent the electorate rather than just having themselves passively represented. Votes are thus conceived as evaluations (Hofstee, 2009) of party programs and candidates, against the background of their past performance. I briefly set out a general evaluation script (Hofstee, 1999).
2.2. An Evaluation Script
Empirically, individual evaluations tend to show positive but low mutual agreement, indicating that they make some contribution to the common component but are far from fully representative. Therefore, multiple judges, in the shape of a committee or panel, are needed to ensure an acceptable level of representativeness of their common component vis-à-vis the latent criterion, through canceling out the individual components. Somewhat confusingly, the multiple-judges mechanism is sometimes applied in estimating or predicting matters of fact rather than value, as in predicting the factual outcome of an election (e.g., Hofstee & Schaapman, 1990) or similar playful applications. It has thus gained some reputation as a truth-tracking device. Its proper place, however, is in matters of value, in which objective criteria are absent. Particularly, it is inappropriate to carry the confusion to the point at which the common component of evaluations is assigned truth status.
The representativeness of an individual judgment, in its turn, may be estimated by taking the common component of the panel judgments as a proxy to the latent criterion, assuming that the panel as a whole is sufficiently representative (that is, sufficiently large and unbiased). To maximize the common panel component and thereby the representativeness of the panel judgment, the individual judgments should be weighted according to their agreement with the common panel component. The weighting is thus recursive (for demonstrations, see below), as the individual judgment is part of the panel judgment. In face-to-face or other interactive settings ranging from conference calls to chat boxes, optimization may be sought through deliberation; however, such group processes tend to introduce other factors than quality.
2.3. Reservations Regarding Representativeness
The next section sets out a model based on the conception of voters as evaluators of the quality of parties in the context of parliamentary and related elections. The model simulates and refines the practice of weighting votes. It bridges the opposition between proportional and majoritarian systems. It provides a foundation for a point system, by which dual votes can be cast. In presenting the model, the author does not assume that all political voters actually behave like detached evaluators of the political quality of programs, candidates, party records, and the like, although some probably do. The preliminary question here is whether that conception is normatively defensible, so that a voting system that promotes it would be in order.
The problem is aptly phrased by asking what it means to say that “the voter is always right”. In the subjective conception, there is no such thing; against that background, the saying can only be understood as a shrugging admission that the voter, like the consumer, cannot be called wrong. In the intersubjective conception, individual voters are mainly wrong, as their contribution to the common component tends to be quite modest. However, the supposition is that “the voter”, as a collective noun, is right enough to provide a standard against which individual votes may be weighted. Is that supposition acceptable? Can the collective voter be trusted as a point of reference for the individual?
The latter question is not as rhetorical as universal suffrage would suggest. The electorate here and now constitutes just a sample of humanity, and a biased one at that, even if it comprises the whole “population” of a national citizenry. History may prove it wrong by reasonable standards, and foreign contemporaries may rightly be appalled at the outcome of a particular election. These wider diachronic and global definitions of humanity come to the fore, most notably, when elections would lead to infringements upon human and civil rights, for example, through discrimination.
In the evaluation perspective, one should not appeal to strong idealistic assumptions like the existence of a unitary volonté générale of the people (which Rousseau, 1762, would not himself apply to the situation of parliamentary democracy, but rather limit to bands of peasants regulating the affairs of state under an oak tree). For civil values like safety and privacy, solidarity and liberty, national and global citizenship, and so on, form a mixed bag with strained mutual relations, attracting different temperaments. Democracy may be conceived as a precarious balance between such values, which comes about in a continuous dynamic process of trial and error. The idea of some knowable truth about society has mainly functioned as food for fundamentalists and their followers; the mission of democracy is rather to prevent its dark history from repeating itself. The democratic logic consists of more mundane expressions like least of evils, trust as the absence of mistrust, and double negatives in general.
So, the question should not be whether the citizenry has grown to full democratic maturity; rather, the phrasing should be whether the average voter is sufficiently representative of humanity to function as a criterion, so that representativeness is enhanced through weighting votes against that standard. At least for post-war Western democracies, an affirmative answer seems in order. Weighting according to representativeness is accepted practice; universal suffrage is undisputed; dubious political movements are contained by cordons sanitaires (which, of course, amount to implicit discounting of votes) or attempts at pacification through subsumption in the regular political process; any elitist alternatives, meant to save democracy by suspending it, meet with broad opposition amongst the elites themselves. Contemporary Western democracies show a sufficient absence of distrust about representativeness.
One may object that the surge of social media is turning contemporary societies into a dystopic version of deliberative democracy, to the benefit of populist parties with a problematic relation to the constitution. However, on the reasonable assumption that the angry masses will not reach majority, there is all the more reason to stimulate and reward representativeness. In any event, the objection is invalid, for, in the theoretical case in which the assumption would not be fulfilled, there is no alternative democratic solution at all.
A remaining question is whether society has the right to approach the voter as an official whose task it is to represent the electorate, rather than express subjective preferences. On the one hand, the right to discount or even discard unrepresentative votes is accepted in practice. On the other, any sanctions against the citizen for statistically unrepresentative voting would be impossible, if only because of privacy in the booth; also, serious voting is generally not regarded as an enforceable civil duty, like paying taxes. Thus people can only be nudged in the direction of representative voting, particularly through weighting. From an evolutionary-democratic point of view, one might add that political diversity and dissent are indispensable in the development of democracy.
3. The Power Model
3.1. A Hypothetical Example
An elementary example of recursive weighting starts with the outcome of the 2010 election of the Dutch parliament. The n = 1 row in Table 1 gives the way in which the proportions of votes were distributed (or scattered) among the 10 parties that gained any seats at all. Take the proportion of votes gained by a party as an indicator of that party’s representativeness, and therefore as a measure of the representativeness of those who voted for that party. Retrospectively weighting all individual votes according to that indicator amounts to squaring the proportions and dividing by the sum of squares. The resulting “meta-representative” distribution is given in the n = 2 row. The effect of squaring is concentration of political power. Note also that the two smallest parties would no longer have reached the .0067 electoral threshold in this 150-seat parliament, making crude measures like the fixed 5% threshold for the German Bundestag and other parliaments superfluous to some extent.
Table 1. Retrospective simulation of an election outcome.
Note: VVD = People’s Party for Freedom and Democracy, PvdA = Labor Party, PVV = Party for Freedom, CDA = Christian Democratic Appeal, SP = Socialist Party, D66 = Democrats 66, GL = Green Left, CU = Christian Union, SGP = Reformed Political Party, PvdD = Party for the Animals.
The quadratic rule may be generalized into a superordinate power model, which consists of raising the obtained proportions of votes to the nth power and renormalizing. Integer values of n may be arrived at through an iterative reasoning, according to which votes should next be weighted by the squared proportions, giving n = 3, and so on ad infinitum. At the meta-representative extreme, n = ∞ leaves only the largest party (however small), and thus simulates the winner-takes-all or plurality formula. Unweighted aggregation is expressed by n = 1. Exponents 0 < n < 1 would have the effect of flattening the distribution and empowering minorities; with n = 0, each party would obtain the same number of seats, making elections futile and reflecting the idea of liberal anarchy. (In the Netherlands, a familiar dictum is that voters will not be satisfied until each has founded their own party.) Still, “inversely representative” formulas might be worth considering for correcting highly skewed distributions over parties. Negative exponents would invert the rank order of the parties; at the negative extreme of n = −∞, all seats would be assigned to the smallest party, thereby mirroring Nietzsche’s inverse-democratic aphorism saying that any consensus of the people can only concern a folly. However, for practical purposes, the discussion may be limited to positive real numbers. The power model thus writes the proportional and winner-takes-all rules as mere instances of a superordinate power continuum, under one and the same rationale of representativeness.
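The power rule can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the function name `power_shares` and the four vote proportions are assumptions made for the example and are not the Table 1 data.

```python
def power_shares(proportions, n):
    """Raise each vote proportion to the nth power and renormalise to sum to 1."""
    powered = [p ** n for p in proportions]
    total = sum(powered)
    return [x / total for x in powered]


# Hypothetical vote proportions for four parties (NOT the Table 1 figures).
votes = [0.40, 0.30, 0.20, 0.10]

proportional = power_shares(votes, 1)     # n = 1: unweighted aggregation
quadratic = power_shares(votes, 2)        # n = 2: squares divided by sum of squares
flattened = power_shares(votes, 0)        # n = 0: every party gets the same share
near_plurality = power_shares(votes, 50)  # large n approximates winner-takes-all

for label, shares in [("n=1", proportional), ("n=2", quadratic),
                      ("n=0", flattened), ("n=50", near_plurality)]:
    print(label, [round(s, 3) for s in shares])
```

With these illustrative numbers, the largest party moves from .40 of the seats at n = 1 to about .53 at n = 2, while the smallest drops from .10 to about .03.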
3.2. Feed-Forward Effects
Retrospective simulations such as that in Table 1 are to a large extent unrealistic. Had the voters known in advance that their votes were going to be explicitly weighted according to representativeness, they might have voted more representatively, because they might wish to maximize their impact on the outcome, or would come to subscribe to the logic of democratic citizenship that is symbolized by the weighting rule, or both. So the next question when evaluating the consequences of a voting system is what would happen if more voters, and ultimately all voters, behaved accordingly. The purpose of that analysis is not to assert that they would, but to test the viability of the system.
For any n > 1, additional voters would be expected to support the larger parties; the smaller the party they would have supported otherwise, the greater the expected gain if they back the winner, so the global effect would be roughly equivalent to further raising the n parameter. If all voters let themselves be nudged by the system, and if the polls (which would also be subject to the booster effect) indicated a clear winner, the ultimate effect would be identical to raising n to infinity, thus filling the whole parliament with representatives of the winning party and doing away with parliamentary opposition altogether. That might be found too much of a good thing even by those who deplore the fragmented state of many contemporary parliaments.
Next, a probably more pervasive feed-forward effect of weighting concerns the political parties. For n = 1, it is not very profitable for adjacent parties on the political spectrum to form an alliance or merger. Going together tends not to deliver many extra seats, as the sum of voters for the adjacent parties in question tends to be more or less constant. However, for n > 1, the gain is automatic, simply because then $(x + y)^n > x^n + y^n$ for positive x and y. Weighting votes for representativeness thus unequivocally stimulates the forming of mergers and alliances. One plausible end state would be a two-party constellation, Anglo-Saxon style, which is generally but undeservedly seen as a consequence of the district system. Another, much less favored by many, would be a one-party democracy.
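The merger incentive can be made concrete with a small Python sketch. It assumes, as the paragraph does, that the merged party simply inherits the combined vote proportion of its components; the helper names and the vote proportions are hypothetical.

```python
def power_shares(proportions, n):
    """Raise each vote proportion to the nth power and renormalise to sum to 1."""
    powered = [p ** n for p in proportions]
    total = sum(powered)
    return [x / total for x in powered]


def merger_gain(proportions, i, j, n):
    """Seat-share gain for parties i and j if they merge, with vote totals unchanged."""
    separate = power_shares(proportions, n)
    share_apart = separate[i] + separate[j]
    # Assumption: the merged party receives exactly the sum of both vote proportions.
    merged = [p for k, p in enumerate(proportions) if k not in (i, j)]
    merged.append(proportions[i] + proportions[j])
    share_together = power_shares(merged, n)[-1]
    return share_together - share_apart


# Hypothetical vote proportions; the two smallest parties consider merging.
votes = [0.40, 0.30, 0.20, 0.10]
gain_linear = merger_gain(votes, 2, 3, n=1)     # no gain under proportionality
gain_quadratic = merger_gain(votes, 2, 3, n=2)  # positive gain under squaring
print(round(gain_linear, 3), round(gain_quadratic, 3))
```

In this example the two small parties hold about .17 of the seats separately at n = 2, and about .26 after merging.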
The way to contain such positive-feedback loops is to equip the power model with a thermostat-like construction. It consists of deciding on a policy parameter P that defines a desirable outcome, as in setting the room temperature. The desired outcome could be a parliament in which one party or alliance of parties has a majority but not an overwhelming one, say, 50% < P < 60%. The power parameter n thus becomes flexible (and fractional), as it depends on the outcome of the election. If the largest party or alliance receives less than 50% of the votes, n > 1; if more than 60%, n < 1; so, n also functions as an air conditioner, in case the electorate would get overheated by the representativeness logic. Decreasing n would not help in the limiting case of all voters supporting one party, but one can assume that oppositional mechanisms would prevent this case from arising. The 50% < P < 60% formula would fit the British tradition, which abhors the embarrassment of a “hung parliament”, continental style, in which no party has a majority. The British district system would be incompatible with the power model, as the distribution of seats over parties is not based on the popular vote (as it is, for example, in Germany), but that system would be superfluous for the purpose, apart from its proven insufficiency in the 2010 British elections.
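One way to sketch the thermostat in Python is a bisection search for the exponent n. The text specifies only a band for P; aiming at a single target value inside the band (here .55), the search interval, and the function names are assumptions of this sketch. It relies on the share of a strictly largest party increasing monotonically with n.

```python
def power_shares(proportions, n):
    """Raise each vote proportion to the nth power and renormalise to sum to 1."""
    powered = [p ** n for p in proportions]
    total = sum(powered)
    return [x / total for x in powered]


def thermostat_exponent(proportions, target=0.55, lo=0.0, hi=64.0, tol=1e-9):
    """Find n such that the largest party's seat share equals `target` (bisection)."""
    def largest(n):
        return max(power_shares(proportions, n))

    if not (largest(lo) <= target <= largest(hi)):
        raise ValueError("target share not reachable for this vote distribution")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if largest(mid) < target:
            lo = mid   # share too small: raise the exponent
        else:
            hi = mid   # share too large: lower the exponent
    return (lo + hi) / 2


# Hypothetical distributions: a fragmented one and an "overheated" one.
fragmented = [0.40, 0.30, 0.20, 0.10]
overheated = [0.70, 0.20, 0.10]
n_fragmented = thermostat_exponent(fragmented)  # expected to exceed 1
n_overheated = thermostat_exponent(overheated)  # expected to fall below 1
print(round(n_fragmented, 3), round(n_overheated, 3))
```

The fragmented case yields an exponent above 1 (heating), the overheated case an exponent below 1 (cooling), in line with the air-conditioner reading of n.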
For continental and other similar situations, even a less radical formula would elicit major changes, in a direction that is looked upon favorably by many political theorists. It would reverse the process of increasing parliamentary fragmentation by which, for example, the two largest parties in the Dutch parliament together occupied just 41% of the seats in 2010, as opposed to 68% in 1977. It would counteract the untransparent process of protracted post-election coalition formations, which tend to be frustrating to everyone involved, from the individual citizen and delegate up to the head of state. Notably, it would do so without effacing minor parties, as in the Anglo-Saxon system: as more voters would come to concentrate on the major parties, n would become smaller, so that a principled minority would not undergo much reduction in seats, and would even get boosted in the overheated situation. The power model thus integrates the Anglo-Saxon majoritarian or Westminster model and the continental consociational or consensus model of parliamentary democracy (see Thomassen, 2010). Evidently, the model does not deal with the personal filling-in of the seats, particularly as a consequence of quotas for districts, sexes, minorities, and whatever else; it only assumes that such factors do not contaminate the primary distribution of seats over parties.
By virtue of its own logic, the pretension of the power model is that it would pass in parliament. One impediment could be the kind of political wisdom that opposes radical changes, for example, increasing the size of the largest delegation in the Dutch parliament from 21% to 50%. A lower P value, for example, 33% < P < 50%, should solve that problem. Another obstacle may be found in the arithmetic with fractional powers needed for its implementation: opponents might argue that these are too difficult to understand for the voter. The answer, of course, is that only the rationale and the effects of the system need to be clear, in the way one does not even aspire to look under the hood of other machinery. Nonetheless, the argument tends to have rhetorical impact, so that a simplified approximation to the model might be preferred, for example, giving a bonus of Q seats to the largest party or alliance, and dividing up the remainder proportionally among the parties that score above a moderate electoral threshold. French municipalities with over 3500 inhabitants apply an extreme version of this formula, with Q = 50% of the seats and an electoral threshold of 5% (see, www.legifrance.gouv.fr). A more elegant simplification would be to give each party a bonus proportional to the number of votes cast for that party.
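The bonus approximation can be sketched as follows. The sketch works with seat fractions rather than whole seats, lets the largest party share in the proportional remainder as well, and leaves rounding to integer seats aside; these choices, the function name, and the vote proportions are assumptions for illustration only.

```python
def bonus_allocation(proportions, bonus=0.5, threshold=0.05):
    """Give a seat bonus to the largest party; split the rest proportionally
    among parties at or above the electoral threshold."""
    eligible = [p if p >= threshold else 0.0 for p in proportions]
    total = sum(eligible)
    shares = [(1 - bonus) * p / total for p in eligible]
    winner = max(range(len(proportions)), key=lambda i: proportions[i])
    shares[winner] += bonus
    return shares


# Hypothetical vote proportions; the last party falls below the 5% threshold.
votes = [0.40, 0.30, 0.20, 0.06, 0.04]
shares = bonus_allocation(votes, bonus=0.5, threshold=0.05)
print([round(s, 3) for s in shares])
```

With Q = 50% and a 5% threshold, the largest party in this example ends up with about .71 of the seats on .40 of the votes, which illustrates why the text calls this version extreme.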
A final objection might be that the power model invites strategic political behavior by the voter, usually denounced as “insincere” voting in the literature. The assumption is that the voter has a sincere or true subjective political preference, and that the quality of society is served by eliciting that metaphysical parameter in the polling booth. In the intersubjective conception, on the contrary, the voter is a participant in a social game, in which one’s behavior is as much a function of others’ expected behaviors. This conception is eminently realistic in view of actual strategic voting, particularly in favor of larger parties, and in view of the massive interest in polls preceding elections. It is also difficult to see why strategic behavior is insincere in an ethical sense.
3.3. Multiple Voting
A generalization of the single-ballot structure is to spread more votes or points over candidates. One class of examples is range voting, whereby the voter grades each candidate, or in the present context, each party; a special case is approval voting using a dichotomous scale. Another, more problematic, class is ranking, complete or partial, with or without ties. Classical multiple voting, in the parliamentary context, implies a linear (n = 1) scoring rule: grades or ranks per party are the unweighted means of the voters’ grades or ranks.
A good reason to consider multiple voting is that the voter, rather than the party leadership, would decide actively on the (relative) alliances among parties. That mechanism would take the place of the electoral alliance formula in the single-vote structure, which is frowned upon because it may lead to opportunism, and is even outlawed in some electoral systems. It would also alleviate the pressure to amalgamate parties. Moreover, a differentiated voting structure permits a closer approximation to the individual prediction of the collective vote, as in the intersubjective conception, or to the subjective preference profile if one wishes.
However, unweighted averaging of multiple votes has predictable effects. If A is the party of your first choice, you should realize that any positive grading of other parties reduces the number of seats for A in this fixed-sum game, the sum being the total number of seats. That would be all right if all other voters would reciprocate your balanced judgment, but you cannot expect that. For one thing, campaign teams (of other parties, of course) will stress the need to bury all competitors. For another, especially supporters of parties at the edges of the political spectrum may be expected to engage in burying, as their average distance to other parties is larger by definition; the effect would be overrepresentation of such relatively unrepresentative parties, and further fragmentation. Given these considerations, your best strategy, however reluctantly, would be to give a maximal grading to the party of your first choice, and null gradings to all others. Thus in a rational electorate, multiple voting would sooner or later degenerate into the single-vote system. In this sense, taking the unweighted average is an improper scoring rule.
A nonlinear scoring rule is the central ingredient in the Majority Judgment system developed by Balinski and Laraki (2007). It consists of taking the median rather than the mean of the gradings received by a candidate. The authors prove this rule to be superior in eliciting “sincere” voting. However, it is difficult to see how it could restrain rational voters from burying political opponents and close competitors (an attitude dismissed as “crankiness” by Balinski and Laraki) in the multi-party situation, so that the system would still degenerate into single voting. An added risk is that the median score for all parties could well be zero, resulting in an empty parliament.
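The empty-parliament risk is easy to reproduce. The sketch below uses a plain median as a simplified stand-in for the Majority Judgment rule, and a hypothetical electorate of ten voters who each give the top grade to one party and bury all others; party labels and grades are assumptions.

```python
from statistics import median

# Hypothetical electorate: first choices of ten voters over three parties.
first_choices = ["A"] * 4 + ["B"] * 3 + ["C"] * 3
parties = ["A", "B", "C"]

# Each voter grades the first choice 2 and buries every other party with 0.
ballots = [{p: (2 if p == choice else 0) for p in parties}
           for choice in first_choices]

medians = {p: median(b[p] for b in ballots) for p in parties}
means = {p: sum(b[p] for b in ballots) / len(ballots) for p in parties}
print(medians)  # every party has median grade 0
print(means)    # the means still separate the parties
```

As long as no party is the first choice of more than half the voters, every median is zero, whereas the mean still orders the parties.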
To explore whether the power model can save multiple voting from degenerating into the single-vote system, consider a ballot structure in which every voter is given the opportunity to nominate one coalition partner, giving 2 points to first choice and 1 to second. This is probably the most realistic and practical multiple-vote system. It is not assumed that every voter uses the opportunity to give a second vote, as obligatory voting is unrealistic altogether. Linear averaging of these dual votes should lead to the burying of all possible coalition partners, thus the demonstration below is proleptic: it assumes that the voters have anticipated power iteration. The exposition is informal; however, the underlying algebra is easy to trace (e.g., Horst, 1963 ).
Take the following crude and necessarily fictive but not entirely unrealistic example with 5 parties (Table 2, columns) and 9 types of voters (rows), two of which occur twice; each of the 9 rows A to I stands for 1/9th of the electorate. Parties are left-wing (LL), center left (CL), center right (CR), right-wing (RR), and populist right (PR).
The supporters A of LL give their second vote to CL; B and C reciprocate by indicating LL as a coalition partner. Supporters D and E of CR are split between leaning left and right. RR supporters F veer towards CR, as do the PR supporters G. However, H and I have other things in mind than coalition formation (for an example, think of a populist movement that derives its thrust from a fight against “old” politics). Conversely, there are no secondary votes by supporters of other parties in the PR column, which is probably also realistic.
Applying the power model (see Table 3) starts with taking the unweighted (n = 1) column sums of the votes and dividing them by their grand total, resulting in seat proportions per party as in row p1 in Table 3. The next step is to apply these pro- portions as weights to the votes in each row. For example, A gets a representativeness raw score of .16 × 2 + .24 × 1 = .56. Dividing these scores by their sum gives the voter weights in column w2. To get the party proportions for n = 2, apply these weights w2 to the parties; for example, LL obtains a weighted raw score of .11 × 2 + .12 × 1 + .12 × 1 = .46, which translates into a seat proportion p2 of .16.
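The first iteration step can be traced in Python. The vote matrix below is an assumption: it is reconstructed from the verbal description of the nine voter types (2 points for the first choice, 1 for the nominated partner) and has been checked only against the figures quoted in the text (.16, .24, .56, .46). All names are illustrative.

```python
PARTIES = ["LL", "CL", "CR", "RR", "PR"]

# Assumed reconstruction of the dual-vote matrix from the verbal description.
VOTES = {
    "A": [2, 1, 0, 0, 0],
    "B": [1, 2, 0, 0, 0],
    "C": [1, 2, 0, 0, 0],
    "D": [0, 1, 2, 0, 0],
    "E": [0, 0, 2, 1, 0],
    "F": [0, 0, 1, 2, 0],
    "G": [0, 0, 1, 0, 2],
    "H": [0, 0, 0, 0, 2],
    "I": [0, 0, 0, 0, 2],
}


def normalise(values):
    """Scale a list of non-negative numbers so that it sums to 1."""
    total = sum(values)
    return [v / total for v in values]


def party_shares(rows, weights):
    """Weighted column sums of the vote matrix, normalised to proportions."""
    k = len(rows[0])
    return normalise([sum(w * row[j] for w, row in zip(weights, rows))
                      for j in range(k)])


rows = list(VOTES.values())
p1 = party_shares(rows, [1.0] * len(rows))                       # n = 1
raw_scores = [sum(v * p for v, p in zip(row, p1)) for row in rows]
w2 = normalise(raw_scores)                                       # voter weights
p2 = party_shares(rows, w2)                                      # n = 2

print([round(p, 2) for p in p1])
print([round(w, 2) for w in w2])
print([round(p, 3) for p in p2])
```

Under this reconstruction, voter A has a raw score of .56 and a weight of about .11, and LL obtains a seat proportion of .16 at n = 2, matching the worked figures.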
Under the single-vote system, further iteration would ultimately (for n = ∞) lead to the largest party occupying all the seats; with multiple voting, it does not. The solution converges to the meta-representative distribution p∞ in Table 3. Larger parties gain in the iteration process, as in single voting, but the additional mechanism is that parties profit from empirical associations with other parties. Most notably, PR, which would have been the largest party in single voting, ends up in fourth place because of its relative lack of coalescence. (The generality of this mechanism follows from the fact that p∞ is proportional to the first eigenvector of the matrix of associations between parties as in Table 4; that vector is a function of both the diagonal cells and the off-diagonal cells. With single voting, all off-diagonal cells would automatically be zero.) Note also that an LL-CL coalition would gain a majority at p∞, by virtue of its tightness relative to the R side of the spectrum.
Table 2. Multiple votes (for explanation, see text).
Table 3. Power iteration (for explanation, see text).
Table 4. Associations between parties (for explanation, see text).
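The convergence to p∞ can be checked with a short power iteration on the association matrix. The vote matrix is again an assumed reconstruction from the verbal description of the nine voter types, and the association matrix is taken to be the cross-product of that matrix with itself; function names are illustrative.

```python
PARTIES = ["LL", "CL", "CR", "RR", "PR"]

# Assumed reconstruction of the dual-vote matrix (rows A to I).
VOTES = [
    [2, 1, 0, 0, 0], [1, 2, 0, 0, 0], [1, 2, 0, 0, 0],
    [0, 1, 2, 0, 0], [0, 0, 2, 1, 0], [0, 0, 1, 2, 0],
    [0, 0, 1, 0, 2], [0, 0, 0, 0, 2], [0, 0, 0, 0, 2],
]


def normalise(values):
    """Scale a list of non-negative numbers so that it sums to 1."""
    total = sum(values)
    return [v / total for v in values]


def association_matrix(votes):
    """Cross-products between party columns (votes transposed times votes)."""
    k = len(votes[0])
    return [[sum(row[i] * row[j] for row in votes) for j in range(k)]
            for i in range(k)]


def power_iteration(matrix, start, tol=1e-12, max_iter=10000):
    """Repeatedly multiply by the matrix and renormalise until shares settle."""
    p = normalise(start)
    for _ in range(max_iter):
        q = normalise([sum(m * x for m, x in zip(row, p)) for row in matrix])
        if max(abs(a - b) for a, b in zip(p, q)) < tol:
            return q
        p = q
    return p


M = association_matrix(VOTES)
p1 = [sum(row[j] for row in VOTES) for j in range(len(PARTIES))]  # column sums
p_inf = power_iteration(M, p1)
ranking = sorted(PARTIES, key=lambda name: -p_inf[PARTIES.index(name)])

print({name: round(p, 3) for name, p in zip(PARTIES, p_inf)})
print(ranking)
```

Under this reconstruction PR finishes fourth and LL and CL together hold just over half of the seats, consistent with the two observations made in the text.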
Thus retrospective power iteration of these dual votes appears to bring the power model to full bloom: not only does it reward representative voting; it also rewards nominating a coalition partner, as that increases the corresponding off-diagonal cells and thus enhances one’s primary choice. It thus favors parties that occupy a central position in the political space, thereby enhancing meta-representativeness; in other terminologies, it constructs political power as a function of the connectedness or communality of the parties in addition to their individual strengths.
Still, the next question is about its feed-forward effects: how would the system work out if voters would adapt to it? In the Table 3 example, the highest-scoring individual profile, against p∞, would be [0, 2, 1, 0, 0]. If all voters would choose this profile, CL would score a 2/3 majority, CR filling the rest of the seats. Again, there would be no way of correcting this outcome, but one may assume that it would not eventuate. For intermediate results, a P parameter would be effective, for example, 50% < P = (pa + pb) < 60%, with P the desired sum of proportions for the two largest parties a and b. In the example above, n = 2 would be the smallest integer value that meets P. Success is not guaranteed for highly scattered vote distributions, but that would be a fact of life.
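The search for the smallest integer n that meets P can be sketched as follows. As before, the vote matrix is an assumed reconstruction from the verbal description of the nine voter types, and the function names are illustrative.

```python
# Assumed reconstruction of the dual-vote matrix (rows A to I; columns LL, CL, CR, RR, PR).
VOTES = [
    [2, 1, 0, 0, 0], [1, 2, 0, 0, 0], [1, 2, 0, 0, 0],
    [0, 1, 2, 0, 0], [0, 0, 2, 1, 0], [0, 0, 1, 2, 0],
    [0, 0, 1, 0, 2], [0, 0, 0, 0, 2], [0, 0, 0, 0, 2],
]


def normalise(values):
    """Scale a list of non-negative numbers so that it sums to 1."""
    total = sum(values)
    return [v / total for v in values]


def next_shares(votes, shares):
    """One iteration: weight voters by agreement with `shares`, then re-aggregate."""
    weights = normalise([sum(v * p for v, p in zip(row, shares)) for row in votes])
    k = len(votes[0])
    return normalise([sum(w * row[j] for w, row in zip(weights, votes))
                      for j in range(k)])


def smallest_exponent(votes, low=0.5, high=0.6, max_n=50):
    """Smallest integer n at which the two largest parties together fall in (low, high)."""
    k = len(votes[0])
    shares = normalise([sum(row[j] for row in votes) for j in range(k)])  # n = 1
    for n in range(1, max_n + 1):
        top_two = sum(sorted(shares, reverse=True)[:2])
        if low < top_two < high:
            return n, shares
        shares = next_shares(votes, shares)
    return None, shares  # no integer n up to max_n meets the band


n_found, shares_found = smallest_exponent(VOTES)
print(n_found, round(sum(sorted(shares_found, reverse=True)[:2]), 3))
```

Under this reconstruction the two largest parties hold .48 of the seats at n = 1 and .512 at n = 2, so the search stops at n = 2.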
On the basis of an intersubjective conception of political voting as official business, in which the citizen should be encouraged to actively represent the population, the author has demonstrated that power iteration applied to the votes matrix does just that, both retrospectively and prospectively. That goes for single voting and for an elementary version of multiple voting, the latter having the added merit of encouraging coalescent voting behavior without needing electoral alliances or party mergers. These results do not necessarily generalize to other multiple voting schemes, for example, a range voting version in which voters could rate any number of coalition partners: it is readily seen that in anticipating power iteration they could maximize their representativeness and impact by giving all other parties the second highest rating, leading to flattening of the seat distribution. The multiple-voting scheme used here employs a curtailed version of the harmonic series. Along those lines, more differentiated schemes might work if the need would be felt.
The author thanks Rudi Andeweg, Jos ten Berge, and Jacques Thomassen for their comments on draft versions of this paper.