PrefVote is designed for multiple programming language implementations using a common test suite. The test suite consists of “black box” test specs used across all languages, plus “white box” unit tests within each language’s source code directory. The test harness collects Test Anything Protocol (TAP) data from each language’s unit tests to report the results.
Current status: The reference implementation is in Perl and takes advantage of TAP (Test Anything Protocol), which originated with the developers of Perl's unit-testing system. PrefVote in Perl has extensive white-box tests, and the black-box tests for all (future) language implementations are also in Perl. The next planned language implementations are Rust and Python.
Numbers in each cell are test cases planned/passed/failed.
| language/set | Core | STV | Schulze | RankedPairs | KR2 | total |
|---|---|---|---|---|---|---|
| Perl whitebox | 525/525/0 | 193/193/0 | 183/183/0 | 139/139/0 | 178/178/0 | 1218/1218/0 |
| Rust whitebox | 𝟬 | 𝟬 | 𝟬 | 𝟬 | 𝟬 | 0/0/0 |
| Perl blackbox | 12467/12467/0 | 14526/14526/0 | 16336/16336/0 | 13848/13848/0 | 49748/49748/0 | 106925/106925/0 |
| Rust blackbox | 𝟬 | 𝟬 | 𝟬 | 𝟬 | 𝟬 | 0/0/0 |
| total | 12992/12992/0 | 14719/14719/0 | 16519/16519/0 | 13987/13987/0 | 49926/49926/0 | 108143/108143/0 |
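As a rough illustration of how planned/passed/failed counts like those above could be derived from TAP output, here is a minimal Python sketch. It is not PrefVote's actual harness: the function names `tally_tap` and `format_cell` are hypothetical, and the parser is deliberately simplified (it reads only the plan line and top-level `ok` / `not ok` lines, and ignores TODO/SKIP directives and indented subtests).

```python
import re

def tally_tap(text):
    """Return (planned, passed, failed) for one TAP stream.

    Simplified on purpose: only the "1..N" plan line and top-level
    "ok" / "not ok" result lines are counted. TODO/SKIP directives
    and indented subtest lines are not interpreted.
    """
    planned = passed = failed = 0
    for line in text.splitlines():
        plan = re.match(r"^1\.\.(\d+)", line)
        if plan:
            planned = int(plan.group(1))
        elif line.startswith("not ok"):
            failed += 1
        elif line.startswith("ok"):
            passed += 1
    return planned, passed, failed

def format_cell(counts):
    """Format counts in the planned/passed/failed style used in the table."""
    return "/".join(str(n) for n in counts)

# Hypothetical TAP output from one test script.
sample = """1..3
ok 1 - ballot parsed
not ok 2 - tie broken
ok 3 - winner reported
"""

print(format_cell(tally_tap(sample)))  # prints 3/2/1
```

Summing such tuples per language and per test set would give the row and column totals.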
Randomly-generated black-box test data files and their run results for each voting method can be found in these overview files. They are generated by the test-overview script, which also calls acr-compare and vote-count. See below for what to watch for in the comparisons between voting methods over each test data set.
Column headings in result overviews are as follows:
To test the KR2 algorithm, the test suite was expanded to position candidates and voters in a 2-axis coordinate system representing two hypothetical political issues. Each axis ranges from -1 to +1. It doesn’t matter what the issues are. It may help to think of them as right (conservative) versus left (liberal) and up (libertarian) versus down (authoritarian), but they could just as easily be dog versus cat and roses versus chocolate preferences. The axes are used only to model how voters would prefer or oppose candidates based on their distance across the political spectrum, applying some mathematical game theory to the random generation of ballots for testing the algorithm.
The diagrams in the KR2 test suite show candidates labeled with letters and unlabeled points for voters. In theory, candidates deeper inside the cluster of voters should be more centrist and end up more preferred, and ones out in the fringes should be more distant and less preferred.
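The distance-based model described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the PrefVote generator: the function names are hypothetical, points are drawn uniformly from the -1 to +1 square (the real suite may cluster voters differently), and every voter is assumed to rank all candidates from nearest to farthest.

```python
import math
import random

def random_point(rng):
    """Pick a position on the two issue axes, each in [-1, +1]."""
    return (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))

def ballot_for(voter, candidates):
    """Rank candidate labels from nearest to farthest from the voter."""
    return sorted(candidates, key=lambda name: math.dist(voter, candidates[name]))

def generate_ballots(num_candidates, num_voters, seed=0):
    """Return (candidate positions, list of ranked ballots).

    Assumption: uniform random placement and complete rankings.
    """
    rng = random.Random(seed)
    labels = [chr(ord("A") + i) for i in range(num_candidates)]
    candidates = {label: random_point(rng) for label in labels}
    voters = [random_point(rng) for _ in range(num_voters)]
    return candidates, [ballot_for(v, candidates) for v in voters]

# A voter at the center prefers the nearby candidate A over the fringe candidate B.
example = ballot_for((0.0, 0.0), {"A": (0.1, 0.0), "B": (0.9, 0.9), "C": (-0.5, 0.0)})
print(example)  # prints ['A', 'C', 'B']
```

With many voters, a candidate near the middle of the voter cloud tends to appear high on most ballots, which is the centrist effect the diagrams are meant to show.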
The things to watch for when looking at the voting methods compared in each data set are the Average Choice Rank (ACR) and Copeland Score. Copeland Score is the generic Condorcet result. The Condorcet methods (Schulze, Ranked Pairs and KR2) will all agree with Copeland when it has a unique result. When Copeland has a tie (which means a Condorcet Paradox), the Condorcet methods may break ties differently. All the algorithms are shown with and without ACR for tie-breaking, except KR2 which always uses ACR as a tie-breaker. This comparison also makes it evident how often Single Transferable Vote (STV) fails to pick the Condorcet Winner and how quickly its result order diverges after that.
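To make the two comparison measures concrete, the following Python sketch computes them for a handful of ballots. It rests on assumptions that may not match PrefVote's own definitions: ballots are complete rankings with no ties, ACR is taken as the mean position (1 = first choice) a candidate receives, and the Copeland score is taken as pairwise wins minus pairwise losses, which is one common convention. All function names are hypothetical.

```python
from itertools import combinations

def average_choice_rank(ballots):
    """Mean ballot position per candidate (1 = first choice); lower is better."""
    totals = {c: 0 for c in ballots[0]}
    for ballot in ballots:
        for position, c in enumerate(ballot, start=1):
            totals[c] += position
    return {c: total / len(ballots) for c, total in totals.items()}

def copeland_scores(ballots):
    """Pairwise wins minus pairwise losses for each candidate (assumed convention)."""
    candidates = sorted(ballots[0])
    scores = {c: 0 for c in candidates}
    for a, b in combinations(candidates, 2):
        a_over_b = sum(1 for ballot in ballots if ballot.index(a) < ballot.index(b))
        b_over_a = len(ballots) - a_over_b
        if a_over_b > b_over_a:
            scores[a] += 1
            scores[b] -= 1
        elif b_over_a > a_over_b:
            scores[b] += 1
            scores[a] -= 1
    return scores

def condorcet_winner(ballots):
    """Return the candidate who beats every other one pairwise, or None."""
    scores = copeland_scores(ballots)
    for c, s in scores.items():
        if s == len(scores) - 1:
            return c
    return None

clear = [["A", "B", "C"], ["A", "C", "B"], ["B", "A", "C"]]
cycle = [["A", "B", "C"], ["B", "C", "A"], ["C", "A", "B"]]

print(copeland_scores(clear), condorcet_winner(clear))  # prints {'A': 2, 'B': 0, 'C': -2} A
print(copeland_scores(cycle), condorcet_winner(cycle))  # prints {'A': 0, 'B': 0, 'C': 0} None
```

The second ballot set is a three-way cycle: every Copeland score ties and there is no Condorcet Winner, which is the situation where the Condorcet methods may order candidates differently and a tie-breaker such as ACR comes into play. In this particular cycle the ACR values are also equal, so ACR alone would not resolve it.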