Two Attacks on Commit-Reveal Schemes
Doug Hoyte, June 2024
I recently came across an excellent article, The magic of participatory randomness, that describes from first principles a commitment protocol for collaboratively generating random numbers. Here is the protocol:
- generate a number so large that it is unlikely anyone would ever guess it in your lifetime - 256 bits ought to be sufficient
- take the hash of your number using a cryptographic hash function
- share that hash with other participants in the protocol - hash functions are irreversible, so they won't learn your number
- once everybody has shared their hash, you may now share your original 256 bit number
- check everyone else's numbers to confirm they produce the same hash - mismatches indicate that they cheated
Once you have everybody's numbers and confirm that the commitment scheme was executed honestly, you can xor all the original numbers together to produce 256 fair random bits.
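The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a hardened implementation; SHA-256 is an assumption (the protocol only requires some cryptographic hash function), and real deployments need authenticated channels and phase enforcement.

```python
# Minimal sketch of the commit-reveal protocol described above.
# Assumes SHA-256 as the hash and XOR as the combining function.
import hashlib
import secrets

def commit():
    """Generate a 256-bit secret and its commitment hash."""
    secret = secrets.token_bytes(32)              # 256 random bits
    commitment = hashlib.sha256(secret).digest()
    return secret, commitment

def verify(secret, commitment):
    """Check a revealed secret against its earlier commitment."""
    return hashlib.sha256(secret).digest() == commitment

def combine(revealed):
    """XOR all revealed secrets together into the shared output."""
    out = bytes(32)
    for s in revealed:
        out = bytes(a ^ b for a, b in zip(out, s))
    return out

# Three honest participants:
pairs = [commit() for _ in range(3)]
assert all(verify(s, c) for s, c in pairs)   # nobody cheated
output = combine([s for s, _ in pairs])
print(output.hex())                          # 256 fair random bits
```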
The article concludes with the following:
[I]f you ever require randomness that you can really, really trust, you should now be properly equipped to manage it.
Since I could not find them well-described anywhere else, I decided to write up a short article on two subtle attacks on this protocol. Even if they aren't directly applicable, if you are implementing a protocol like this you should be aware of them.
Attack 1: Copied Commitments
This attack focuses on the commitment phase of the protocol. If half of the participants are dishonest they can predict the output number, and a dishonest majority can arbitrarily select it.
Suppose there are two participants, A and B. In the commitment phase, B waits until A reveals its commitment hash. After B receives it, B shares that same hash as their commitment. At this point the commitment phase ends, and the reveal phase begins. Again, B waits until A reveals, after which B "reveals" the same number that A just revealed.
Since the protocol proceeds to XOR these revealed numbers together, the resulting output will be 0 (XOR is its own inverse). Since B initiated the attack, it knew this was the outcome from the beginning, meaning it was not fair.
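The two-participant version of the attack is easy to demonstrate (again assuming SHA-256 and XOR combining): B never generates a number at all, yet both reveals verify cleanly against their commitments.

```python
# Sketch of the copied-commitment attack. B echoes A's commitment
# and later A's reveal, forcing the XOR output to zero.
import hashlib
import secrets

a_secret = secrets.token_bytes(32)
a_commit = hashlib.sha256(a_secret).digest()

b_commit = a_commit        # B copies A's commitment...
b_secret = a_secret        # ...and later copies A's reveal

# Both reveals check out, so the protocol looks honest:
assert hashlib.sha256(b_secret).digest() == b_commit

# Yet the output is all-zero bits, known to B from the beginning:
output = bytes(x ^ y for x, y in zip(a_secret, b_secret))
assert output == bytes(32)   # X xor X = 0
```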
This attack generalises to a larger set of participants: If there are N honest and N dishonest/colluding participants, then each dishonest participant can copy the hash of a respective honest participant, "cancelling out" all of the honestly chosen numbers.
If the dishonest participants form a majority (N+1 or more) then they can perform the above cancelling and then also include their own number(s), which allow them to arbitrarily choose the output. They could choose the most advantageous possible value or, if that would be too obvious, then they might instead slightly bias the iterated result in their favour so as to stay undetected for a longer period of time.
Solutions
Fortunately, there are several possible ways to prevent this attack:
- Fail and restart the protocol if a duplicate commitment is detected. This has the disadvantage that a malicious participant can "grief" the protocol (prevent it from working for no direct benefit to themselves).
- Use only one value for any duplicated commitments. This prevents the attack and protects the desired property that any participant acting honestly will cause the output number to be unpredictable.
- Combine the revealed numbers with something other than XOR. If a suitable combination operation is used, this prevents the attack and has additional benefits that we'll discuss below.
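The second fix in the list above can be sketched as follows (assuming XOR combining). Keeping only one reveal per distinct commitment means a copied commitment contributes its value once instead of cancelling it out.

```python
# Sketch of the deduplication fix: XOR one revealed number per
# distinct commitment hash (SHA-256 assumed).
import hashlib
import secrets

def combine_dedup(reveals):
    """XOR one reveal per distinct commitment hash."""
    unique = {hashlib.sha256(r).digest(): r for r in reveals}
    out = bytes(32)
    for r in unique.values():
        out = bytes(a ^ b for a, b in zip(out, r))
    return out

x = secrets.token_bytes(32)
# B copied A's commitment and reveal; naive XOR would give 0,
# but after deduplication the output is A's unpredictable number:
assert combine_dedup([x, x]) == x
```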
Attack 2: Last Revealer(s)
This attack focuses on the reveal phase of the protocol. A single dishonest participant can influence (bias) the output number. If they collude, the amount of possible bias increases as the group of dishonest participants grows. If the number of dishonest participants is greater than or equal to the number of bits in the output number (256 in our case), then the dishonest cartel can arbitrarily select any output.
Unlike the previous attack, this one is more fundamental and does not have any simple fixes, although I'll mention some ways to reduce its impact.
Consider the state of the world after everyone has committed, and everyone has revealed except for one participant. This last participant can calculate the final output before anybody else because it knows everyone else's revealed numbers and its own, which is currently secret. Of course, because the participant has already committed to their own number, they cannot change the outcome. However, at this point they do have one last choice: whether they should reveal their number at all.
If the output is disadvantageous, then they may choose to not reveal. Depending on the context this may be obviously malicious, but more often this will have a plausibly innocent explanation, such as a computer crash/internet failure/etc.
Since this is sure to happen in the real world (maliciously or not), the protocol designer must make a choice about what to do in the event that, after some time-out period, not every number is revealed. Broadly there are two options:
- Restart the protocol. In this case, the last revealer gets "another chance" at a good number. If the revealer is allowed back in to the subsequent rounds, then they could simply perform the same action again and again until they get an advantageous result. Even if not allowed back in, the last revealer has influenced the result and at least avoided some bad outcome.
- Continue the protocol using only the numbers that actually were revealed. In this case, the last revealer gets "another chance" at a number that could benefit them. This number will certainly be better for them, because otherwise they would have revealed. Again, the last revealer has influenced the result to their benefit.
Assuming the protocol continues on, this attack can be further generalised as the number of dishonest participants increases. If there are N colluding participants, and they wait until all the honest participants have revealed, then they as a group know the final outcome. In this case, they have 2^N possible choices, corresponding to whether each of them will reveal their number or not. Even relatively small values of N will provide a large number of selections, thanks to the rapid growth of exponential functions. The cartel can pick the most advantageous number outright, or a slightly beneficial innocent-looking number to avoid detection.
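The 2^N choices can be enumerated directly (assuming XOR combining). Once the honest reveals are in, the cartel computes every subset of its own numbers and withholds the rest, then picks whichever candidate output it likes best.

```python
# Sketch of the cartel's 2**N candidate outputs under XOR combining.
import itertools
import secrets

def xor_all(values):
    """XOR a list of 32-byte strings together."""
    out = bytes(32)
    for v in values:
        out = bytes(a ^ b for a, b in zip(out, v))
    return out

honest = [secrets.token_bytes(32) for _ in range(5)]
cartel = [secrets.token_bytes(32) for _ in range(4)]   # N = 4

base = xor_all(honest)   # known once all honest parties reveal
candidates = set()
for r in range(len(cartel) + 1):
    for subset in itertools.combinations(cartel, r):
        candidates.add(xor_all([base, *subset]))

# 2**4 = 16 distinct outputs to choose from (collisions among
# random 256-bit values are astronomically unlikely):
print(len(candidates))
```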
If N is equal to the number of bits in the output number (256 in our case), then the cartel can arbitrarily select the output number. To do so, the first member chooses 10000... as their "random" number (in binary), the second chooses 01000..., and so forth. Then, for each bit in the output number they want to flip, the cartel reveals the corresponding number and withholds the rest. If these single-bit numbers are too obvious, then legitimate random values can be generated and a 256-element vector basis of them selected with Gaussian elimination. You don't need to generate many: 300 is plenty (with high probability). Then, once the positions of the bits to be flipped are known, the elements to reveal can be selected, again with Gaussian elimination (see our perl script, which can do this in seconds).
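The linear-algebra step can be sketched in Python (this is an illustrative reimplementation, not the perl script mentioned above). Numbers are treated as vectors over GF(2), where XOR is addition; the elimination finds which pool elements to reveal so their XOR equals any desired target.

```python
# Sketch: given a pool of random 256-bit numbers, use Gaussian
# elimination over GF(2) to find a subset whose XOR equals a target.
import secrets

BITS = 256

def solve_subset(pool, target):
    """Return indices of pool elements whose XOR is target, or None."""
    basis = {}   # leading-bit position -> reduced vector
    masks = {}   # leading-bit position -> bitmask of pool indices used
    for i, v in enumerate(pool):
        mask = 1 << i
        while v:
            b = v.bit_length() - 1
            if b not in basis:
                basis[b], masks[b] = v, mask
                break
            v ^= basis[b]        # reduce against existing basis vector
            mask ^= masks[b]
    # Express the target in terms of the basis.
    t, mask = target, 0
    while t:
        b = t.bit_length() - 1
        if b not in basis:
            return None          # target not in the span (rare)
        t ^= basis[b]
        mask ^= masks[b]
    return [i for i in range(len(pool)) if mask >> i & 1]

# 300 random values span all 256 dimensions with high probability:
pool = [secrets.randbits(BITS) for _ in range(300)]
target = secrets.randbits(BITS)
chosen = solve_subset(pool, target)

check = 0
for i in chosen:
    check ^= pool[i]
assert check == target   # revealing exactly these forces the output
```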
Solutions
As mentioned above, a protocol designer needs to decide whether or not to restart the protocol if not all commitments were revealed. Because restarting the protocol implies that somebody is able to grief the system (maybe perpetually), systems with open membership typically opt to continue on without all commitments, leaving them vulnerable to this attack.
One possible solution is to have a trusted party collect all the revealed numbers and only reveal them to the wider group once all numbers were received (or after some time-out). But in this case, the trusted party could simply perform the last revealer attack itself. Anyway, if you have such a trusted party, why not just get them to generate the random numbers in the first place, and skip the whole commit-reveal protocol?
The most general way to protect against this is to have some sort of punishment for failure to reveal. In order to totally dis-incentivise this attack, the punishment should exceed the worst-case outcome for a participant who does reveal. On the other hand, it can't be too severe because, as mentioned, honest users sometimes can't reveal for legitimate reasons (computer crashed, etc).
Combining Function
The final step, where the revealed numbers are combined, uses XOR (⊕). This function's linearity causes problems in both of the attacks above:
- With copied commitments, the attacker knows the value of X ⊕ X (it's 0), even though they don't know X.
- A large enough cartel of last revealers can rapidly choose combinations of numbers to arbitrarily influence the output number.
You could imagine using something like addition (mod 2^256). This helps, but is not perfect. For instance, consider the copied-commitment case: X + X = 2X is always even, meaning the least-significant bit is known to be 0, even without knowledge of X.
A better solution would be to sort all the revealed numbers, concatenate them, and hash. This prevents the copied commitment attack and makes it infeasible for a last revealer cartel to arbitrarily select an output (although they can still heavily bias it, of course).
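This combining function is straightforward to implement (SHA-256 assumed). Sorting makes the result independent of reveal order, and hashing destroys the linear structure that both attacks exploit.

```python
# Sketch of the sort-concatenate-hash combining function.
import hashlib

def combine_hash(revealed):
    """Hash the sorted concatenation of all revealed 32-byte numbers."""
    return hashlib.sha256(b"".join(sorted(revealed))).digest()

x = bytes.fromhex("ab" * 32)
y = bytes.fromhex("cd" * 32)

# Reveal order doesn't matter (sorting makes the input canonical):
assert combine_hash([x, y]) == combine_hash([y, x])

# A copied commitment no longer forces a known all-zero output:
assert combine_hash([x, x]) != bytes(32)
```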
This change may also help in other known and unknown circumstances (this article does not claim to be comprehensive!), for example if commitments are partially leaked due to side-channels.
Conclusion
Although commit-reveal is an intuitive way to interactively generate randomness that all participants can agree on, there are several protocol and implementation-level details that can, if neglected, compromise security.
Since commit-reveal schemes are participatory, participants must be able to communicate with each other, in both directions. Protocols can often be griefed or slowed down by malicious participants, and therefore cannot generate new randomness at fixed time intervals. In most arrangements, they also consume total resources (such as bandwidth) proportional to the square of the number of participants.
For an entirely different approach (admittedly with several significant downsides), you may be interested in our research project ephemerand.