Skip to main content
added 1648 characters in body
Source Link
Eric Lippert
  • 15k
  • 4
  • 39
  • 57

UPDATE:

My understanding of new ThreadLocal<Random>(() => new Random()) was that it would generate a new instance of Random in each thread. Is this not correct?

That's correct. And this is the right thing to do, because Random is not threadsafe.

When you say, "I now have knowledge of every voucher that was generated by that thread in the past and the future," do you mean on that particular thread i.e. if 10 codes were generated on a particular batch, and I had access to one, I would then be able to determine the remaining 9?

That is exactly what I mean.

There are only 32 bits of actual randomness in the seed. Which means that there are only four billion possible sequences of vouchers produced by your algorithm. Which means that all I have to do is generate all four billion sequences until I find one that has my voucher, and hey, I now know what seed that thread used. So I know every voucher it generated in the past, and the future.

If that's not a property you care to have, then generate actually random vouchers, not pseudorandom vouchers.

Outside of that batch and thread, would you still be able to determine other codes?

Sure. In particular, I could generate the four billion vouchers that were each generated first by the four billion possible seeds and use those as guesses. Suppose you have generated 1000 vouchers. That gives me odds of one in four million of guessing one of them, once. And of course I can just keep on trying; nothing is stopping me. Compare that to the odds of guessing one of them if they were truly random: 1000 in 26 to the 10, a far larger number.

UPDATE:

My understanding of new ThreadLocal<Random>(() => new Random()) was that it would generate a new instance of Random in each thread. Is this not correct?

That's correct. And this is the right thing to do, because Random is not threadsafe.

When you say, "I now have knowledge of every voucher that was generated by that thread in the past and the future," do you mean on that particular thread i.e. if 10 codes were generated on a particular batch, and I had access to one, I would then be able to determine the remaining 9?

That is exactly what I mean.

There are only 32 bits of actual randomness in the seed. Which means that there are only four billion possible sequences of vouchers produced by your algorithm. Which means that all I have to do is generate all four billion sequences until I find one that has my voucher, and hey, I now know what seed that thread used. So I know every voucher it generated in the past, and the future.

If that's not a property you care to have, then generate actually random vouchers, not pseudorandom vouchers.

Outside of that batch and thread, would you still be able to determine other codes?

Sure. In particular, I could generate the four billion vouchers that were each generated first by the four billion possible seeds and use those as guesses. Suppose you have generated 1000 vouchers. That gives me odds of one in four million of guessing one of them, once. And of course I can just keep on trying; nothing is stopping me. Compare that to the odds of guessing one of them if they were truly random: 1000 in 26 to the 10, a far larger number.

added 566 characters in body
Source Link
Eric Lippert
  • 15k
  • 4
  • 39
  • 57

There are numerous problems here.

  • There are 26^10 vouchers in the voucher space, but only 2^32 possible seeds to the random number generator. And the random number generator has a period of a few billion before it repeats. So most of your space cannot be generated. But that's not so bad. There are plenty of numbers which can be generated.

  • The random number generator on the thread is seeded with the current time. Which means that two threads that start at the same millisecond will produce the same sequence of vouchers forever; it is entirely possible that one of the threads will never succeed in producing a voucher that isn't already in the database. If you think that it is impossible for two threads to start in the same millisecond, I remind you that a millisecond is one million nanoseconds, and computers work on the nanosecond scale. There is plenty of opportunity for collision here. But maybe you can live with that.

  • The really horrible problem is: if I have access to one voucher I can run 2^32 simulations of your algorithm on my fast home machine and very quickly determine what the seed was for that instantiation of Random. I now have knowledge of every voucher that was generated by that thread in the past and the future. Do you want to give that information to everyone who has a single voucher? Basically once one voucher is public, the entire database is public.

  • Even if I don't have a voucher, I can run 2^32 simulations of different threads and generate probably vouchers, and try them out. One of them will be the right one.

This is a correctness and security nightmare waiting to happen.

The right way to do this is to generate crypto-strength vouchers. I leave it to you to work out how to do that.

Now let's suppose that you do. What is the probability of a collision given n unique vouchers already in the database? Obviously n / 26^10. If n is under a few million then obviously this is very small. But here's a better question: what is the probability of there being any collision at any time during the computation of those million vouchers? The chance of any collision rises extremely rapidly once the number of the vouchers in the database is on the order of the square root of the sample space. The square root of 26^10 is only 12 million! So you are very likely to get at least one voucher collision if you are putting around 12 million vouchers in the database.

If you want to make that chance of collision smaller, up the voucher size. Every time you add one more digit to the voucher, you multiply the amount of runway you have until a collision is likely by five.

If this subject interests you, see

https://blogs.msdn.microsoft.com/ericlippert/2010/03/22/socks-birthdays-and-hash-collisions/

for more discussion.

Now, if you don't care about any of this stuff, then there is a cheap way to generate random-seeming vouchers without having to store them in a database. Just number your vouchers one through 1000 and use a multiplicative inverse:

https://ericlippert.com/2013/11/14/a-practical-use-of-multiplicative-inverses/

But these have no security properties whatsoever. If someone knows this algorithm, then they know all the vouchers past present and future. This system is not secure against attack at all; it's just a nice way to "obfuscate" a small number.

There are numerous problems here.

  • There are 26^10 vouchers in the voucher space, but only 2^32 possible seeds to the random number generator. And the random number generator has a period of a few billion before it repeats. So most of your space cannot be generated. But that's not so bad. There are plenty of numbers which can be generated.

  • The random number generator on the thread is seeded with the current time. Which means that two threads that start at the same millisecond will produce the same sequence of vouchers forever; it is entirely possible that one of the threads will never succeed in producing a voucher that isn't already in the database. If you think that it is impossible for two threads to start in the same millisecond, I remind you that a millisecond is one million nanoseconds, and computers work on the nanosecond scale. There is plenty of opportunity for collision here. But maybe you can live with that.

  • The really horrible problem is: if I have access to one voucher I can run 2^32 simulations of your algorithm on my fast home machine and very quickly determine what the seed was for that instantiation of Random. I now have knowledge of every voucher that was generated by that thread in the past and the future. Do you want to give that information to everyone who has a single voucher? Basically once one voucher is public, the entire database is public.

  • Even if I don't have a voucher, I can run 2^32 simulations of different threads and generate probably vouchers, and try them out. One of them will be the right one.

This is a correctness and security nightmare waiting to happen.

The right way to do this is to generate crypto-strength vouchers. I leave it to you to work out how to do that.

Now let's suppose that you do. What is the probability of a collision given n unique vouchers already in the database? Obviously n / 26^10. If n is under a few million then obviously this is very small. But here's a better question: what is the probability of there being any collision at any time during the computation of those million vouchers? The chance of any collision rises extremely rapidly once the number of the vouchers in the database is on the order of the square root of the sample space. The square root of 26^10 is only 12 million! So you are very likely to get at least one voucher collision if you are putting around 12 million vouchers in the database.

If you want to make that chance of collision smaller, up the voucher size. Every time you add one more digit to the voucher, you multiply the amount of runway you have until a collision is likely by five.

If this subject interests you, see

https://blogs.msdn.microsoft.com/ericlippert/2010/03/22/socks-birthdays-and-hash-collisions/

for more discussion.

There are numerous problems here.

  • There are 26^10 vouchers in the voucher space, but only 2^32 possible seeds to the random number generator. And the random number generator has a period of a few billion before it repeats. So most of your space cannot be generated. But that's not so bad. There are plenty of numbers which can be generated.

  • The random number generator on the thread is seeded with the current time. Which means that two threads that start at the same millisecond will produce the same sequence of vouchers forever; it is entirely possible that one of the threads will never succeed in producing a voucher that isn't already in the database. If you think that it is impossible for two threads to start in the same millisecond, I remind you that a millisecond is one million nanoseconds, and computers work on the nanosecond scale. There is plenty of opportunity for collision here. But maybe you can live with that.

  • The really horrible problem is: if I have access to one voucher I can run 2^32 simulations of your algorithm on my fast home machine and very quickly determine what the seed was for that instantiation of Random. I now have knowledge of every voucher that was generated by that thread in the past and the future. Do you want to give that information to everyone who has a single voucher? Basically once one voucher is public, the entire database is public.

  • Even if I don't have a voucher, I can run 2^32 simulations of different threads and generate probably vouchers, and try them out. One of them will be the right one.

This is a correctness and security nightmare waiting to happen.

The right way to do this is to generate crypto-strength vouchers. I leave it to you to work out how to do that.

Now let's suppose that you do. What is the probability of a collision given n unique vouchers already in the database? Obviously n / 26^10. If n is under a few million then obviously this is very small. But here's a better question: what is the probability of there being any collision at any time during the computation of those million vouchers? The chance of any collision rises extremely rapidly once the number of the vouchers in the database is on the order of the square root of the sample space. The square root of 26^10 is only 12 million! So you are very likely to get at least one voucher collision if you are putting around 12 million vouchers in the database.

If you want to make that chance of collision smaller, up the voucher size. Every time you add one more digit to the voucher, you multiply the amount of runway you have until a collision is likely by five.

If this subject interests you, see

https://blogs.msdn.microsoft.com/ericlippert/2010/03/22/socks-birthdays-and-hash-collisions/

for more discussion.

Now, if you don't care about any of this stuff, then there is a cheap way to generate random-seeming vouchers without having to store them in a database. Just number your vouchers one through 1000 and use a multiplicative inverse:

https://ericlippert.com/2013/11/14/a-practical-use-of-multiplicative-inverses/

But these have no security properties whatsoever. If someone knows this algorithm, then they know all the vouchers past present and future. This system is not secure against attack at all; it's just a nice way to "obfuscate" a small number.

Source Link
Eric Lippert
  • 15k
  • 4
  • 39
  • 57

There are numerous problems here.

  • There are 26^10 vouchers in the voucher space, but only 2^32 possible seeds to the random number generator. And the random number generator has a period of a few billion before it repeats. So most of your space cannot be generated. But that's not so bad. There are plenty of numbers which can be generated.

  • The random number generator on the thread is seeded with the current time. Which means that two threads that start at the same millisecond will produce the same sequence of vouchers forever; it is entirely possible that one of the threads will never succeed in producing a voucher that isn't already in the database. If you think that it is impossible for two threads to start in the same millisecond, I remind you that a millisecond is one million nanoseconds, and computers work on the nanosecond scale. There is plenty of opportunity for collision here. But maybe you can live with that.

  • The really horrible problem is: if I have access to one voucher I can run 2^32 simulations of your algorithm on my fast home machine and very quickly determine what the seed was for that instantiation of Random. I now have knowledge of every voucher that was generated by that thread in the past and the future. Do you want to give that information to everyone who has a single voucher? Basically once one voucher is public, the entire database is public.

  • Even if I don't have a voucher, I can run 2^32 simulations of different threads and generate probably vouchers, and try them out. One of them will be the right one.

This is a correctness and security nightmare waiting to happen.

The right way to do this is to generate crypto-strength vouchers. I leave it to you to work out how to do that.

Now let's suppose that you do. What is the probability of a collision given n unique vouchers already in the database? Obviously n / 26^10. If n is under a few million then obviously this is very small. But here's a better question: what is the probability of there being any collision at any time during the computation of those million vouchers? The chance of any collision rises extremely rapidly once the number of the vouchers in the database is on the order of the square root of the sample space. The square root of 26^10 is only 12 million! So you are very likely to get at least one voucher collision if you are putting around 12 million vouchers in the database.

If you want to make that chance of collision smaller, up the voucher size. Every time you add one more digit to the voucher, you multiply the amount of runway you have until a collision is likely by five.

If this subject interests you, see

https://blogs.msdn.microsoft.com/ericlippert/2010/03/22/socks-birthdays-and-hash-collisions/

for more discussion.