Posted

After learning that many lotteries now use software and hardware random number generators(RNG) and reading a story about Daniel Corriveau, I decided it would be fun to explore trying to figure the state of a RNG and predict values.

One of the most commonly used RNG algorithms today is the Mersenee Twister(MT) algorithm. On the wikipedia article on MT it mentions after observing 624 generated numbers it is possible to predict all future iterates. At first glance I didn’t see how this was possible because of the algorithm’s tempering function, but after reading comments on wikipedia the tempering function is bijective(reversible). If you’re looking for a good puzzle to solve, try writing your own reverse tempering function. The tempering function is defined as:

k = mt[state->mti];
k ^= (k >> 11); // Temper shift U
k ^= (k << 7) & 0x9d2c5680UL; // Temper shift S & TemperingMaskB
k ^= (k <> 18); // Temper shift T

My solution is broken up in to the different parts and is definitely not optimized but it works.

class MersenneReverser
{
  private const UInt32 TemperingMaskB = 0x9d2c5680;
  private const UInt32 TemperingMaskC = 0xefc60000;

  static UInt32 undoTemper(UInt32 y)
  {
    y = undoTemperShiftL(y);
    y = undoTemperShiftT(y);
    y = undoTemperShiftS(y);
    y = undoTemperShiftU(y);
    return y;
  }
  static UInt32 undoTemperShiftL(UInt32 y)
  {
    UInt32 last14 = y >> 18;
    UInt32 final = y ^ last14;
    return final;
  }
  static UInt32 undoTemperShiftT(UInt32 y)
  {
    UInt32 first17 = y << 15;
    UInt32 final = y ^ (first17 & TemperingMaskC);
    return final;
  }
  static UInt32 undoTemperShiftS(UInt32 y)
  {
    /*
     * This one also sucked to figure out, but now I think i could write
     * a general one.  This basically waterfalls and keeps restoring original
     * bits then shifts the values down and xors again to restore more bits
     * and keeps on doing it.
     */
    UInt32 a = y << 7;
    UInt32 b = y ^ (a & TemperingMaskB);
    UInt32 c = b << 7;
    UInt32 d = y ^ (c & TemperingMaskB); // now we have 14 of the original
    UInt32 e = d << 7;
    UInt32 f = y ^ (e & TemperingMaskB); // now we have 21 of the original
    UInt32 g = f << 7;
    UInt32 h = y ^ (g & TemperingMaskB); // now we have 28 of the original
    UInt32 i = h <> 11;
    UInt32 b = y ^ a;
    UInt32 c = b >> 11;
    UInt32 final = y ^ c;
    return final;
  }
}

So then after observing 624 values and using the reverse temper function it is possible fill in the 624 internal state values of the mersenne twister algorithm and predict all the future values. Unfortunately this doesn’t help in many ways for practical purposes, since you need all 624 values unaltered. Take for instance the lottery. First, you don’t actually know the random number. Second, most likely the random number was used to shuffle the numbers in a certain way and only a portion of the random number was used. Third, if the algorithm gets reseeded each time you don’t know your 624 consecutive values. Also, some of the newer versions of the MT generate two values and combine them to create a better random double.

Comments

  1. James Falero.

    Supposing I fed in my OWN data into those 624 values could I possibly derive a seed value for all my values?

    What I’m trying to do is share map data for a multiplayer game without having to transfer more than a seed number for maps, specifically terrain data. The problem I have with this is I then am forced to search through each seed value of a given PRNG for suitable maps, where I much rather make my own.

    The alternative is to break maps up into chunks generated by random seed entered into the given PRNG until I find acceptable chunks. Catalog accepted chunks, and simply transfer seeds for each chunk.

    If I can simply reverse the process of a PRNG however to derive a seed from my custom data that would be much more preferable.

    If you could kindly answer this I would greatly be indebted to you 🙂
    james#falero [@] hotmail [dot] com

    (remove spaces. remove square brackets. remove hash sign #)

    Reply
    • Brendan

      Unfortunately I don’t believe you could just send the seed value. You would also need to send the 624 PRNG “state” values. Without the state values the MT algorithm could start anywhere in the sequence of random numbers it produces. The period of MT(how long until it repeats itself) is extremely large(2^19937 − 1).

      Reply
  2. tomv

    By a “seed,” I assume you mean some one-word value that would be used to generate the 624 values for the Mersenne Twister. The problem with this approach is that the MT has 19937 bits of state, but the seed has 32. The MT state vector has 2^19937-1 possible values, but a one-word seed has 2^32 possible values. There is no possible way for a small seed to represent all of the different state values. By choosing a small seed, no matter how you use it, you limit yourself to a small subset of the MT’s possible states. The only way to use all of the MTs possible starting states is to with the same number of bits (or more).

    Reply
  3. Justin

    Hey there, I think this is really interesting! I am very interested in implementing this solution in java, and making it easier to untemper the values for predicting output. I know this post is old, and there isn’t a huge chance you’ll see this, but I’d love to chat some more about this.

    regards,

    Justin

    Reply
  4. arthur

    Didn’t you inverted the L and the T in your undoTemperShiftL and undoTemperShiftT procedures ?
    It looks to me that undoTemperShiftL actually undos temperShiftT, and undoTemperShiftT undos temperShiftL

    Reply
  5. PrzemekM

    IMHO, to invert, eg.

    x = k1 ^ k1>>12 & MASK,

    one does the simple

    k2 = x; // init, 12 top bits recovered, yep nothing to do with them obviously
    k2 = x ^ k2>>12 & MASK // 24 bits now recovered coz xor or actually “+ mod 2” is involutive
    k2 = x ^ k2>>12 & MASK // 36 bits now recovered, and 36>32, thus BREAK

    And please remember that you need a suitable number of steps (while loop, as you can see the method is iterative) to recover all bits.
    Also please note that “&” is exactly “times mod 2”, and as such it has grouping precedence over addition (xor), ie. first AND than XOR (first “times” then “plus”).
    Regards, Przemek

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *