Repeatable simulations without repeated boilerplate

Tags: data science, python, scipy, simulation

Published: February 8, 2020

If you maintain many Python functions that rely on pseudorandom number generation, as in a discrete-event simulation, you probably want a separate random state for each consumer of randomness. As a concrete example, if you’re simulating several users in a store and you model their arrival times and basket sizes with probability distributions, each simulated user should get its own source of randomness.

Using a global generator, like the one backing the module-level functions in numpy.random or Python’s random, makes it difficult to seed your simulation appropriately. It can also introduce implicit dependencies between the global parameters of the simulation (e.g., how many users are involved in a run) and the local behavior of any particular user: every extra consumer of the shared generator shifts which draws the others receive.
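To make that dependency concrete, here is a minimal sketch that uses only Python’s standard-library random module. The function names, the exponential arrival gaps, and the "seed-user" string used to seed each per-user generator are all assumptions made for illustration, not a recommended seeding scheme.

```python
import random

def arrival_gaps_global(n_users, n_steps, seed):
    """All users draw from the shared module-level generator in `random`."""
    random.seed(seed)
    gaps = {user: [] for user in range(n_users)}
    for _ in range(n_steps):
        for user in range(n_users):
            gaps[user].append(random.expovariate(1.0))
    return gaps

def arrival_gaps_local(n_users, n_steps, seed):
    """Each user owns a generator seeded from the run seed and the user id."""
    # Hypothetical seeding scheme: a "<seed>-<user>" string per user.
    rngs = {user: random.Random(f"{seed}-{user}") for user in range(n_users)}
    gaps = {user: [] for user in range(n_users)}
    for _ in range(n_steps):
        for user in range(n_users):
            gaps[user].append(rngs[user].expovariate(1.0))
    return gaps

# Compare user 0's draws in a two-user run and a three-user run.
print(arrival_gaps_global(2, 3, seed=42)[0] == arrival_gaps_global(3, 3, seed=42)[0])
print(arrival_gaps_local(2, 3, seed=42)[0] == arrival_gaps_local(3, 3, seed=42)[0])
```

With the shared generator, user 0 receives different draws once a third user joins the run, so the first comparison should come out False. With per-user generators, user 0’s stream does not depend on how many other users exist, so the second should be True.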

Once you’ve decided you need multiple sources of randomness, you’ll probably have a lot of code that looks something like this:

import random
import numpy as np

def somefunc(seed=None):
  if seed is None:
    seed = random.randrange(1 << 32)
    
  prng = np.random.RandomState(seed)

  while True:
    step_result = None
    
    # use prng to do something interesting 
    # as part of the simulation and assign 
    # it to step_result (omitted here) ...
    
    yield step_result

Initializing random number generators at the beginning of each function is not only repetitive, it’s also ugly and error-prone. The aesthetic and moral costs of this sort of boilerplate were weighing heavily on my conscience while I was writing a simulation earlier this week, but an easy solution lifted my spirits.

Python decorators are a natural way to generate a wrapper for our simulation functions that can automatically initialize a pseudorandom number generator if a seed is supplied (or create a seed if one isn’t). Here’s an example of how you could use a decorator in this way:

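import functools

def makeprng(func):
  @functools.wraps(func)  # keep func's name and docstring on the wrapper
  def call_with_prng(*args, prng=None, seed=None, **kwargs):
    if prng is None:
      # no generator supplied: build one, picking a seed if needed
      if seed is None:
        seed = random.randrange(1 << 32)

      prng = np.random.RandomState(seed)
    return func(*args, prng=prng, seed=seed, **kwargs)

  return call_with_prng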

@makeprng
def somefunc(seed=None, prng=None):

  while True:
    step_result = None
    
    # use prng to do something interesting 
    # as part of the simulation and assign 
    # it to step_result (omitted here) ...
    
    yield step_result

With the @makeprng decorator, somefunc is replaced by the output of makeprng(somefunc): a wrapper that constructs a prng when one is needed and passes it to somefunc when calling it. So if you invoke somefunc(seed=1234), it’ll construct a pseudorandom number generator seeded with 1234. If you invoke somefunc(), it’ll construct a pseudorandom number generator with an arbitrary seed. If you pass an existing generator as prng, the wrapper uses it as is and constructs nothing.
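The three calling conventions can be tried out with a small self-contained sketch. To keep it runnable with only the standard library, this variant of makeprng builds a random.Random instead of np.random.RandomState; that swap, the basket_sizes generator, and the take helper are assumptions made for illustration.

```python
import functools
import itertools
import random

def makeprng(func):
    """Stdlib variant of the decorator: supplies a random.Random as `prng`."""
    @functools.wraps(func)
    def call_with_prng(*args, prng=None, seed=None, **kwargs):
        if prng is None:
            if seed is None:
                seed = random.randrange(1 << 32)
            prng = random.Random(seed)
        return func(*args, prng=prng, seed=seed, **kwargs)
    return call_with_prng

@makeprng
def basket_sizes(max_items=20, seed=None, prng=None):
    """Hypothetical simulation step: an endless stream of basket sizes."""
    while True:
        yield prng.randint(1, max_items)

def take(n, iterable):
    """Return the first n items of an iterable as a list."""
    return list(itertools.islice(iterable, n))

# Same seed, same stream: the run is repeatable.
print(take(5, basket_sizes(seed=1234)) == take(5, basket_sizes(seed=1234)))

# No seed: the wrapper picks an arbitrary one.
print(take(5, basket_sizes()))

# Explicit generator: both consumers draw from one shared stream.
shared = random.Random(99)
user_a = basket_sizes(prng=shared)
user_b = basket_sizes(prng=shared)
print(next(user_a), next(user_b))
```

Passing prng explicitly is the escape hatch for the cases where two consumers really should share a stream. Everywhere else, a seed (or nothing at all) is enough.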

Decorators are a convenient, low-overhead way to provide default values that must be constructed on demand for function parameters — and they make code that needs to create multiple streams of pseudorandom numbers much less painful to write and maintain.