I have a set of object states that is larger than I think would be reasonable to thread or process on a 1:1 basis; let's say it looks like this:

class SubState(object):
    def __init__(self): 
        self.stat_1 = None
        self.stat_2 = None
        self.list_1 = []

class State(object): 
    def __init__(self): 
        self.my_sub_states = {'a': SubState(), 'b': SubState(), 'c': SubState()}

What I'd like to do is make each of the sub-states under the self.my_sub_states keys shared, and simply access them by grabbing a single lock for the entire sub-state - i.e. self.locks = {'a': multiprocessing.Lock(), 'b': multiprocessing.Lock()}, etc. - and then release it when I'm done. Is there a class I can inherit from to share an entire SubState object under a single Lock?

The actual worker processes would be pulling tasks from a queue (I can't pass the sub-states as args into the processes because they don't know which sub-state they need until they get their next task).


Edit: also, I'd prefer not to use a manager - managers are atrociously slow (I haven't benchmarked it, but I'm inclined to think an in-memory database would be faster than a manager if it came down to it).

  • Just to clarify: do you want unrestricted access to self.locks (so everyone can read from it in parallel) and a single lock around self.sub_states to prevent concurrent writes, or no lock around self.sub_states either? Commented Feb 3, 2015 at 16:34
  • @SeanVieira Yes - unless I'm mistaken, child processes (workers) will inherit self.locks as a shared-memory copy from the parent (though I'm actually not sure how multiprocessing.Lock() is implemented at the low level). I don't have any good ideas for how to share the SubState objects, though. Commented Feb 3, 2015 at 16:39

1 Answer

As the multiprocessing docs state, you've really only got two options for actually sharing state between multiprocessing.Process instances (at least without going to third-party options - e.g. redis):

  1. Use a Manager
  2. Use multiprocessing.sharedctypes

A Manager will allow you to share pure Python objects, but as you pointed out, both read and write access to objects being shared this way is quite slow.

multiprocessing.sharedctypes will use actual shared memory, but you're limited to sharing ctypes objects. So you'd need to convert your SubState object to a ctypes.Structure. Also of note: each multiprocessing.sharedctypes object created with lock=True (the default) has its own lock built in, so you can synchronize access to each object by taking that lock explicitly (via get_lock()) before operating on it.


3 Comments

Is there an effective way to pass references to newly created ctypes objects? (It sounds like I'll have a signaling problem then too - one could use a manager or a shared hash table to keep track of all the shared ctypes structs, but it's not too pretty.) I'm really leaning towards Redis - and rewriting parts of the code in C if Redis proves too slow.
@user3467349 Just to clarify, you need to create and share new objects after child processes have already been created?
Yeah, I mentioned that - but not very clearly in the question, it seems. Unless I change some not-very-simple things structurally, that would be the case (while it's trivial to assign sub-states to specific workers, it's not trivial to keep some workers from idling if they don't work out of a single task queue - I'm processing a stream).
