I have a number of Python files containing unit tests in the directory /code/test. To parallelize my test runs without editing any code, I figured I could use GNU Parallel to run N files at a time against N databases. First I spin up the databases:

$ docker run -d --name redis-unittest-1 redis \
 && docker run -d --name mongo-unittest-1 mongo:3.2.10 \
 && docker run -d --name redis-unittest-2 redis \
 && docker run -d --name mongo-unittest-2 mongo:3.2.10 \
 && docker run -d --name redis-unittest-3 redis \
 && docker run -d --name mongo-unittest-3 mongo:3.2.10 \
 && docker run -d --name redis-unittest-4 redis \
 && docker run -d --name mongo-unittest-4 mongo:3.2.10 \
 && docker run -d --name redis-unittest-5 redis \
 && docker run -d --name mongo-unittest-5 mongo:3.2.10 \
 && docker run -d --name redis-unittest-6 redis \
 && docker run -d --name mongo-unittest-6 mongo:3.2.10 \
 && docker run -d --name redis-unittest-7 redis \
 && docker run -d --name mongo-unittest-7 mongo:3.2.10 \
 && docker run -d --name redis-unittest-8 redis \
 && docker run -d --name mongo-unittest-8 mongo:3.2.10

Then I use find to grab all the test file names and pipe them into Parallel:

docker run test_img find /code/test -name "test*.py" \
| parallel -j8 \
docker run --rm \
--link mongo-unittest-{%}:db --link redis-unittest-{%}:redis \
-v $(pwd)/test-reports:/code/test-reports \
test_img python /code/test/discover.py --file {}  

This all seems to go fine, but sometimes one of the files will fail like so:

Traceback (most recent call last):
  File "/code/test/server/testApplicationAPI.py", line 28, in setUp
    super(TestApplicationAPI, self).setUp()
  File "/code/test/server/tools/testutils.py", line 345, in setUp
    self.app = server.createApp(True)
  File "/code/server/util/rq/../../server.py", line 55, in createApp
    mongo = PyMongo(app)
  File "/usr/local/lib/python2.7/site-packages/flask_pymongo/__init__.py", line 97, in __init__
    self.init_app(app, config_prefix)
  File "/usr/local/lib/python2.7/site-packages/flask_pymongo/__init__.py", line 249, in init_app
    cx = connection_cls(*args, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/pymongo/mongo_client.py", line 428, in __init__
    raise ConnectionFailure(str(e))
ConnectionFailure: [Errno -2] Name or service not known

I'm not sure how to begin troubleshooting this. Maybe something like strace to see what's happening in more detail? I've never really used that though. Any thoughts would be appreciated.
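
As a lighter first step than strace: on Linux, `[Errno -2] Name or service not known` is what the resolver reports when a host name lookup fails, so it may be worth checking whether the linked host name resolves at all before the test creates the app. Below is a minimal sketch of such a probe. The function name `wait_for_host` and its `attempts` and `delay` parameters are hypothetical, and it assumes the MongoDB host is reached through the `db` alias set by `--link mongo-unittest-{%}:db`.

```python
import socket
import time


def wait_for_host(hostname, attempts=5, delay=0.5):
    """Return True once `hostname` resolves, False if every attempt fails.

    Hypothetical helper: it only checks name resolution, not whether
    MongoDB is actually accepting connections.
    """
    for attempt in range(1, attempts + 1):
        try:
            # Same kind of lookup a client performs before connecting.
            socket.getaddrinfo(hostname, None)
            return True
        except socket.gaierror as exc:
            # A failed lookup is raised as socket.gaierror.
            print("attempt %d/%d: cannot resolve %r: %s"
                  % (attempt, attempts, hostname, exc))
            if attempt < attempts:
                time.sleep(delay)
    return False
```

Calling `wait_for_host("db")` at the start of a failing test run and logging the result would show whether the lookup fails only briefly (a later attempt succeeds) or for the whole lifetime of the container.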

  • Does it happen if you do not run it in parallel? I.e. -j1. Commented Feb 15, 2017 at 0:00
  • No it does not. Commented Feb 15, 2017 at 14:31

1 Answer

It seems you have found a race-condition bug in MongoDB. From the message it sounds like DNS is being overloaded, but it might very well be something else.

You should now:

  • See if you can reproduce the error with the newest MongoDB code (the bug might be well known and already fixed).
  • See if you can make an MCVE (https://stackoverflow.com/help/mcve). This is often very hard when the bug is a race condition, especially since you want the developers to be able to reproduce your exact situation. If you can provoke the error on a virtual machine from OsBoxes.org, that is a good start. Varying the number of parallel jobs and the number of cores on the virtual machine can help, too. Maybe some helpful people in the MongoDB community can guide you?
  • File a bug report.
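
As a possible starting point for an MCVE, the sketch below fires many concurrent lookups of one host name and counts how many fail. It is only a rough approximation: it stresses the resolver from threads inside a single process, which is not the same as eight containers starting at once. The function names are hypothetical, and it assumes Python 3 for `concurrent.futures` (the traceback above is from Python 2.7, which would need the `futures` backport).

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def resolves(hostname):
    """Return True if a single lookup of `hostname` succeeds."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def count_resolution_failures(hostname, workers=8, lookups=200):
    """Run `lookups` lookups across `workers` threads; return the failure count."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(resolves, [hostname] * lookups))
    return results.count(False)
```

Running `count_resolution_failures("db", workers=n)` inside one of the test containers for increasing values of `n` would show whether failures start to appear at a particular level of concurrency, which is the kind of detail a bug report needs.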

Nothing so far suggests a problem in GNU Parallel.
