topjer

Posted on Jun 16

Introduction to monkey patching with pytest

#python #testing #pytest #tutorial

Not too long ago, I had a pair-programming session with a colleague where we understood how monkey-patching works. A technique both of us had not really used before.

In this article I want to share our findings. But I will not simply present you the final solution. Instead we will retrace the steps and incrementally improve our code. Hopefully, this will not only teach you how to use monkey-patching but also how to "work with" error messages.

The Problem

Assume we want to test a function like this:

# utils.py
from pathlib import Path

def read_file(name: str) -> int:
    complete_name = name + "_foo"
    name_path = Path(complete_name)
    with name_path.open('r') as file:
        content = file.read()

    return len(content)

The elephant in the room

The mentioned function comes with a big problem and that is, that it opens the file itself and thus performs interaction with the file system.
This makes testing much harder than it needs to be. Because we want to test the logic around it and not that Path.open is working properly. Yet in order to do so, we have to deal with it.

Now, one could go ahead and have the function actually read a file, but this leads to follow up problems like:

Will the file be cleaned up after the test?
Will it also be cleaned up if something fails?
What if there is already a file with the needed name?

As mentioned before, we do not actually want to test Path.open so why not take it out of the equation completely?

About monkeypatching

Definition of monkeypatching according to wikipedia

In computer programming, monkey patching is a technique used to dynamically update the behavior of a piece of code at run-time.

We will see how to use monkeypatching to replace functionality with mock objects that make testing easier.

First step

Let us assume the following folder structure

mp_test/
    utils.py
    testing.py

where utils.py contains the code from above. As a first step, lets try to do this:

# testing.py
from mp_test.utils import read_file

def test_read_file():
    value = read_file('test_file')

Running this test will give us the expected error

FAILED testing.py::test_read_file - FileNotFoundError: [Errno 2] No such file or directory: 'test_file_foo'

Based on the documentation on the monkeypatch fixture lets try something like this:

# testing.py
from mp_test.utils import read_file
import pathlib

def test_read_file(monkeypatch):
    monkeypatch.setattr(pathlib, "Path", lambda x: print(x))
    value = read_file('test_file')

What did happen here?

First of all we are using the monkeypatch fixture and use it to replace how Path is working in pathlib. For starters, I just want it to print the name of the path that was given to it.

But there is a problem, when we execute this code, the same error like before does occur.

# pytest output
FAILED testing.py::test_read_file - FileNotFoundError: [Errno 2] No such file or directory: 'test_file_foo'

So what went wrong here? Did we do something wrong? Is monkey-patching just a scheme which does not really work?

Turns out, the problem is what (or better where) we have patched. To understand that, we should take a look at this part of the unittest.mock documentation which is linked in the pytest documentation:

patch() works by (temporarily) changing the object that a name points to with another one. There can be many names pointing to any individual object, so for patching to work you must ensure that you patch the name used by the system under test.

The basic principle is that you patch where an object is looked up, which is not necessarily the same place as where it is defined.

If we look again at the content of utils.py we see that Path is also imported there. So in order for the patch to work, you have to also patch it there.

# testing.py
from mp_test.utils import read_file
import mp_test

def test_read_file(monkeypatch):
    monkeypatch.setattr(mp_test.utils, "Path", lambda x: print(x))
    value = read_file('test_file')

This code leads to some progress ... a different error message.

# pytest output
--------------------- Captured stdout call ------------------
test_file_foo
========== short test summary info ================
FAILED testing.py::test_read_file - AttributeError: 'NoneType' object has no attribute 'open'

This error is also to be expected because we replaced Path with the print function. That is the reason why we can see the file path in the output.
A consequence is that name_path is set to None because that is the return value of print. This leads to the above error, when the code tries to call the open function of name_path.

Now that we know how to replace functions, we can go to the next step.

Cooking up mock objects

Our goal is to create a MockPath class which replaces the real Path and this mock should just return the path name when the open method is invoked.
So let's try something like this:

# testing.py
import mp_test
from mp_test.utils import read_file

class MockPath:
    def __init__(self, path: str) -> None:
        self.path = path

    def open(self, type) -> str:
        return self.path

def test_read_file(monkeypatch):
    monkeypatch.setattr(mp_test.utils, "Path", MockPath)
    value = read_file('test_file')

This is a step in the right direction but unfortunately it gives us a - somewhat - cryptic error:

FAILED mp_test/testing.py::test_read_file - TypeError: 'str' object does not support the context manager protocol

But a closer look at utils.py reveals the problem.

The output of Path.open is used as a context manager which has to provide the read method.

This means that we need another mock which has the methods __enter__ and __exit__ in order to define a context and a method read which returns the filename. So we will end up with something like this:

# testing.py
import mp_test
from mp_test.utils import read_file

class MockReader:
    def __init__(self, content: str) -> None:
        self.content = content

    def read(self) -> str:
        return self.content

    def __enter__(self):
        return self

    def __exit__(self, *args):
        pass

class MockPath:
    def __init__(self, path: str) -> None:
        self.path = path

    def open(self, type) -> MockReader:
        return MockReader(self.path)

def test_read_file(monkeypatch):
    monkeypatch.setattr(mp_test.utils, "Path", MockPath)
    value = read_file('test_file')
    assert value == 13

We have added the MockReader class which can be initialized with some content and when calling the open method, it returns said content. Upon entering, it returns itself and it does nothing upon exit.

With these mocks in place, our test actually succeeds.

Finishing touches

There is actually - at least - one problem with the test setup. Can you spot it?

The way it is approached now, we are not checking whether Path is invoked with the correct path. So we might end up in a situation where not the right file is loaded.

In order to test that, we need a way to see with which arguments MockPath has been initialized. One possible way to do that would be to use class variables which could look like this:

# testing.py
class MockPath:
    paths: list[str] = []

    def __init__(self, path: str) -> None:
        self.path = path
        self.paths.append(path)

    def open(self, type) -> MockReader:
        return MockReader(self.path)

def test_read_file(monkeypatch):
    monkeypatch.setattr(mp_test.utils, "Path", MockPath)
    value = read_file('test_file')
    assert value == 13
    assert 'test_file_foo' == MockPath.paths[0]

The shortcomings of the example

Let me be honest with you. Instead of writing unit tests like this, I would not have written create_file like this to begin with. Instead, I took it, because it is an easy to understand example.

Usually, I try to avoid having functions that mix logic - determining the name, processing the content - with the code that gets data from an interface. Instead, I like putting all logic into isolated functions that take "primitive" Python data types as input. This makes testing really easy.

But there is a middle ground between those two extremes and that would be using abstraction.

For example, one could do the following thing in utils.py

# utils.py
from pathlib import Path

class Reader:
    def __init__(self, path, mode):
        self.path = path
        self.mode = mode

    def __enter__(self):
        self.stream = Path(self.path).open(self.mode)
        return self.stream

    def __exit__(self):
        self.stream.close()

class ReaderContext:
    def __init__(self, path: str):
        self.path = Path(path)

    def open(self, mode):
        return Reader(self.path, mode)


def read_file(name: str, context):
    complete_name = name + "_foo"
    ctx = context(complete_name)
    with ctx.open('r') as file:
        content = file.read()

    return len(content)

Some things have changed here. Let's take a closer look.

The biggest change is that we now have two classes Reader and ReaderContext.

The naming is questionable, I know, please ignore that for now and focus on what is happening. The ReaderContext is just a wrapper around the Readerclass which is again a wrapper around Path.open. Also note that the signature of the read_file function has changed. A context is now passed in which is called.

All in all, this should seem familiar to you because it is very similar to the setup of our Mock objects.
These objects already where an abstraction of the functionality that we need to execute read_file. In the code above we are only wrapping the Path.open method in this abstraction.

Now we can also simplify our test, because we no longer need to mock anything. Instead we can pass our mock classes directly to read_file:

# testing.py
def test_read_file():
    value = read_file('test_file', MockPath)
    assert value == 13
    assert 'test_file_foo' == MockPath.paths[0]

One last comment, the code could probably be simplified to the point where only one class is needed instead of two, but I intentionally mirrored what was already there in the test. So it is easier to understand. Feel free to simplify the code yourself as an exercise.

Next steps

There is one benefit that the more complex implementation brings with it and that is the fact that we are now in a position where we could write other Reader and ReaderContext classes that can handle different data sources.

It would be cleaner to define abstract base classes for both and derive every other class from those. But this is outside of the scope of what I wanted to achieve with this article, so I will leave that for another time.

Conclusion

I hope that you found this short introduction to monkey-patching in your tests helpful. The focus was to touch upon different concepts without going too deep on any on them.

Feedback is always appreciated and let me know whether you would like to read more about certain topics in the future.

Top comments (0)

Some comments may only be visible to logged-in visitors. Sign in to view all comments.