Mocking the unmockable
Mocking is often challenging. Here is a situation I always tried to work around instead of solving: a function that is called when a module is imported.
The Problem
Sometimes there is a need to call a function at import time. It might be connecting to a database or retrieving secure data from KMS. I've created an example scenario. To replicate it, create a project with Poetry:
➜ Medium > poetry new mocking_problem
➜ Medium > cd mocking_problem
Then add the following lib module with a test. The get_data() function checks for the CACHED_DATA environment variable and returns its contents. Otherwise, it raises a NotImplementedError to emphasize the code we don't want to run in the tests.
# mocking_problem/lib.py
import json
import os


def get_data():
    if os.environ.get("CACHED_DATA"):
        return json.loads(os.environ["CACHED_DATA"])
    raise NotImplementedError()
# tests/test_lib.py
from unittest import mock

import pytest

from mocking_problem.lib import get_data


def test_get_data_raises():
    with pytest.raises(NotImplementedError):
        get_data()


@mock.patch("mocking_problem.lib.os")
def test_get_data_from_environment(mock_os):
    mock_os.environ = {"CACHED_DATA": '{"cached": "data"}'}
    assert get_data() == {"cached": "data"}
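As an alternative to replacing the whole os module, the environment itself can be patched with mock.patch.dict. The following is a minimal sketch, not the approach used in the rest of this article. It repeats get_data() so that it runs standalone, and the two check_* helper names are hypothetical.

```python
import json
import os
from unittest import mock


def get_data():
    # Same logic as mocking_problem/lib.py, repeated so the sketch runs standalone.
    if os.environ.get("CACHED_DATA"):
        return json.loads(os.environ["CACHED_DATA"])
    raise NotImplementedError()


def check_get_data_from_environment():
    # patch.dict sets the variable inside the block and restores os.environ afterwards.
    with mock.patch.dict(os.environ, {"CACHED_DATA": '{"cached": "data"}'}):
        return get_data()


def check_get_data_raises():
    # clear=True empties os.environ inside the block, so CACHED_DATA is absent.
    with mock.patch.dict(os.environ, {}, clear=True):
        try:
            get_data()
        except NotImplementedError:
            return True
    return False
```

With this variant the real os module stays in place, so only the environment variable differs between the two checks.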
The action module depends on lib. It calls get_data() and then passes the received data to the use_data() function.
# mocking_problem/action.py
from logging import getLogger

from mocking_problem.lib import get_data

logger = getLogger()


def use_data(data):
    logger.info(data)


received_data = get_data()
use_data(received_data)
# tests/test_action.py
from unittest import mock

from mocking_problem.action import use_data


@mock.patch("mocking_problem.action.logger")
def test_use_data(mock_logger):
    use_data({"some": "data"})
    mock_logger.info.assert_called_with({"some": "data"})
If we try to run pytest, it fails because get_data() is called as soon as use_data() is imported from the mocking_problem.action module.
➜ mocking_problem > poetry run pytest
============================ test session starts ============================
platform darwin -- Python 3.11.2, pytest-7.2.2, pluggy-1.0.0
rootdir: /Users/zalun/Projects/Medium/mocking_problem
collected 1 item / 1 error
================================== ERRORS ===================================
___________________ ERROR collecting tests/test_action.py ___________________
tests/test_action.py:2: in <module>
from mocking_problem.action import use_data
mocking_problem/action.py:11: in <module>
received_data = get_data()
mocking_problem/lib.py:8: in get_data
raise NotImplementedError()
E NotImplementedError
========================== short test summary info ==========================
ERROR tests/test_action.py - NotImplementedError
!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!
============================= 1 error in 0.04s ==============================
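The collection error follows from how imports work: importing a module executes all of its top-level statements, so a decorator applied to a test function comes too late to patch anything. A minimal sketch of that behavior, using a throwaway module named side_effect_demo (a hypothetical name, written to a temporary directory only for this illustration):

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical throwaway module; its top-level call runs as soon as it is imported.
MODULE_SOURCE = '''
calls = []


def get_data():
    calls.append("get_data")
    return {"received": "data"}


received_data = get_data()  # executed at import time
'''

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "side_effect_demo.py").write_text(MODULE_SOURCE)
    sys.path.insert(0, tmp)
    try:
        demo = importlib.import_module("side_effect_demo")
    finally:
        sys.path.remove(tmp)

# The function has already run once, before any test code could patch it.
import_time_calls = list(demo.calls)
```

Nothing in the sketch calls get_data() explicitly, yet the call is recorded; the import alone triggers it.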
The Workaround
The usual workaround might be to provide the CACHED_DATA environment variable for the tests. We can do it using the very helpful pytest-dotenv package.
➜ mocking_problem > poetry add --group dev pytest-dotenv
# pytest.ini
[pytest]
env_files =
    .test.env
# .test.env
CACHED_DATA='{"default": "data"}'
Now the failing test is in test_lib.py, as we no longer raise the NotImplementedError. The CACHED_DATA environment variable is there for all the tests.
➜ mocking_problem > poetry run pytest
============================ test session starts ============================
platform darwin -- Python 3.11.2, pytest-7.2.2, pluggy-1.0.0
rootdir: /Users/zalun/Projects/Medium/mocking_problem, configfile: pytest.ini
plugins: dotenv-0.5.2
collected 2 items
tests/test_action.py . [ 33%]
tests/test_lib.py F. [100%]
================================= FAILURES ==================================
___________________________ test_get_data_raises ____________________________
def test_get_data_raises():
> with pytest.raises(NotImplementedError):
E Failed: DID NOT RAISE <class 'NotImplementedError'>
tests/test_lib.py:7: Failed
========================== short test summary info ==========================
FAILED tests/test_lib.py::test_get_data_raises - Failed: DID NOT RAISE <class 'NotImplementedError'>
======================== 1 failed, 1 passed in 0.03s ========================
We can mock os in test_get_data_raises() as well and make os.environ.get() return None.
# tests/test_lib.py
from unittest import mock

import pytest

from mocking_problem.lib import get_data


@mock.patch("mocking_problem.lib.os")
def test_get_data_raises(mock_os):
    mock_os.environ.get.return_value = None
    with pytest.raises(NotImplementedError):
        get_data()


@mock.patch("mocking_problem.lib.os")
def test_get_data_from_environment(mock_os):
    mock_os.environ = {"CACHED_DATA": '{"cached": "data"}'}
    assert get_data() == {"cached": "data"}
The tests pass, and we can have a drink in the company relaxing chair.
➜ mocking_problem > poetry run pytest
============================ test session starts ============================
platform darwin -- Python 3.11.2, pytest-7.2.2, pluggy-1.0.0
rootdir: /Users/zalun/Projects/Medium/mocking_problem, configfile: pytest.ini
plugins: dotenv-0.5.2
collected 3 items
tests/test_action.py . [ 33%]
tests/test_lib.py .. [100%]
============================= 3 passed in 0.02s =============================
But the problem hasn't gone away: get_data() is still called on import, and we're unable to test whether it was called.
Solution 1
We can rewrite test_action.py and make it use the mocked get_data() on import. We will patch sys.modules with a mocked lib module. We also need to patch the logging module, as it is used on import as well.
# tests/test_action.py
from unittest import mock
import sys

# Prepare mocked lib module
mock_lib = mock.Mock()
mock_lib.get_data.return_value = {"received": "data"}

# Mock sys.modules with the mocked lib and logger
with mock.patch.dict(
    sys.modules, **{"mocking_problem.lib": mock_lib, "logging": mock.Mock()}
):
    from mocking_problem.action import (
        received_data,
        use_data,
        logger as mock_logger,
    )
    from mocking_problem.lib import get_data as mock_get_data


def test_action_get_data_is_called_on_import():
    assert received_data == {"received": "data"}
    mock_get_data.assert_called_once()


def test_action_use_data_is_called_on_import():
    mock_logger.info.assert_called_once_with({"received": "data"})


def test_use_data_calls_logger():
    mock_logger.info.reset_mock()
    use_data({"some": "data"})
    mock_logger.info.assert_called_once_with({"some": "data"})
It is a good enough solution. We can test the effect of calling use_data() on import, but we can't check whether it was actually called: the test would still pass if someone changed the code to log the received data directly in the module.
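The sys.modules technique can also be tried in isolation. The sketch below is only an illustration under stated assumptions: demo_lib and demo_action are hypothetical stand-ins for lib.py and action.py, written to a temporary directory so the example runs on its own.

```python
import importlib
import sys
import tempfile
from pathlib import Path
from unittest import mock

# Hypothetical stand-ins for lib.py and action.py.
LIB_SOURCE = "def get_data():\n    raise NotImplementedError()\n"
ACTION_SOURCE = "from demo_lib import get_data\nreceived_data = get_data()\n"

mock_lib = mock.Mock()
mock_lib.get_data.return_value = {"received": "data"}

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "demo_lib.py").write_text(LIB_SOURCE)
    Path(tmp, "demo_action.py").write_text(ACTION_SOURCE)
    sys.path.insert(0, tmp)
    try:
        # While the patch is active, "import demo_lib" resolves to the mock,
        # so the real get_data() never runs.
        with mock.patch.dict(sys.modules, {"demo_lib": mock_lib}):
            demo_action = importlib.import_module("demo_action")
    finally:
        sys.path.remove(tmp)

# patch.dict restored sys.modules on exit, so neither module stays cached.
still_cached = "demo_lib" in sys.modules or "demo_action" in sys.modules
```

Because patch.dict restores sys.modules when the block exits, a later plain import of the action module would execute it again against the real lib. The names bound inside the with block keep pointing at the objects created under the mock.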
Solution 2 (with refactoring the action module)
We want to mock get_data() and use_data(), import the action.py module, and check that both were called.
This time we will refactor the action.py module and move use_data() into another file. For simplicity, we will place it in lib.py:
# mocking_problem/lib.py
import json
import os
from logging import getLogger

logger = getLogger()


def get_data():
    if os.environ.get("CACHED_DATA"):
        return json.loads(os.environ["CACHED_DATA"])
    raise NotImplementedError()


def use_data(data):
    logger.info(data)
And action.py will import it along with get_data():
# mocking_problem/action.py
from mocking_problem.lib import get_data, use_data
received_data = get_data()
use_data(received_data)
The tests will change, and testing use_data() will now happen in test_lib.py. We no longer need test_action.py for that purpose.
# tests/test_lib.py
import pytest
from unittest import mock

from mocking_problem.lib import get_data, use_data


@mock.patch("mocking_problem.lib.os")
def test_get_data_raises(mock_os):
    mock_os.environ.get.return_value = None
    with pytest.raises(NotImplementedError):
        get_data()


@mock.patch("mocking_problem.lib.os")
def test_get_data_from_environment(mock_os):
    mock_os.environ = {"CACHED_DATA": '{"cached": "data"}'}
    assert get_data() == {"cached": "data"}


@mock.patch("mocking_problem.lib.logger")
def test_use_data(mock_logger):
    use_data({"some": "data"})
    mock_logger.info.assert_called_with({"some": "data"})
The tests are working, but we still need to find out whether the functions in action.py are called in the right way on import.
We will patch sys.modules with a mocked lib module. Imports in action will now use the mocked version.
# tests/test_action.py
from unittest import mock
import sys

mock_lib = mock.Mock()
# We need to set the return_value before importing the action module.
mock_lib.get_data.return_value = {"received": "data"}

with mock.patch.dict(sys.modules, **{"mocking_problem.lib": mock_lib}):
    from mocking_problem.action import received_data
    from mocking_problem.lib import (
        get_data as mock_get_data,
        use_data as mock_use_data,
    )


def test_action_received_data_instantiated_with_mocked_value():
    assert received_data == {"received": "data"}


def test_action_get_data_is_called_on_import():
    mock_get_data.assert_called_once()


def test_action_use_data_is_called_on_import():
    mock_use_data.assert_called_once_with({"received": "data"})
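Since both functions are children of the same mock_lib, the parent mock also records the order in which they were called. If the order matters, it could be checked through mock_calls. A minimal sketch, in which the two direct calls stand in for what action.py does at import time:

```python
from unittest import mock

mock_lib = mock.Mock()
mock_lib.get_data.return_value = {"received": "data"}

# Stand-in for the import-time code in action.py.
received_data = mock_lib.get_data()
mock_lib.use_data(received_data)

# The parent mock records calls made on its children, in order.
expected_calls = [
    mock.call.get_data(),
    mock.call.use_data({"received": "data"}),
]
calls_in_order = mock_lib.mock_calls == expected_calls
```

In tests/test_action.py the same comparison could be made against the mock_lib object defined at the top of the file.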
The tests pass. We are testing all the functions, and we can see the right functions being called with the right values when the action module is imported.
➜ mocking_problem > poetry run pytest
============================ test session starts ============================
platform darwin -- Python 3.11.2, pytest-7.2.2, pluggy-1.0.0
rootdir: /Users/zalun/Projects/Medium/mocking_problem, configfile: pytest.ini
plugins: dotenv-0.5.2
collected 6 items
tests/test_action.py ... [ 50%]
tests/test_lib.py ... [100%]
============================= 6 passed in 0.03s =============================
Next Steps
It is possible to modify the solution in multiple ways. There might be a need to leave some of the functions from lib in their original state. In that case, instead of creating the module as a mock:
# tests/test_action.py
# ...
mock_lib = mock.Mock()
mock_lib.get_data.return_value = {"received": "data"}
# ...
we can import it and replace some of the functions with mocks, leaving the rest of the test untouched:
# tests/test_action.py
# ...
import mocking_problem.lib as lib
lib.get_data = mock.Mock()
lib.get_data.return_value = {"received": "data"}
lib.use_data = mock.Mock()
# ...
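One caveat with direct assignment: the replaced attributes stay on the real module for the rest of the test session, which may affect other test files that import the same functions. A possible variation is mock.patch.object, which restores the original attribute when the block exits. The sketch below uses a hypothetical stand-in module built with types.ModuleType (demo_lib and helper are made-up names) so that it runs on its own.

```python
import types
from unittest import mock

# Hypothetical stand-in for mocking_problem.lib with two functions.
lib = types.ModuleType("demo_lib")
lib.get_data = lambda: {"real": "data"}
lib.helper = lambda value: value.upper()

original_get_data = lib.get_data

with mock.patch.object(
    lib, "get_data", return_value={"received": "data"}
) as mock_get_data:
    inside = lib.get_data()         # mocked
    untouched = lib.helper("kept")  # real implementation still in place

# After the block, the original function is back on the module.
restored = lib.get_data is original_get_data
```

In test_action.py, the import of the action module would go inside such a with block, so the mocked functions are in place while the import-time code runs.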