An odd test failure | Colin Watson's blog
source link: https://www.chiark.greenend.org.uk/~cjwatson/blog/odd-test-failure.html
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
Dec 19 2017 An odd test failure
Weird test failures are great at teaching you things that you didn’t realise you might need to know.
As previously mentioned, I’ve been working on converting Launchpad from Buildout to virtualenv and pip, and I finally landed that change on our development branch today. The final landing was mostly quite smooth, except for one test failure on our buildbot that I hadn’t seen before:
ERROR: lp.codehosting.codeimport.tests.test_worker.TestBzrSvnImport.test_stacked worker ID: unknown worker (bug in our subunit output?) ---------------------------------------------------------------------- Traceback (most recent call last): _StringException: log: {{{ 36.384 creating repository in file:///tmp/testbzr-6CwSLV.tmp/lp.codehosting.codeimport.tests.test_worker.TestBzrSvnImport.test_stacked/work/stacked-on/.bzr/. 36.388 creating branch <bzrlib.branch.BzrBranchFormat7 object at 0xeb85b36c> in file:///tmp/testbzr-6CwSLV.tmp/lp.codehosting.codeimport.tests.test_worker.TestBzrSvnImport.test_stacked/work/stacked-on/ }}} Traceback (most recent call last): File "/srv/buildbot/lpbuildbot/lp-devel-xenial/build/lib/lp/codehosting/codeimport/tests/test_worker.py", line 1108, in test_stacked stacked_on.fetch(Branch.open(source_details.url)) File "/srv/buildbot/lpbuildbot/lp-devel-xenial/build/env/local/lib/python2.7/site-packages/bzrlib/branch.py", line 186, in open possible_transports=possible_transports, _unsupported=_unsupported) File "/srv/buildbot/lpbuildbot/lp-devel-xenial/build/env/local/lib/python2.7/site-packages/bzrlib/controldir.py", line 689, in open _unsupported=_unsupported) File "/srv/buildbot/lpbuildbot/lp-devel-xenial/build/env/local/lib/python2.7/site-packages/bzrlib/controldir.py", line 718, in open_from_transport find_format, transport, redirected) File "/srv/buildbot/lpbuildbot/lp-devel-xenial/build/env/local/lib/python2.7/site-packages/bzrlib/transport/__init__.py", line 1719, in do_catching_redirections return action(transport) File "/srv/buildbot/lpbuildbot/lp-devel-xenial/build/env/local/lib/python2.7/site-packages/bzrlib/controldir.py", line 706, in find_format probers=probers) File "/srv/buildbot/lpbuildbot/lp-devel-xenial/build/env/local/lib/python2.7/site-packages/bzrlib/controldir.py", line 1155, in find_format raise errors.NotBranchError(path=transport.base) NotBranchError: Not a branch: "/tmp/tmpdwqrc6/trunk/".
When I investigated this locally, I found that I could reproduce it if I ran just that test on its own, but not if I ran it together with the other tests in the same class. That’s certainly my favourite way round for test isolation failures to present themselves (it’s more usual to find state from one test leaking out and causing another one to fail, which can make for a very time-consuming exercise of trying to find the critical combination), but it’s still pretty odd.
I stepped through the Branch.open
call in each case in the hope of some
enlightenment. The interesting difference was that the custom probers
installed by the bzr-svn
plugin weren’t installed when I ran that one test
on its own, so it was trying to open a branch as a Bazaar branch rather than
using the foreign-branch logic for Subversion, and this presumably depended
on some configuration that only some tests put in place. I was on the verge
of just explicitly setting up that plugin in the test suite’s setUp
method, but I was still curious about exactly what was breaking this.
Launchpad installs several Bazaar plugins, and
lib/lp/codehosting/__init__.py
is responsible for putting most of these in
place: anything in Launchpad itself that uses Bazaar is generally supposed
to do something like import lp.codehosting
to set everything up. I
therefore put a breakpoint at the top of lp.codehosting
and stepped
through it to see whether anything was going wrong in the initial setup.
Sure enough, I found that bzrlib.plugins.svn
was failing to import due to
an exception raised by bzrlib.i18n.load_plugin_translations
, which was
being swallowed silently but meant that its custom probers weren’t being
installed. Here’s what that function looks like:
def load_plugin_translations(domain): """Load the translations for a specific plugin. :param domain: Gettext domain name (usually 'bzr-PLUGINNAME') """ locale_base = os.path.dirname( unicode(__file__, sys.getfilesystemencoding())) translation = install_translations(domain=domain, locale_base=locale_base) add_fallback(translation) return translation
In this case, sys.getfilesystemencoding
was returning None
, which isn’t
a valid encoding
argument to unicode
. But why would that be? It gave
me a sensible result when I ran it from a Python shell in this environment.
A bit of head-scratching later and it occurred to me to look at a backtrace:
(Pdb) bt /home/cjwatson/src/canonical/launchpad/lp-branches/testfix/env/lib/python2.7/site.py(703)<module>() -> main() /home/cjwatson/src/canonical/launchpad/lp-branches/testfix/env/lib/python2.7/site.py(694)main() -> execsitecustomize() /home/cjwatson/src/canonical/launchpad/lp-branches/testfix/env/lib/python2.7/site.py(548)execsitecustomize() -> import sitecustomize /home/cjwatson/src/canonical/launchpad/lp-branches/testfix/env/lib/python2.7/sitecustomize.py(7)<module>() -> lp_sitecustomize.main() /home/cjwatson/src/canonical/launchpad/lp-branches/testfix/lib/lp_sitecustomize.py(193)main() -> dont_wrap_bzr_branch_classes() /home/cjwatson/src/canonical/launchpad/lp-branches/testfix/lib/lp_sitecustomize.py(139)dont_wrap_bzr_branch_classes() -> import lp.codehosting > /home/cjwatson/src/canonical/launchpad/lp-branches/testfix/lib/lp/codehosting/__init__.py(54)<module>() -> load_plugins([_get_bzr_plugins_path()])
I wonder if there’s something interesting about being imported from a
sitecustomize
hook? Sure enough, when I went to look at Python for where
sys.getfilesystemencoding
is set up, I found this in Py_InitializeEx
:
if (!Py_NoSiteFlag) initsite(); /* Module site */ ... #if defined(Py_USING_UNICODE) && defined(HAVE_LANGINFO_H) && defined(CODESET) /* On Unix, set the file system encoding according to the user's preference, if the CODESET names a well-known Python codec, and Py_FileSystemDefaultEncoding isn't initialized by other means. Also set the encoding of stdin and stdout if these are terminals, unless overridden. */ if (!overridden || !Py_FileSystemDefaultEncoding) { ... }
I moved this out of
sitecustomize,
and it’s working better now. But did you know that a sitecustomize
hook
can’t safely use anything that depends on sys.getfilesystemencoding
? I
certainly didn’t, until it bit me.
Posted by Colin Watson on 2017-12-19 in launchpad. Tags: launchpad, planet-debian, planet-ubuntu.
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK