The case of the teleported static lib functions

This bug came back to bite my teammates today, even though I resolved it some time ago. Because it’s quite an interesting bug, I decided to write it down. There is a particular well-known middleware library that we use, and it comes in a few modules, each packaged as a static lib. Let’s just call two of them Fundamental and Extras, because one provides some fundamental functionality, and the other provides extra and (to some people) optional functionality. The latter depends on the former. There are 2 Visual Studio (VS) solutions, one with just module Fundamental, and the other with both modules, each in its own VS project. (So Fundamental.sln has Fundamental.vcxproj, while Extras.sln has both Fundamental.vcxproj and Extras.vcxproj.)

On one particular platform, we added a slight modification (2 new initialisation functions) to module Fundamental, and none to module Extras. Then we compiled module Fundamental from its standalone solution and linked it to our code, without any warnings or errors. All was good. Then one day, we were forced to upgrade to a newer version of this middleware due to some technical requirements. So we applied the same changes, built everything without problems, and ran the project. Bam, it crashed after the module initialisation, with some nonsensical pointer values, which obviously meant our initialisation code wasn’t working. But because the crash happened after the init function returned, and not in the init function itself, we couldn’t tell what was wrong straight away. I can’t remember what happened when I stepped through the init function, other than that it all seemed fine.

We tried switching back to the default init code, and all was well. We went through the usual steps of proofreading the code, checking the documentation and change log, etc., and it all seemed fine. Then we tried to apply our changes step by step. First, we copied the body of the default init function initSDK() exactly into our customInitSDK(). Bam! Crash again. What?! It’s byte for byte the same code as the working default one. We tried all kinds of funny things after that. For example, we renamed the function in the lib to something else but still called it as customInitSDK() from our code, and the program could still link against the lib! Eventually, we guessed that our customInitSDK() wasn’t the one being linked in. But why?

So I opened up the compiler and linker documentation, looked through all the options available, and eventually settled on enabling the linker map output, which makes the linker report which object file each symbol is linked from. That confirmed that our customInitSDK() wasn’t being linked in, even though I had checked that it was in the static lib. I tried many other things, and in the end I found that it finally worked when I compiled both modules, Fundamental and Extras, using the VS solution that contains both. And what did the linker map say? It said that all the functions of the Fundamental module also appeared inside the Extras static lib, and everything was being linked from there! What the?! Anyway, we left it at that since it worked fine. We only knew what had happened, but not why it happened this way.
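For anyone who wants to repeat the linker map check, here is a rough sketch of how the lookup could be automated. It assumes an MSVC-style map file (the kind the /MAP linker option writes), where each public symbol line starts with a section:offset address, has the decorated symbol name as its second column, and ends with a Lib:Object column. The helper name findSymbolOrigins and the struct are made up for illustration, and the toolchain on your platform may lay out its map differently, so treat the column positions as assumptions.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// One hit from the map: the decorated symbol and the "Lib:Object" column.
// (Hypothetical type, not part of any SDK.)
struct SymbolOrigin {
    std::string symbol;
    std::string origin;
};

// Scans linker-map text and returns every public-symbol line whose symbol
// column contains `needle`.
// Assumption: columns are whitespace-separated, the first looks like
// "0001:00000010", the second is the symbol, and the last is "Lib:Object".
std::vector<SymbolOrigin> findSymbolOrigins(std::istream& map,
                                            const std::string& needle) {
    std::vector<SymbolOrigin> hits;
    std::string line;
    while (std::getline(map, line)) {
        // Split the line into whitespace-separated tokens.
        std::istringstream fields(line);
        std::vector<std::string> tokens;
        for (std::string tok; fields >> tok;) {
            tokens.push_back(tok);
        }
        // Symbol lines have at least: address, name, Rva+Base, Lib:Object.
        if (tokens.size() < 4) continue;
        // Skip headers and other lines that do not start with an address.
        if (tokens[0].find(':') == std::string::npos) continue;
        // Substring match, so an undecorated name finds the decorated one.
        if (tokens[1].find(needle) == std::string::npos) continue;
        hits.push_back({tokens[1], tokens.back()});
    }
    return hits;
}
```

To use it, open the map file with std::ifstream, pass the stream in together with "customInitSDK", and print the origin of each hit. In our case the origin column would have named the Extras lib instead of the Fundamental one, which is the surprise described above.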

But this bug came back today because we were working in another branch, and we had not rebuilt the two modules’ static libs in that branch. The poor guys working on it wasted quite a few hours trying to figure out what was happening. They thought they had done everything correctly, and I had forgotten about this bug after so many months. It was only after they came and discussed it with me for a while that I suddenly recalled it.

Takeaway #1 – The many compiler and linker options, such as the linker map, can be useful for tracking down errors like this.
Takeaway #2, and this is important – When any bug is resolved, especially a strange one like this, do document the cause and solution properly, and put it in a place where the next soul working on it has a chance of seeing it!
