Blogging Ottinger (tim)

2007-June-1

C++ Core Dump Most Puzzling.

Filed under: Programming

I’m working with an unusually bad bit of code for a customer (over 1200 lines in the .h, over 3200 in the .cpp, a .cpp that pulls in over 200 other files, and a compile that takes over 15 minutes) and trying to get it into a testable state so we can complete a project and do some real refactoring. Some obvious things are already done, such as splitting the .cpp into two files so that the parts I’m working on can compile and link quickly.

When I get to the next obvious thing — making a function virtual so I can stub it in a derived test class — the craziness starts. Apparently the file has broken my g++ compiler so that I cannot do virtual dispatch. I can’t post the code here, and don’t own it so I can’t release it, but check this out (names changed to protect the innocent):

  CrazyClass c;           // Okay
  c.virtualFunction();    // Okay (not really virtual dispatch)
  CrazyClass *p = &c;     // Okay.
  p->virtualFunction();   // Core dump, never arrives in function

This has been checked with gdb/ddd on the core file, in live runs, etc.

The crazy thing is that CrazyClass derives from nothing, and has no derivatives. Just putting the word ‘virtual’ on the front of the function causes the seg fault & core dump. It also has no overloaded operators, and virtualFunction() and the destructor are the only virtual functions period. The seg fault happens as well if I call a non-virtual function that calls the virtualFunction().
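For reference, the kind of stubbing that the virtual keyword is meant to enable might look like the sketch below. It is not the customer code: the class body, the `doubled()` caller, the `CrazyClassStub` name, and the values 42 and 7 are all made up for illustration. On a healthy toolchain every assertion holds, including the call through a base pointer.

```cpp
#include <cassert>

// Stand-in for the real class: one int member behind a virtual getter.
class CrazyClass {
public:
    CrazyClass() : x(42) {}
    virtual ~CrazyClass() {}
    virtual int virtualFunction() { return x; }
    // Hypothetical non-virtual caller; it reaches the getter through the vtable.
    int doubled() { return 2 * virtualFunction(); }
private:
    int x;
};

// Hypothetical test double: overrides only the getter.
class CrazyClassStub : public CrazyClass {
public:
    virtual int virtualFunction() { return 7; }
};

// Exercises the direct call, the call through a pointer, and the indirect call.
void checkStubDispatch() {
    CrazyClass real;
    CrazyClassStub stub;
    CrazyClass *p = &stub;

    assert(real.virtualFunction() == 42);  // direct call on a known type
    assert(p->virtualFunction() == 7);     // virtual dispatch picks the stub
    assert(stub.doubled() == 14);          // non-virtual caller sees the stub too
}
```

Calling `checkStubDispatch()` from a test's `main` is enough to run it. If even this small file crashes on the pointer call, the problem is in the toolchain or build flags rather than in the big file.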

Crazier yet is that virtualFunction is really just a getter. It returns the value of a member variable, sort of like:

class CrazyClass {
private:
    int x;
public:
    virtual ~CrazyClass();         // the only other virtual function
    virtual int virtualFunction(); // { return this->x; } in .cpp file
};

I added a print to the getter, and it prints nicely if you’re not using a pointer, never prints if you do use a pointer. In debug, it never arrives at the print statement. It doesn’t get to the function at all, apparently. It looks like the vtable is bad.
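One way to test the bad-vtable hunch is to look at the vtable pointer stored in the object itself. A call on a local object of known type can be resolved at compile time, while a call through a pointer has to load that pointer from the object first, so an object whose vtable pointer has been wiped or overwritten would fit the symptom. The sketch below is only that, a sketch: `vptrLooksSet` is a made-up helper, the class is a stand-in, and it assumes the vtable pointer is the first pointer-sized word of the object, which is how g++ lays out a polymorphic class with no bases.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for the real class.
class CrazyClass {
public:
    CrazyClass() : x(42) {}
    virtual ~CrazyClass() {}
    virtual int virtualFunction() { return x; }
private:
    int x;
};

// ASSUMPTION: the vtable pointer is the first pointer-sized word of the object
// (g++ layout for a polymorphic class with no base classes).
// Copies that word out with memcpy and reports whether it is non-null.
bool vptrLooksSet(const void *object) {
    void *firstWord = 0;
    std::memcpy(&firstWord, object, sizeof firstWord);
    return firstWord != 0;
}

void checkVptr() {
    CrazyClass healthy;
    assert(vptrLooksSet(&healthy));

    // An all-zero image of the same size models an object whose
    // vtable pointer has been wiped.
    unsigned char wiped[sizeof(CrazyClass)];
    std::memset(wiped, 0, sizeof wiped);
    assert(!vptrLooksSet(wiped));
}
```

In a real session the helper would be called on `p` just before `p->virtualFunction()`. A null result would point at something clearing the object after construction. A non-null result says only that the word is set, not that it points at the right table.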

I dumped the symbol table on the .o and didn’t see anything obviously bad. I checked the symbol tables for the two .cpp files, and saw that there was only one vtable. I put both the virtual functions into the same file, on the off chance that the compiler was so naive that it would not link up the methods if they were in separate .o files (a ridiculous thought, but it’s a ridiculous bug).

I figure that this big ugly file must have broken my compiler in a very obscure way if it doesn’t fail at compile time or at link time, and is still wrong wrong wrong. I want to fix it, and have tried a lot of crazy things with no success. If you know of any bugs that are likely to manifest this way, I am all ears.

2 Comments

  1. Here are some idle thoughts. Sorry if these have an “are you sure it’s plugged in?” vibe.

    Precompiled headers are more of a Windows thing, but they can be done with gcc. That used to always be the first thing I’d kill when I ran into something like this. Check to make sure your project isn’t insisting on precompiled headers.

    Don’t assume that the deps are quite right in the build system. Try completely cleaning the build tree, and rebuilding from scratch, after adding the “virtual” keyword. I would certainly expect to see segfaults if a stray .o didn’t get rebuilt after changing the virtuality of a class. (Though I probably would have expected a build failure, first; isn’t virtual status represented in the mangled name?)

    Grep for CrazyClass. I’ve seen it before where a class was defined twice, in two different headers, with the usual fun and games if the two definitions get out of sync with each other or the implementation.

    Comment by Jeff Licquia — 2007-June-1 @ 07:32

  2. Thanks for the tips, Jeff.

    I didn’t find a multiple-def, and we had done clean builds, and we don’t use any precompiled headers for this project though I have considered it. It is so slow, anything would be an improvement (including getting off of net mounts for our builds).

    If we can get this thing under test, there is a lot of refactoring on the way.

    Comment by Tim — 2007-June-1 @ 09:25
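The duplicate-definition theory from the first comment can also be checked mechanically rather than by grep. The sketch below assumes a C++11 compiler, which is newer than the toolchain discussed in the post, and uses a stand-in class. The idea is to put the `static_assert` in each .cpp that uses `CrazyClass`, after its includes, so that any file seeing a definition without virtual functions fails to compile. It does nothing for the stale-object theory, since a stale .o is never recompiled.

```cpp
#include <type_traits>

// Stand-in for the definition this translation unit picked up from its headers.
class CrazyClass {
public:
    CrazyClass() : x(0) {}
    virtual ~CrazyClass() {}
    virtual int virtualFunction() { return x; }
private:
    int x;
};

// Compile-time check (C++11): stops the build in any translation unit
// whose view of CrazyClass has no virtual functions.
static_assert(std::is_polymorphic<CrazyClass>::value,
              "this translation unit sees a non-virtual CrazyClass");
```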
