Category GDB

Stacks, Frames and GDB

One of the things I have been working on in the Python API (hopefully for 7.5) is frame-printing and filtering in GDB. This is where we allow customization when printing a frame. The most common example of this is where the user wants a “backtrace”: GDB prints each stack frame up to the point where the program stopped executing.

What does this mean? What is frame-printing and filtering anyway?

A little history. This is not a new concept. This capability already exists in the Fedora-shipped versions of GDB. It is, however, written as a number of Python wrappers, and is a utility rather than “true” integration with GDB. Because of that you cannot customize existing GDB commands (like backtrace, for example).

What we want to do is tightly integrate it with GDB internals, so that in every instance where GDB needs to print a frame, the user can intercept and customize this action.

What is frame-printing?

Each frame is made up of a number of individual components, which come together to form the printed frame. Take this snippet of an example backtrace from GDB:


#0  0x00000038ebce6ab8 in poll () from /lib64/libc.so.6
#1  0x00000000005952bb in gdb_wait_for_event (block=1) at ../../archer/gdb/event-loop.c:863
#2  0x000000000059559a in gdb_do_one_event () at ../../archer/gdb/event-loop.c:461
#3  0x0000000000595735 in start_event_loop () at ../../archer/gdb/event-loop.c:490

That is a fairly typical backtrace. In that example there is the frame number, address, function name, arguments, and location in the source. This is how GDB prints each frame. Currently, you can modify how GDB prints each frame with a few modifiers such as “full”, but it is fairly limited. For frame printing, there are two aspects we want to allow:

  • Allow customization of each element in the frame.

This means calling a Python object each time a frame is ready to be printed. This will be similar to how value printers currently work. We keep three registration lists: one for the current object-file, one for the current program-space, and one global list. Frame formatters/filters self-register in whatever lists are appropriate. When GDB is ready to print a frame, that frame is passed to each element in the lists until one shows interest in printing it. If there are no interested objects, GDB prints it the old-fashioned way.
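The lookup described above can be sketched roughly as follows. This is plain Python with stand-in objects, not GDB's real API: the frame dictionaries, the `wants` method, and the function names are all hypothetical, and the search order (object-file, then program-space, then global) is an assumption modeled on how pretty-printers are selected.

```python
# Illustrative only: plain-Python stand-ins, not GDB's real API.

class InterpreterFrameFormatter:
    """Hypothetical formatter interested only in interpreter-loop frames."""

    def wants(self, frame):
        # "Showing interest" is modeled as a simple predicate on the frame.
        return frame["function"].startswith("PyEval_")


def find_formatter(frame, objfile_list, progspace_list, global_list):
    """Return the first formatter that shows interest in the frame, or None."""
    # Assumed order: most specific list first, global list last.
    for registry in (objfile_list, progspace_list, global_list):
        for formatter in registry:
            if formatter.wants(frame):
                return formatter
    return None


def default_print(frame):
    """Stand-in for GDB printing the frame the old-fashioned way."""
    return "#{level}  {function} ()".format(**frame)


def print_frame(frame, objfile_list, progspace_list, global_list):
    formatter = find_formatter(frame, objfile_list, progspace_list, global_list)
    if formatter is None:
        return default_print(frame)
    # A real formatter would supply each element; here we only tag the frame.
    return "#{}  [python] {}".format(frame["level"], frame["function"])
```

A frame that no registered object claims falls through to `default_print`, which mirrors the fallback behaviour described in the paragraph above.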

When a frame formatter/filter object shows interest in that frame we call several methods in that object. The GDB frame is always passed to the Python object as a point of reference, so it may interrogate the current frame’s data. We don’t really care how the object is constructed, just as long as we can call several methods that the object contains. So for the “function” element, we could call the method in the object that describes the “function” in the backtrace. Each element in the frame that needs to be printed will have a corresponding method call. This allows the user to customize the data presented in the backtrace.
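To make the "one method per element" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `FakeFrame` stands in for a real GDB frame, and the method names (`level`, `address`, `function`, `location`) are hypothetical rather than a settled interface.

```python
class FakeFrame:
    """Stand-in for a GDB frame; a real one would be supplied by GDB."""

    def __init__(self, level, pc, name, filename, line):
        self.level = level
        self.pc = pc
        self.name = name
        self.filename = filename
        self.line = line


class UppercaseFunctionFormatter:
    """Hypothetical formatter: one method per printed frame element."""

    def __init__(self, frame):
        # The frame is kept as a point of reference for every method.
        self.frame = frame

    def level(self):
        return self.frame.level

    def address(self):
        return "0x{:016x}".format(self.frame.pc)

    def function(self):
        # The only customized element: shout the function name.
        return self.frame.name.upper()

    def location(self):
        return "{}:{}".format(self.frame.filename, self.frame.line)


def render(formatter):
    """Assemble the printed frame by calling each element method in turn."""
    return "#{}  {} in {} () at {}".format(
        formatter.level(),
        formatter.address(),
        formatter.function(),
        formatter.location(),
    )


frame = FakeFrame(2, 0x59559A, "gdb_do_one_event",
                  "../../archer/gdb/event-loop.c", 461)
print(render(UppercaseFunctionFormatter(frame)))
```

Because the caller only relies on the methods being present, any object shape would do, which matches the point above that we don't care how the object is constructed.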

  • Allow ad-hoc options

The next problem is that if we want to allow customization of printing, we have to somehow allow ad-hoc options to be passed to existing built-in GDB commands like “backtrace”. This also has a bearing on the second aspect, frame filtering. But for now, allowing script writers to take options beyond the usual built-in ones is important too.

What is frame-filtering?

Currently backtraces are printed in a sequential fashion. They have been printed this way pretty much since GDB was first written (and, I would hazard a guess, in most other debuggers too). This seemed to be the most straightforward and useful way to present the flow of execution to a user. But, if you think about it, that is not necessarily always the case. There are a lot of frames, and data, that the user has no interest in, or there are several frames whose data is contextually more useful presented as one frame. Anyone who hacks on the Python C API knows this ;)

The concept of frame-filtering is best described as “replacement” synthetic frames, and “replaced” frames. For example, take a scripting language. Currently there may be three frames that describe one atomic action in the scripting side of the language:

  • frame 1
  • frame 2
  • frame 3

These are useful from a contextual view of how the interpreter is preparing and constructing that one script-side operation, but not very useful from a presentation perspective of what that one operation really is. What if we could organize that a little better? With frame-filters we allow you to create the concept of a “replacement” synthetic frame, and “replaced” children:

  • synthetic frame (ie does not really exist)
    • replaced frame 1
    • replaced frame 2
    • replaced frame 3

So in that example, the user, via a frame-formatter object, has created a synthetic frame that describes what the interpreter is really doing, but we also include the original frames that make up that synthetic frame.
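The replacement idea can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: frames are modeled as bare function-name strings, and `SyntheticFrame`, `collapse`, and the `interp_` naming convention are all hypothetical.

```python
class SyntheticFrame:
    """A frame that does not really exist, plus the frames it replaces."""

    def __init__(self, description, replaced):
        self.description = description
        self.replaced = list(replaced)


def collapse(frames, is_interpreter_frame, describe):
    """Group each run of consecutive matching frames under one synthetic frame."""
    result = []
    run = []
    for frame in frames:
        if is_interpreter_frame(frame):
            run.append(frame)
            continue
        if run:
            # A non-matching frame ends the current run.
            result.append(SyntheticFrame(describe(run), run))
            run = []
        result.append(frame)
    if run:
        result.append(SyntheticFrame(describe(run), run))
    return result


def render(entries):
    """Print synthetic frames with their replaced children indented below."""
    lines = []
    for entry in entries:
        if isinstance(entry, SyntheticFrame):
            lines.append(entry.description + " (synthetic)")
            lines.extend("    " + name for name in entry.replaced)
        else:
            lines.append(entry)
    return lines


frames = ["interp_prepare", "interp_call", "interp_dispatch", "main"]
collapsed = collapse(frames,
                     lambda f: f.startswith("interp_"),
                     lambda run: "script call")
print("\n".join(render(collapsed)))
```

Keeping the replaced frames attached to the synthetic one, rather than discarding them, leaves open the choice of whether to show or hide them at print time.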

 

Conclusion

There are a number of questions that present themselves with this concept.

  • How do you represent these to MI and Annotations in GDB?
  • Should frame filters/formatters also run on replaced frames?
  • Should we allow the user to omit replaced frames?
  • How do we number synthetic frames and replaced frames so that we still honor backtraces that are bounded (i.e. bt 20)?
  • Should we allow frames to be omitted completely?
  • Should value pretty-printers be omitted from the data printed out in “args” and “locals”, and vice-versa, should frame filters even be passed the value? (I.e., should frame filters only be allowed to manipulate the arg/variable name, but not the content?)

I hope to answer these questions in a future series of blog articles. As always, if you have any feedback, just email me or leave a comment.

 

Phil

 

GDB Python Scripting

It’s been quite some time since I last wrote something. I’ve been busy working on Project Archer and GDB, mainly on the GDB Python scripting support. My big goal at the moment is to move all of the existing Python scripting API from the Archer git repository to the FSF CVS repository. This of course means public code reviews. This is always a little nerve-racking, even after years of doing it. But … so far, so good! The GDB maintainers are very responsive with reviews, and to date everything has gone great. Hopefully this means we’ll get done by 7.2. But there is lots left to do, and bugs to fix (not even mentioning new features at some point). Do you use the Python GDB support? Do you have pretty-printers written using the API? I’d like to know!

GDB and GCC

I’ve been working on Project Archer for some months now, and it has been pretty interesting. It has also been challenging. There are several deep dark wells of technical knowledge that I’ve had to explore in detail: unwinding, dwarf, debuginfo, and exceptions (generation, handling and personality routines). So I’ve been reading about, and stepping through a lot of these areas in GDB this last week. When does a program grow so big that one mortal human cannot work on its entirety? I don’t know the metric, but I bet GDB surpasses it.

As I’ve worked on improved C++ exception handling in GDB, it occurred to me that the different bugs I’ve filed could ultimately be put in one: “Make GDB work better with the GCC unwinder.” As GCC has changed in some areas, GDB has not changed in tandem with GCC. The next and finish commands relying purely on longjmp breakpoints is an example. (If you “next” over a C++ “throw” statement in GDB you will lose control of the inferior. GDB sets a “longjmp” breakpoint via the “next” command code to re-establish control, but the unwinder for C++ does not use setjmp/longjmp semantics to switch context. Once resumed, the inferior won’t stop at all, or won’t stop where expected.)

So this is a problem. It really irritates me when I lose control of an inferior when debugging. The pain is in proportion to the length of the debugging session. Sometimes I spend hours stepping a process. I’ve cursed a good line on several occasions where this has happened.

It’s easy to see this negatively, and even easier to write a negative thing about it. But it is a fact of life. So what’s the problem? Well, in most areas the longjmp trick will work. It won’t for C++ exceptions. But this grey area really bothers me. What if there are other areas where expectations do not match? Both GCC and GDB are highly complex programs. They change all the time, and where there is no direct transactional specification (i.e. debuginfo is written to a specification, so are ELF binaries, and so on) the assumptions about how GCC generates code will eventually break. If they break in a big way, they will be fixed, and quickly. But if they break in minor little ways, then the user experience dies as a result of a thousand tiny paper cuts. Or a thousand tiny curses.

Project Archer

I’ve started working on Project Archer with a few other hackers. The purpose is to improve C++ debugging with GDB. Under review is the Roadmap and the Development process. If you want to get involved either as a hacker, commentator, tester or are just generally interested, come find us on the Mailing list.

Copyright © Phil Muldoon
