Stacks, Frames and GDB

One of the things I have been working on in the Python API (hopefully for 7.5) is frame-printing and filtering in GDB. This is where we allow customization when printing a frame. The most common example of this is where the user wants a “backtrace”: GDB prints each stack frame leading up to the point where the program stopped executing.

What does this mean? What is frame-printing and filtering anyway?

A little history. This is not a new concept. This capability already exists in the versions of GDB shipped with Fedora. It is, however, written as a number of Python wrappers, and is a utility rather than “true” integration with GDB. Because of that, you cannot customize existing GDB commands (like backtrace, for example).

What we want to do is tightly integrate it with GDB internals, so that in every instance where GDB needs to print a frame, the user can intercept and customize this action.

What is frame-printing?

Each frame is composed of a number of individual components, which come together to form the printed frame. Take this snippet of an example backtrace from GDB:


#0  0x00000038ebce6ab8 in poll () from /lib64/libc.so.6
#1  0x00000000005952bb in gdb_wait_for_event (block=1) at ../../archer/gdb/event-loop.c:863
#2  0x000000000059559a in gdb_do_one_event () at ../../archer/gdb/event-loop.c:461
#3  0x0000000000595735 in start_event_loop () at ../../archer/gdb/event-loop.c:490

That is a fairly typical backtrace. In that example there is: the frame number, address, function name, arguments, and location in the source. This is how GDB prints each frame. Currently, you can modify how GDB prints each frame with a few modifiers such as “full”, but this is fairly limited. For frame printing, there are two aspects we want to allow:

  • Allow customization of each element in the frame.

This means calling a Python object each time a frame is ready to be printed. This will be similar to how value printers currently work. We keep three registration collections: one list for the current object-file, one for the current program-space, and one global list. Frame formatters/filters self-register in whatever lists are appropriate. When GDB is ready to print a frame, that frame is passed to each element in the lists until one shows interest in printing it. If there are no interested objects, GDB prints it the old-fashioned way.
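To make the lookup concrete, here is a minimal sketch in plain Python. It is not GDB's API: frames are modelled as dicts, and the names `poll_lookup`, `default_print` and `print_frame` are made up for illustration. The search order (object-file, then program-space, then global) is an assumption borrowed from how value printers are searched.

```python
# Hypothetical model of the three registration lists; not GDB's real API.
# A frame is modelled as a plain dict. A lookup function returns a
# formatter callable when it is interested in a frame, and None otherwise.

def poll_lookup(frame):
    """Show interest only in frames for the function 'poll'."""
    if frame["function"] == "poll":
        return lambda f: "#{} {} [waiting for events]".format(
            f["level"], f["function"])
    return None


def default_print(frame):
    """Fallback: the built-in, 'old-fashioned' output."""
    return "#{} {} ()".format(frame["level"], frame["function"])


def print_frame(frame, objfile_list, progspace_list, global_list):
    """Offer the frame to each registered lookup until one is interested."""
    # Assumed search order: object-file, program-space, then global.
    for registered in (objfile_list, progspace_list, global_list):
        for lookup in registered:
            formatter = lookup(frame)
            if formatter is not None:
                return formatter(frame)
    # Nobody showed interest, so fall back to the default output.
    return default_print(frame)
```

With `poll_lookup` registered in the global list only, a frame for `poll` gets the custom text, and every other frame falls through to `default_print`.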

When a frame formatter/filter object shows interest in that frame, we call several methods in that object. The GDB frame is always passed to the Python object as a point of reference, so it may interrogate the current frame’s data. We don’t really care how the object is constructed, just as long as we can call the methods that the object contains. So for the “function” element, we could call the method in the object that describes the “function” in the backtrace. Each element in the frame that needs to be printed will have a corresponding method call. This allows the user to customize the data presented in the backtrace.
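As a rough sketch of the "one method per element" idea, the following plain-Python example stands in for such an object. Everything here is an assumption made for illustration: `FakeFrame` is a stand-in for the frame GDB would pass in, and the method names (`address`, `function`, `frame_args`, `location`) are hypothetical, not a settled interface. This particular formatter only customizes the location, trimming the source path down to its file name.

```python
import os.path


class FakeFrame:
    """Stand-in for the GDB frame handed to the formatter (hypothetical)."""

    def __init__(self, level, pc, name, args, filename, line):
        self.level = level
        self.pc = pc
        self.name = name
        self.args = args          # list of (name, value) pairs
        self.filename = filename
        self.line = line


class ShortPathFormatter:
    """One method per printed element; only location() is customized."""

    def __init__(self, frame):
        # Keep the frame as a point of reference for every method.
        self.frame = frame

    def address(self):
        return "0x%016x" % self.frame.pc

    def function(self):
        return self.frame.name

    def frame_args(self):
        return ", ".join("%s=%s" % pair for pair in self.frame.args)

    def location(self):
        # Customization: drop the directory part of the source path.
        return "%s:%d" % (os.path.basename(self.frame.filename),
                          self.frame.line)


def print_frame(formatter):
    """Assemble the printed frame by calling one method per element."""
    return "#%d  %s in %s (%s) at %s" % (
        formatter.frame.level, formatter.address(), formatter.function(),
        formatter.frame_args(), formatter.location())
```

Fed frame #1 from the backtrace above, `print_frame` produces the same line as GDB except that the location reads `event-loop.c:863`.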

  • Allow ad-hoc options

The next problem is that if we want to allow customization of printing, we have to somehow allow ad-hoc options to be specified to existing built-in GDB commands like “backtrace”. This also has a bearing on frame filtering, described below. But for now, the important point is that script writers need to be able to accept options beyond the usual built-in ones.

What is frame-filtering?

Currently backtraces are printed in a sequential fashion. They have been printed this way pretty much since GDB was first written (and, I hazard to guess, in most other debuggers too). This seemed to be the most straightforward and useful way to present the flow of execution to a user. But, if you think about it, that is not always the case. There are a lot of frames, and data, that the user has no interest in, or there are several frames whose data would be more useful presented as one frame. Anyone who hacks on the Python C API knows this ;)

The concept of frame-filtering is best described as “replacement” synthetic frames, and “replaced” frames. For example, take a scripting language. Currently there may be three frames that describe one atomic action on the scripting side of the language:

  • frame 1
  • frame 2
  • frame 3

These are useful from a contextual view of how the interpreter is preparing and constructing that one script-side operation, but not very useful from a presentation perspective of what that one operation really is. What if we could organize that a little better? With frame-filters we allow you to create the concept of a “replacement” synthetic frame, and “replaced” children:

  • synthetic frame (i.e. does not really exist)
    • replaced frame 1
    • replaced frame 2
    • replaced frame 3

So in that example, the user, via a frame-formatter object, has created a synthetic frame that describes what the interpreter is really doing, but we also include the original constituent frames that make up that synthetic frame.
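A small plain-Python sketch of that folding step follows. It is only an illustration under stated assumptions: frames are reduced to function-name strings, and the `interp_` prefix, the `script_operation` label, and the helper names `fold` and `render` are all hypothetical. Consecutive interpreter frames are collapsed into one synthetic entry that keeps the replaced frames as children.

```python
from itertools import groupby


def fold(frames, is_interpreter_frame, label):
    """Collapse each run of interpreter frames into one synthetic frame."""
    folded = []
    for is_interp, run in groupby(frames, key=is_interpreter_frame):
        run = list(run)
        if is_interp:
            # The synthetic frame does not really exist; it carries the
            # frames it replaces as children.
            folded.append({"function": label, "synthetic": True,
                           "replaced": run})
        else:
            # Ordinary frames pass through unchanged, with no children.
            folded.extend({"function": name, "synthetic": False,
                           "replaced": []} for name in run)
    return folded


def render(folded):
    """Return display lines, indenting replaced frames under their parent."""
    lines = []
    for entry in folded:
        lines.append(entry["function"])
        for child in entry["replaced"]:
            lines.append("    " + child)
    return lines
```

Keeping the replaced frames attached to the synthetic one leaves the open questions below (numbering, omitting children) as presentation decisions rather than data loss.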

 

Conclusion

There are a number of questions that present themselves with this concept.

  • How do you represent these to MI and Annotations in GDB?
  • Should frame filters/formatters also run on replaced frames?
  • Should we allow the user to omit replaced frames?
  • How do we number synthetic frames and replaced frames so that we still honor backtraces that are bounded (i.e. bt 20)?
  • Should we allow frames to be omitted completely?
  • Should value pretty-printers be omitted from the data printed out in “args” and “locals” and, vice versa, should frame filters even be passed the value? (I.e., should frame filters only be allowed to manipulate the arg/variable name, but not the content?)

I hope to answer these questions in a future series of blog articles. As always, if you have any feedback, just email me or leave a comment.

 

Phil

 


Copyright © Phil Muldoon
