Phil Muldoon Blog

GDB and GCC

Posted by Phillip Muldoon
On November 17th, 2008 at 14:11

Posted in Frysk, GDB

I’ve been working on Project Archer for some months now, and it has been pretty interesting. It has also been challenging. There are several deep, dark wells of technical knowledge that I’ve had to explore in detail: unwinding, DWARF, debuginfo, and exceptions (generation, handling and personality routines). So I’ve been reading about, and stepping through, a lot of these areas in GDB this last week. When does a program grow so big that one mortal human cannot work on its entirety? I don’t know the metric, but I bet GDB surpasses it.

As I’ve worked on improved C++ exception handling in GDB, it occurred to me that the different bugs I’ve filed could ultimately be folded into one: “Make GDB work better with the GCC unwinder.” As GCC has changed in some areas, GDB has not changed in tandem with it. The way the next and finish commands rely purely on longjmp breakpoints is an example. (If you “next” over a C++ “throw” statement in GDB you will lose control of the inferior. GDB sets a “longjmp” breakpoint via the “next” command code to re-establish control - but the unwinder for C++ does not use setjmp/longjmp semantics to switch context. Once resumed, the inferior either won’t stop at all, or won’t stop where expected.)

So this is a problem. It really irritates me when I lose control of an inferior when debugging. The pain is in proportion to the length of the debugging session. Sometimes I spend hours stepping a process. I’ve cursed a good line on several occasions where this has happened.

It’s easy to see this negatively, and even easier to write a negative thing about it. But it is a fact of life. So what’s the problem? Well, in most areas the longjmp trick will work. It won’t for C++ exceptions. But this grey area really bothers me. What if there are other areas where expectations do not match? Both GCC and GDB are highly complex programs. They change all the time, and where there is no direct transactional specification (i.e. debuginfo is written to a specification, so are ELF binaries, and so on) the assumptions about how GCC generates code will eventually break. If they break in a big way, they will be fixed - and quickly. But if they break in minor little ways, then the user experience dies as a result of a thousand tiny paper cuts. Or a thousand tiny curses.

Systemtap Editor home

Posted by Phillip Muldoon
On September 29th, 2008 at 14:09

Posted in Eclipse, Systemtap

Thanks for all the emails and suggestions regarding where to host the Systemtap Editor for Eclipse that I am hacking on. I ended up hosting it - under incubation - at the Eclipse Linux Distributions Project.

ViewVC of the Subversion repository (ViewVC link)

The editor is under the Systemtap Module.

The danger of rainy weekends

Posted by Phillip Muldoon
On September 9th, 2008 at 10:09

Posted in Eclipse, Systemtap

Besides Project Archer, I have been mucking around with Systemtap. I’ve always had a bit of trouble writing Systemtap scripts - my brain is not big enough, or my practice regular enough, to write a comprehensive script without continually looking at the man pages, the language reference guide, or poking around in the Systemtap source. It makes for slow going sometimes.

A couple of days ago, I was chatting with Frank and he mentioned that Systemtap can now generate coverage on Systemtap’s tapset library with:

stap -L tapset.*

I thought … hmmm.

I’ve been itching to get back to some Eclipse hacking, and I’ve been waiting for something to come and scratch that itch.

I thought … hmmm.

It’s raining Saturday. “I’ll hack on this for a bit,” I thought.

hmmmmm ….

Well it ate up my whole weekend, but I hacked up a little Systemtap editor in Eclipse that offers syntax highlighting and probe completion.

Here is a view of the editor and completion:

[Screenshot: Systemtap Editor Syscall Completion]

And here is a screenshot showing the completion window as we drill down through all the signal probes (in this case to signal.sy*):

[Screenshot: Systemtap Editor Partial Signal Completion]

I was very impressed with Eclipse, and how everything just worked on Fedora. It took about a day to get the completion, the syntax highlighting and the engine-room work to generate the completion meta-data from Systemtap (it has to be generated dynamically and cached). I’ll hack on this as my “weekend” project - it is still pretty raw. I’ll put the plug-in and source up when I can work out a place to host it.

Also, while I’m here I’d like to point you in the direction of another Systemtap UI. This has a different focus to what I hack on, and seems to concentrate more on execution. I am more focused on script development. It’s all good.

Project Archer

Posted by Phillip Muldoon
On August 3rd, 2008 at 11:08

Posted in GDB

I’ve started working on Project Archer with a few other hackers. The purpose is to improve C++ debugging with GDB. Under review is the Roadmap and the Development process. If you want to get involved either as a hacker, commentator, tester or are just generally interested, come find us on the Mailing list.

Cool little Systemtap scriptlet

Posted by Phillip Muldoon
On July 17th, 2008 at 20:07

Posted in Systemtap

One of the things I’ve always found hard to do via ptrace is track system-wide state: watching all processes across a system for a behaviour “trend”. This is difficult, as ptrace is not really designed for that. Frysk tried to address this in a different way. But Systemtap does it in a very scriptable way.

So … lately, I’ve been writing a series of articles around Systemtap, and I was hacking up a little script. I found this tiny scriptlet very useful. It is so simple as well - child’s play for the experienced Systemtap hackers out there. It watches every process for fork/clone and prints the name and pid of the creating process along with the pid of the new one. It also watches for exec and prints the process name, pid and the executable about to be exec’d. The actual heavy lifting is done in 6 lines of code, which I find remarkable.

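#! /usr/bin/env stap

# Runs once when the session starts.
probe begin {
    print("Tracking process creations .... \n\n")
}

# Fires on fork/clone: report the creating process and the new pid.
probe process.create {
    printf("%s (%d) created %d\n", execname(), pid(), new_pid)
}

# Fires on exec: report the process and the file it is about to run.
probe process.exec {
    printf("%s (%d) is exec'ing %s\n", execname(), pid(), filename)
}

probe end {
    print("All done!\n")
}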

Example output. During this script run, I launched Thunderbird from the GNOME panel:

sudo ./stap -v ~/process_creation.stp 

Pass 1: parsed user script and 43 library script(s) in 220usr/10sys/223real ms.
Pass 2: analyzed script: 5 probe(s), 7 function(s), 1 embed(s), 0 global(s) in 220usr/60sys/294real ms.
Pass 3: using cached .systemtap/cache/dd/stap_dd2b93e5305e7a0f5b95894e9f0d798a_2825.c
Pass 4: using cached .systemtap/cache/dd/stap_dd2b93e5305e7a0f5b95894e9f0d798a_2825.ko
Pass 5: starting run.
Tracking process creations .... 

hald-runner (2128) created 21509
hald-runner (21509) is exec'ing /usr/lib64/hal/scripts/hal-system-killswitch-get-power
hal-system-kill (21509) created 21510
hal-system-kill (21510) is exec'ing /usr/bin/hal-is-caller-privileged
hal-system-kill (21509) created 21511
hal-system-kill (21511) is exec'ing /bin/basename
hal-system-kill (21509) is exec'ing /usr/lib64/hal/scripts/linux/hal-system-killswitch-get-power-linux
hal-system-kill (21509) created 21512
hal-system-kill (21512) is exec'ing /usr/libexec/hal-ipw-killswitch-linux
gnome-panel (3031) created 21513
gnome-panel (21513) created 21514
gnome-panel (21514) is exec'ing /usr/lib64/qt-3.3/bin/thunderbird
gnome-panel (21514) is exec'ing /usr/kerberos/bin/thunderbird
gnome-panel (21514) is exec'ing /usr/lib64/ccache/thunderbird
gnome-panel (21514) is exec'ing /usr/local/bin/thunderbird
gnome-panel (21514) is exec'ing /usr/bin/thunderbird
thunderbird (21514) created 21515
thunderbird (21515) is exec'ing /bin/uname
thunderbird (21514) is exec'ing /usr/lib64/thunderbird-2.0.0.14/thunderbird
thunderbird (21514) created 21516
thunderbird (21516) is exec'ing /usr/bin/dirname
thunderbird (21514) created 21517
thunderbird (21517) is exec'ing /bin/basename
thunderbird (21514) created 21518
thunderbird (21518) is exec'ing /usr/lib64/thunderbird-2.0.0.14/run-mozilla.sh
run-mozilla.sh (21518) created 21519
run-mozilla.sh (21519) is exec'ing /bin/basename
run-mozilla.sh (21518) created 21520
run-mozilla.sh (21520) is exec'ing /usr/bin/dirname
run-mozilla.sh (21518) created 21521
run-mozilla.sh (21521) created 21522
run-mozilla.sh (21522) is exec'ing /usr/bin/which
run-mozilla.sh (21518) created 21523
run-mozilla.sh (21523) is exec'ing /usr/lib64/thunderbird-2.0.0.14/thunderbird-bin
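
The output above is also easy to summarise with standard tools. Here is a minimal sketch, assuming the scriptlet's output has been captured and that process names contain no spaces. The function name count_creations is made up for illustration and is not part of Systemtap. It counts the "created" lines per process name and lists the busiest creators first:

```shell
# Count "created" events per process name in the scriptlet's output.
# Reads the captured output on stdin. Lines look like:
#   name (pid) created newpid
# so field 1 is the process name and field 3 is the word "created".
count_creations() {
    awk '$3 == "created" { n[$1]++ }
         END { for (p in n) printf "%d %s\n", n[p], p }' |
        sort -k1,1nr -k2
}
```

With a run saved to a hypothetical run.log, the call would be: count_creations < run.log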

Getting started with Systemtap (Part 2)

Posted by Phillip Muldoon
On June 12th, 2008 at 11:06

Posted in Systemtap

I’ll continue part 2 of this article on how I built Systemtap from source and installed it.

I fetched the source with:

git clone git://sources.redhat.com/git/systemtap.git

This created a “systemtap” directory containing the source in my working directory. I like to build out-of-tree to keep the source pristine, so I created a new build directory:

mkdir systemtap_obj
cd systemtap_obj

and ran the configure step:

../systemtap/configure

On a Fedora 9 LiveCD install, with a few extra custom RPMs added, I found I had to install some libraries. The steps to install them are all much the same, but here is an example of a missing library error I encountered:

 ../systemtap/configure
checking sys/capability.h usability... no
checking sys/capability.h presence... no
checking for sys/capability.h... no
configure: error: cannot find required libcap header (libcap-devel may need to be installed)

And here is how I installed the library to fix this error:

sudo yum install libcap-devel

I had to rerun the configure script several times to catch all the missing libraries. In the end I had to install both libcap-devel and elfutils-devel. Your experience may vary depending on your install.
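
One way to cut down on the configure, install, rerun cycle is to look for the headers up front. This is a minimal sketch, assuming the headers live under a single include directory. missing_headers is a hypothetical helper, sys/capability.h comes from the configure error above, and elfutils/libdw.h is an assumption about a header that elfutils-devel provides. It is no substitute for configure's own checks:

```shell
# Print every header from the argument list that is missing under an
# include root. Prints nothing when all of them are present.
# Usage: missing_headers INCLUDE_ROOT HEADER...
missing_headers() (
    root=$1
    shift
    for h in "$@"; do
        [ -f "$root/$h" ] || printf '%s\n' "$h"
    done
)
```

For example: missing_headers /usr/include sys/capability.h elfutils/libdw.h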

And finally, I built Systemtap with:

make

The build took a few minutes. I installed Systemtap with:

sudo make install

The whole process from fetching source, to building it, to installing it took less than five minutes, which was a pleasant surprise.

Tomorrow I’ll take a look at example scripts, but here is a neat example I ran:

sudo stap ~/systemtap/testsuite/systemtap.examples/syscalls_by_proc.stp
Collecting data... Type Ctrl-C to exit and display results

#SysCalls  Process Name

917        thunderbird-bin
807        firefox
489        hal-system-kill
390        tpb
206        dbus-daemon

Getting started with Systemtap (Part 1)

Posted by Phillip Muldoon
On June 11th, 2008 at 11:06

Posted in Systemtap

I decided to write this up as a series of articles. I am really interested in the psychology of an individual becoming interested in, using, and hopefully participating in an open-source project. So I decided to journal my experiences in a new project. I always like to dabble in side-projects as a hobby alongside my main job. And Systemtap is so close to what I do that it became a natural choice.

So here is the first short journal of a newbie’s journey of getting involved with Systemtap. I’ll keep the dispatches short. A dabbler’s use case, if you will. I’ve always wished that someone would do this for Frysk; hackers - myself included - can sometimes lose the ground-level perspective. I constantly worry that our projects are too technical, too complex and oblique to attract new developers. So as a new user of Systemtap, I thought, hey, time to do what I ask for.

I’ll reproduce a lot of the instructions from the website with some small tweaks. The website for getting started is here:

http://sourceware.org/systemtap/getinvolved.html

Installing Systemtap from yum on Fedora 9

To install Systemtap from yum on Fedora, as the superuser (or via sudo), run:

yum install systemtap kernel-devel

We’ll also need to install the kernel debuginfo packages. It is important to stress that as your kernel updates, you need to keep the debuginfo packages up to date as well. This caught me a few times, producing unreliable/inaccurate results when a mismatch occurred. To install the debuginfo:

 yum --enablerepo=updates-debuginfo install kernel-debuginfo

The debuginfo command I used differs from the one noted on the site. The yum command in the Getting Started guide also enabled my rawhide repo, and it installed the rawhide kernel debuginfo. Your experience may vary.
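
The kernel/debuginfo mismatch mentioned above can be caught before running a script. This is a minimal sketch, assuming the kernel release string reported by uname -r has the same version-release.arch form as the package. has_matching_debuginfo is a hypothetical name, not a Systemtap or yum feature. It succeeds when the given kernel release appears among the debuginfo versions fed to it, one per line:

```shell
# Succeed when KERNEL_RELEASE exactly matches one of the lines on stdin.
# Each input line is expected to be a "version-release.arch" string.
# Usage: has_matching_debuginfo KERNEL_RELEASE
has_matching_debuginfo() {
    grep -Fxq -- "$1"
}
```

On an RPM-based system it could be fed like this:

rpm -q --qf '%{VERSION}-%{RELEASE}.%{ARCH}\n' kernel-debuginfo | has_matching_debuginfo "$(uname -r)" || echo "kernel-debuginfo does not match the running kernel"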

And that is it. This will install the latest release. And that’s ok. But … if I’m going to participate, I prefer to be at the leading edge. So I’ll be brave, and go straight to the source. You will need git for this, so install that first:

yum install git

To get the source, type this into a shell in the directory where you wish to fetch the Systemtap repo:

git clone git://sources.redhat.com/git/systemtap.git

Tomorrow I’ll write about building Systemtap and running the examples.

Hardware Watchpoints and Frysk 0.3

Posted by Phillip Muldoon
On May 30th, 2008 at 08:05

Posted in Frysk

There is some beta/experimental hardware watchpoint code in Frysk 0.3. Give it a try and file bugs. Use the “watch” command from fhpd to access it. Also note these are purely hardware watchpoints, so sizes are 1, 2 and 4 bytes (and 8 bytes on x86_64). Teresa is working on some code for 0.4 that allows chaining watchpoints together to watch bigger spaces.
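
To give a feel for what chaining involves, here is a minimal sketch of splitting a larger region into watchpoint-sized pieces. It is not the Frysk code and makes its own assumptions: split_watch is a hypothetical name, and each piece is kept naturally aligned to its size, which is what the x86 debug registers require. The x86 hardware also has only four debug address registers, so a region that splits into more than four pieces cannot be covered this way:

```shell
# Split the byte range [ADDR, ADDR+LEN) into naturally aligned chunks of
# 1, 2, 4 or 8 bytes and print one "address size" pair per line.
# MAX is the largest chunk size, must be a power of two, defaults to 8.
# Usage: split_watch ADDR LEN [MAX]
split_watch() (
    addr=$(($1))
    len=$(($2))
    max=${3:-8}
    while [ "$len" -gt 0 ]; do
        size=$max
        # Halve the chunk until it is aligned and fits in what is left.
        while [ $((addr % size)) -ne 0 ] || [ "$size" -gt "$len" ]; do
            size=$((size / 2))
        done
        printf '0x%x %d\n' "$addr" "$size"
        addr=$((addr + size))
        len=$((len - size))
    done
)
```

For example, split_watch 0x1003 10 prints four pieces: 0x1003 1, 0x1004 4, 0x1008 4 and 0x100c 1. Pass 4 as the third argument to model a target without 8-byte watchpoints.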

Frysk man pages

Posted by Phillip Muldoon
On March 17th, 2008 at 09:03

Posted in Frysk

Mark commented in one of the Frysk utility posts that the Frysk man pages are now online at the Frysk website. Cool. I’ll back-date the posts to point to them.

Development Tools and Change

Posted by Phillip Muldoon
On March 12th, 2008 at 09:03

Posted in Frysk

A new task I am working on is hardware watchpoints. This is with Frysk, and it is just one of those things a debugger must eventually have. But as I worked on this, and other parts of Frysk during the last two years, my view of just what a debugger is (or is not) has changed. And it continues to evolve. Something I’ve been thinking about a lot lately is the usefulness of a monolithic debugger. To be blunt, I think the days of one monolithic does-everything tool are gone. As software complexity increases - and so too the problems that software attempts to solve - so must the elegance and smartness of the tools. Smart engineers need smart tools to write smart programs. Or something to that effect, I’m not a wordsmith. But simply put, I don’t think the old model of a massive, monolithic command-line or IDE-based debugger is going to work anymore. They are too complex themselves, and not nimble enough to hop, skip and dance to the ever-evolving and more complex demands of the development community. That being said, I’ve done a little exploring to see how things are evolving in this area. Recently I looked at Eclipse’s focus on the debugger/tool integration approach.

From that IDE-down approach, I investigated the DSDP Eclipse project. This initiative seems to be making headway by addressing the many-tools idea. To me it indicates that the authors of this project think that integrating a debugger (I use the term broadly) into an IDE should be a debugger-team side task. And because they have published interfaces for this, I would expect many tools to implement them. It avoids the use of wire protocols, and defines a set of interfaces to implement via API. They’ve basically opened the gates, and neatly put the solution to external debugger/tools integration into the hands of people who know their development tool best.

So beyond my individual musings, when I see good work like this - this kind of outreach - I have to ask myself some fundamental questions. Are we in the development-tools community listening to our users? What do they want? Do they want this idea of more stand-alone tools solving specific problems? Are we listening to our own community?

I think the first battlefield for platform adoption is tools.

I took a brief look on Linux at the broad area of development tools. And I was really impressed at the quality of fantastic tools like Systemtap, Valgrind, Oprofile, and many other problem-domain solvers out there. These tools are already helping programmers solve problems that conventional debuggers either handled imperfectly, or not at all. And because they are more problem-specific they have a different approach to the monumental scaling difficulties of the one monolithic debugger. This is good stuff, and I really like the range of services offered to the developer here.

So with all that in mind, and as I look out on the development tool horizon, I ask myself about the commonality of my work. Should I be attempting to write the hardware watchpoints to be as common and modular as possible, so that more tools can use them? Is this a domain-specific Frysk problem? Is such a generic rendering of the code foolhardy? And what about language barriers? If the problem space of software forensics, fault-detection and remediation continues to trend this way, we should all be asking ourselves those questions.