Phil Muldoon Blog

Systemtap Editor home

Posted by Phillip Muldoon
On September 29th, 2008 at 14:09

Posted in Eclipse, Systemtap

Thanks for all the emails and suggestions regarding where to host the Systemtap Editor for Eclipse that I am hacking on. I ended up hosting it, under incubation, at the Eclipse Linux Distributions Project.

ViewVC of the subversion repository (ViewVC link)

The editor is under the Systemtap Module.

The danger of rainy weekends

Posted by Phillip Muldoon
On September 9th, 2008 at 10:09

Posted in Eclipse, Systemtap

Besides Project Archer I have been mucking around with Systemtap. I’ve always had a bit of trouble writing Systemtap scripts: my brain is not big enough, nor my practice sufficient, to write a comprehensive script without continually looking at the man pages, the language reference guide, or poking around in the Systemtap source. It makes for slow going sometimes.

A couple of days ago, I was chatting with Frank and he mentioned that Systemtap can now generate a listing of the probes in Systemtap’s tapset library with:

stap -L tapset.*

I thought … hmmm.

I’ve been itching to get back to some Eclipse hacking, and I’ve been waiting for something to come and scratch that itch.

I thought … hmmm.

It’s raining Saturday. “I’ll hack on this for a bit,” I thought.

hmmmmm ….

Well it ate up my whole weekend, but I hacked up a little Systemtap editor in Eclipse that offers syntax highlighting and probe completion.

Here is a view of the editor and completion:

[Screenshot: Systemtap Editor Syscall Completion]

And here is a screenshot showing the completion window as we drill down through all the signal probes (in this case to signal.sy*):

[Screenshot: Systemtap Editor Partial Signal Completion]

I was very impressed with Eclipse, and how everything just worked on Fedora. It took about a day to get the completion, the syntax highlighting and the engine-room work done. The engine-room work generates the completion meta-data from Systemtap; this has to be done dynamically, and the result cached. I’ll hack on this project as my “weekend” project, as it is still pretty raw. I’ll put the plug-in and source up when I can work out a place to host it.
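The completion lookup can be sketched in a few lines of shell. This is only an illustration of the idea, not the plug-in's code. It assumes the cached meta-data is a text file with one probe per line and the probe name in the first field; the function name complete_probe, that file layout and the sample lines are all made up for the example.

```shell
#!/bin/sh
# Hypothetical helper: print every cached probe name starting with a prefix.
#   $1 = text typed so far, e.g. "signal.sy"
#   $2 = cache file; assumed layout: one probe per line, name in field 1
complete_probe() {
    awk -v prefix="$1" 'index($1, prefix) == 1 { print $1 }' "$2" | sort -u
}

# Demo with made-up cache contents (a real cache would be generated from stap).
cache=$(mktemp)
cat > "$cache" <<'EOF'
signal.send sig:long
signal.syskill pid:long
signal.systkill pid:long
syscall.open filename:string
EOF
complete_probe "signal.sy" "$cache"
rm -f "$cache"
```

Matching on a literal prefix (rather than a pattern) keeps the dots in probe names from being treated as wildcards.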

Also, while I’m here I’d like to point you in the direction of another Systemtap UI. This has a different focus to what I hack on, and seems to concentrate more on execution. I am more focused on script development. It’s all good.

Project Archer

Posted by Phillip Muldoon
On August 3rd, 2008 at 11:08

Posted in Frysk, GDB

I’ve started working on Project Archer with a few other hackers. The purpose is to improve C++ debugging with GDB. Under review are the Roadmap and the Development process. If you want to get involved as a hacker, commentator or tester, or are just generally interested, come find us on the Mailing list.

Cool little Systemtap scriptlet

Posted by Phillip Muldoon
On July 17th, 2008 at 20:07

Posted in Systemtap

One of the things I’ve always found hard to do via ptrace is tracking system-wide state: watching all processes across a system for a behaviour “trend”. This is difficult because ptrace is not really designed for that. Frysk tried to address this in a different way. But Systemtap does it in a very scriptable way.

So … lately, I’ve been writing a series of articles around Systemtap, and I was hacking up a little script. I found this tiny scriptlet very useful. It is so simple as well, and child’s play for the experienced Systemtap hackers out there. It watches every process for fork/clone and prints the name and pid of the process involved, along with the pid of the new process. It also watches for a process exec and prints the process name, pid and the executable being exec’ed. The actual heavy lifting is done in six lines of code, which I find remarkable.

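#! /usr/bin/env stap

# Print a banner once, when the script starts running.
probe begin {
    print("Tracking process creations .... \n\n")
}

# Fires on fork/clone; new_pid is the pid of the new process.
probe process.create {
    printf("%s (%d) created %d\n", execname(), pid(), new_pid)
}

# Fires on exec; filename is the path of the program being exec'ed.
probe process.exec {
    printf("%s (%d) is exec'ing %s\n", execname(), pid(), filename)
}

# Runs when the script exits (for example on Ctrl-C).
probe end {
    print("All done!\n")
}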

Example output. During this script run, I started thunderbird from the gnome panel:

sudo ./stap -v ~/process_creation.stp 

Pass 1: parsed user script and 43 library script(s) in 220usr/10sys/223real ms.
Pass 2: analyzed script: 5 probe(s), 7 function(s), 1 embed(s), 0 global(s) in 220usr/60sys/294real ms.
Pass 3: using cached .systemtap/cache/dd/stap_dd2b93e5305e7a0f5b95894e9f0d798a_2825.c
Pass 4: using cached .systemtap/cache/dd/stap_dd2b93e5305e7a0f5b95894e9f0d798a_2825.ko
Pass 5: starting run.
Tracking process creations .... 

hald-runner (2128) created 21509
hald-runner (21509) is exec'ing /usr/lib64/hal/scripts/hal-system-killswitch-get-power
hal-system-kill (21509) created 21510
hal-system-kill (21510) is exec'ing /usr/bin/hal-is-caller-privileged
hal-system-kill (21509) created 21511
hal-system-kill (21511) is exec'ing /bin/basename
hal-system-kill (21509) is exec'ing /usr/lib64/hal/scripts/linux/hal-system-killswitch-get-power-linux
hal-system-kill (21509) created 21512
hal-system-kill (21512) is exec'ing /usr/libexec/hal-ipw-killswitch-linux
gnome-panel (3031) created 21513
gnome-panel (21513) created 21514
gnome-panel (21514) is exec'ing /usr/lib64/qt-3.3/bin/thunderbird
gnome-panel (21514) is exec'ing /usr/kerberos/bin/thunderbird
gnome-panel (21514) is exec'ing /usr/lib64/ccache/thunderbird
gnome-panel (21514) is exec'ing /usr/local/bin/thunderbird
gnome-panel (21514) is exec'ing /usr/bin/thunderbird
thunderbird (21514) created 21515
thunderbird (21515) is exec'ing /bin/uname
thunderbird (21514) is exec'ing /usr/lib64/thunderbird-2.0.0.14/thunderbird
thunderbird (21514) created 21516
thunderbird (21516) is exec'ing /usr/bin/dirname
thunderbird (21514) created 21517
thunderbird (21517) is exec'ing /bin/basename
thunderbird (21514) created 21518
thunderbird (21518) is exec'ing /usr/lib64/thunderbird-2.0.0.14/run-mozilla.sh
run-mozilla.sh (21518) created 21519
run-mozilla.sh (21519) is exec'ing /bin/basename
run-mozilla.sh (21518) created 21520
run-mozilla.sh (21520) is exec'ing /usr/bin/dirname
run-mozilla.sh (21518) created 21521
run-mozilla.sh (21521) created 21522
run-mozilla.sh (21522) is exec'ing /usr/bin/which
run-mozilla.sh (21518) created 21523
run-mozilla.sh (21523) is exec'ing /usr/lib64/thunderbird-2.0.0.14/thunderbird-bin
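Output like this is easy to post-process with ordinary shell tools. As a small sketch, the helper below counts how many processes each program created. The name count_creations is made up for the example, and it assumes the process name in the first field contains no spaces, which is true of the output above but not guaranteed in general.

```shell
#!/bin/sh
# Hypothetical helper: count "created" lines per process name.
# Reads the script's output on stdin; assumes the execname (field 1)
# contains no spaces, which holds for the sample output here.
count_creations() {
    awk '$3 == "created" { n[$1]++ } END { for (k in n) print n[k], k }' |
        sort -k1,1nr -k2,2
}

# Demo on a few lines copied from the run above.
count_creations <<'EOF'
hald-runner (2128) created 21509
hald-runner (21509) is exec'ing /usr/lib64/hal/scripts/hal-system-killswitch-get-power
hal-system-kill (21509) created 21510
hal-system-kill (21509) created 21511
gnome-panel (3031) created 21513
EOF
```

To use it on a real run, redirect the script's output to a file first and then feed that file to count_creations on stdin.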

Getting started with Systemtap (Part 2)

Posted by Phillip Muldoon
On June 12th, 2008 at 11:06

Posted in Systemtap

In part 2 of this article, I’ll cover how I built Systemtap from source and installed it.

I fetched the source with:

git clone git://sources.redhat.com/git/systemtap.git

A “systemtap” directory containing the source was created in my working directory. I like to build out-of-tree to keep the source pristine, so I created a new build directory:

mkdir systemtap_obj
cd systemtap_obj

and ran the configure step:

../systemtap/configure

On a Fedora 9 LiveCD install, with a few extra custom RPMs added, I found I had to install some libraries. The steps to install them are all similar, but here is an example of a missing library error I encountered:

 ../systemtap/configure
checking sys/capability.h usability... no
checking sys/capability.h presence... no
checking for sys/capability.h... no
configure: error: cannot find required libcap header (libcap-devel may need to be installed)

And here is how I installed the library to fix this error:

sudo yum install libcap-devel

I had to rerun the configure script several times to catch all the missing libraries. In the end I had to install both libcap-devel and elfutils-devel. Your experience may vary depending on your install.
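When rerunning configure repeatedly, the package hint can be pulled out of the error text automatically. The sketch below is a made-up helper (suggested_package) and only understands errors worded like the libcap one quoted above. Whether other configure errors use the same wording is an assumption; anything else is silently ignored.

```shell
#!/bin/sh
# Hypothetical helper: pull the package name out of configure errors of the
# form "... (PACKAGE may need to be installed)". Other wordings are ignored.
suggested_package() {
    sed -n 's/.*(\([^ ()]*\) may need to be installed).*/\1/p'
}

# Demo on the error line quoted above.
echo 'configure: error: cannot find required libcap header (libcap-devel may need to be installed)' |
    suggested_package
```

In practice the output of ../systemtap/configure 2>&1 would be piped into it, and the printed package installed by hand with yum.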

And finally, I built Systemtap with:

make

The build took a few minutes. I installed Systemtap with:

sudo make install

The whole process from fetching source, to building it, to installing it took less than five minutes, which was a pleasant surprise.

Tomorrow I’ll take a look at example scripts, but here is a neat example I ran:

sudo stap ~/systemtap/testsuite/systemtap.examples/syscalls_by_proc.stp
Collecting data... Type Ctrl-C to exit and display results

#SysCalls  Process Name

917        thunderbird-bin
807        firefox
489        hal-system-kill
390        tpb
206        dbus-daemon

Getting started with Systemtap (Part 1)

Posted by Phillip Muldoon
On June 11th, 2008 at 11:06

Posted in Systemtap

I decided to write this up as a series of articles. I am really interested in the psychology of an individual becoming interested in, using, and hopefully participating in an open-source project. So I decided to journal my experiences in a new project. I always like to dabble in side-projects as a hobby alongside my main job. And Systemtap is so close to what I do that it became a natural choice.

So here is the first short journal of a newbie’s journey of getting involved with Systemtap. I’ll keep the dispatches short. A dabbler’s use case, if you will. I’ve always wished that someone would do this for Frysk; hackers, myself included, can sometimes lose the ground-level perspective. I constantly worry that our projects are too technical, too complex and oblique to attract new developers. So as a new user of Systemtap, I thought, hey, time to do what I ask for.

I’ll reproduce a lot of the instructions from the website with some small tweaks. The website for getting started is here:

http://sourceware.org/systemtap/getinvolved.html

Installing Systemtap from yum on Fedora 9

To install Systemtap from yum on Fedora, as superuser (or via sudo), run:

yum install systemtap kernel-devel

We’ll also need to install the kernel debuginfo packages. It is important to stress that as your kernel updates, you need to keep the debuginfo packages up to date as well. This caught me a few times, producing unreliable or inaccurate results when a mismatch occurred. To install the debuginfo:

 yum --enablerepo=updates-debuginfo install kernel-debuginfo

This is different from what is noted on the site. The yum command in the Getting Started Guide also enables my rawhide repo, and it installed the rawhide kernel debuginfo. Your experience may vary.
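Since a kernel/debuginfo mismatch is easy to miss, a quick check before running scripts can help. The following is a sketch only: debuginfo_matches is a made-up helper, and it assumes that uname -r has the form version-release.arch so that it can be compared directly with what rpm reports for kernel-debuginfo.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the kernel release in $1 appears, as a
# whole line, in the list of debuginfo versions read from stdin.
debuginfo_matches() {
    grep -Fxq -- "$1"
}

# Only query the system where rpm is available.
if command -v rpm >/dev/null 2>&1; then
    if rpm -q --qf '%{VERSION}-%{RELEASE}.%{ARCH}\n' kernel-debuginfo 2>/dev/null |
        debuginfo_matches "$(uname -r)"; then
        echo "kernel-debuginfo matches running kernel $(uname -r)"
    else
        echo "no kernel-debuginfo installed for running kernel $(uname -r)"
    fi
fi
```

The comparison is an exact whole-line match, so a debuginfo package for a newer or older kernel build is reported as a mismatch.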

And that is it. This will install the last release. And that’s ok. But … if I’m going to participate, I prefer to be at the leading edge. So I’ll be brave, and go straight to the source. You will need git for this, so install that first:

yum install git

To get the source, type this into a shell in the directory where you wish to fetch the Systemtap repo:

git clone git://sources.redhat.com/git/systemtap.git

Tomorrow I’ll write about building Systemtap and running the examples.

Hardware Watchpoints and Frysk 0.3

Posted by Phillip Muldoon
On May 30th, 2008 at 08:05

Posted in Frysk

There is some beta/experimental hardware watchpoint code in Frysk 0.3. Give it a try and file bugs. Use the “watch” command from fhpd to access it. Also note these are purely hardware watchpoints, so sizes are 1, 2 and 4 bytes (and 8 bytes on x86_64). Teresa is working on some code for 0.4 that allows chaining of watchpoints together to watch bigger spaces.
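To see why chaining is needed, it helps to work out how a larger region breaks down into watchpoints of the sizes the hardware allows. The sketch below is not Frysk's algorithm (that code is not shown here); it is a made-up helper, split_watch, that assumes each watchpoint must be naturally aligned to its size and covers a region greedily with the largest piece that fits.

```shell
#!/bin/sh
# Hypothetical helper: cover LEN bytes starting at ADDR with naturally
# aligned watchpoints no larger than MAX bytes (MAX = 4, or 8 on x86_64).
# Prints one "address size" pair per line. Not Frysk's actual algorithm.
split_watch() {
    addr=$1 len=$2 max=$3
    while [ "$len" -gt 0 ]; do
        size=$max
        # Halve the size until it is aligned at addr and fits in what is left.
        while [ "$size" -gt 1 ] &&
              { [ $((addr % size)) -ne 0 ] || [ "$size" -gt "$len" ]; }; do
            size=$((size / 2))
        done
        echo "$addr $size"
        addr=$((addr + size))
        len=$((len - size))
    done
}

# Demo: 10 bytes starting at the odd address 4097, 8-byte watchpoints allowed.
split_watch 4097 10 8
```

For the demo region this yields pieces of 1, 2, 4, 2 and 1 bytes, which shows how quickly an unaligned region uses up several watchpoints.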

Frysk man pages

Posted by Phillip Muldoon
On March 17th, 2008 at 09:03

Posted in Frysk

Mark commented in one of the Frysk utility posts that the Frysk man pages are now online at the Frysk website. Cool. I’ll back-date the posts to point to them.

Development Tools and Change

Posted by Phillip Muldoon
On March 12th, 2008 at 09:03

Posted in Frysk

A new task I am working on is hardware watchpoints. This is with Frysk, and it is just one of those things a debugger must eventually have. But as I worked on this, and other parts of Frysk, during the last two years, my view of just what a debugger is (or is not) has changed. And it continues to evolve. Something I’ve been thinking a lot about lately is the usefulness of a monolithic debugger. To be blunt, I think the days of one monolithic does-everything tool are gone. As software complexity increases, and so too the problems that software attempts to solve, so must the elegance and smartness of the tools. Smart engineers need smart tools to write smart programs. Or something to that effect; I’m not a wordsmith.

But simply put, I don’t think the old model of a massive, monolithic command-line or IDE-based debugger is going to work anymore. Such debuggers are too complex themselves, and not nimble enough to hop, skip and dance to the ever-evolving and more complex demands of the development community. That being said, I’ve done a little exploring to see how things are evolving in this area. Recently I looked at Eclipse’s focus on the debugger/tool integration approach.

From that IDE-down approach, I investigated the DSDP Eclipse project. This initiative seems to be making headway by addressing the many-tools idea. To me it indicates that the authors of this project think that integrating a debugger (I use the term broadly) into an IDE should be a debugger-team side task. And because they have published interfaces for this, I would logically expect many tools to implement them. It avoids the use of wire protocols, and defines a set of interfaces to implement via API. They’ve basically opened the gates, and neatly put the solution to external debugger/tools integration into the hands of the people who know their development tool best.

So beyond my individual musings, when I see good work like this - this kind of outreach - I have to ask myself some fundamental questions. Are we in the development-tools community listening to our users? What do they want? Do they want this idea of more stand-alone tools solving specific problems? Are we listening to our own community?

I think the first battlefield for platform adoption is tools.

I took a brief look on Linux at the broad area of development tools. And I was really impressed at the quality of fantastic tools like Systemtap, Valgrind, Oprofile, and many other problem-domain solvers out there. These tools are already helping programmers solve problems that conventional debuggers either solved imperfectly or not at all. And because they are more problem-specific, they have a different approach to the monumental scaling difficulties of the one monolithic debugger. This is good stuff, and I really like the range of services offered to the developer here.

So with all that in mind, and as I look out on the development tool horizon, I ask myself about the commonality of my work. Should I be attempting to write the hardware watchpoints to be as common and modular as possible, so that more tools can use them? Is this a domain-specific Frysk problem? Is such a generic rendering of the code foolhardy? And what about language barriers? If the problem space of software forensics, fault-detection and remediation continues to trend this way, we should all be asking ourselves those questions.

A tour of Frysk’s utilities (part 3)

Posted by Phillip Muldoon
On March 10th, 2008 at 11:03

Posted in Frysk

This final whistle-stop tour of Frysk’s utilities will cover: fstack, ftrace and fhpd.

fstack - Similar in functionality to pstack: it displays the process stack in a similar manner. This utility will work with a live process (i.e. fstack <pid>) or with a corefile (i.e. fstack <corefile> <exe>). In the corefile case, the stack will represent how it was when the core dump was taken. For example:

sleep 5000 &
[1] 695

fcore 695
fstack core.695 /bin/sleep 

Task #695
#0 0x00000038cba9ac30 in __nanosleep_nocancel () from .../libc.so.6
#1 0x0000000000402f2b in rpl_nanosleep() .../nanosleep.c#71
#2 0x0000000000402a54 in xnanosleep() .../xnanosleep.c#100
#3 0x000000000040158c in main () from .../sleep
#4 0x00000038cba1e074 in __libc_start_main () from .../libc.so.6
#5 0x00000000004011d9 in _start () from .../sleep

ftrace - this utility will either attach to a process, or run an executable specified on the command-line, and trace its system calls. For example, trace the write system call in /bin/ls:

ftrace  -sys="write" /bin/ls foo
2803.2803 attached /bin/ls

/bin/ls: 2803.2803 syscall write(2, "/bin/ls: ", 9) = -1 ERRNO=38
cannot access foo

2803.2803 syscall write(2, "cannot access foo", 17) = -1 ERRNO=38
: No such file or directory

2803.2803 syscall write(2, ": No such file or di...", 27) = -1 ERRNO=38

2803.2803 syscall write(2, " No such file or di...", 1) = -1 ERRNO=38
2803.2803 exited with status 2

In the above example, if we wanted to look at all system calls, we would specify -sys="*".

There are many useful actions and filters that ftrace can perform on the process. For example, -stack will print a stack back-trace whenever the matched system call is detected in the process. A few more examples: specify “-p pid” to attach to an existing process, “-c” to trace a process’s children, “-m” to detect and print when a library is mapped/unmapped, and so on. The full scope of use is beyond this little blog post, so I encourage you to use it and experiment.

fhpd - a command-line debugger based on the Frysk core (the engine, if you will). It is largely based on the HPD specification. Covering the scope and use of this debugger in a fair and consistent manner is well beyond this blog post, but we’ll look at a few brief examples. In this example, we’ll load a core file with its backing executable. You can, of course, also attach to a pid, or load an executable from the command-line.

fhpd core.695 /bin/sleep

Attached to core file: core.695

To look at the stack back-trace you would type:

(fhpd) where
#0 0x00000038cba9ac30 in __nanosleep_nocancel () from /lib64/libc.so.6
#1 0x0000000000402f2b in rpl_nanosleep () from /bin/sleep
#2 0x0000000000402a54 in xnanosleep () from /bin/sleep
#3 0x000000000040158c in main () from /bin/sleep
#4 0x00000038cba1e074 in __libc_start_main () from /lib64/libc.so.6
#5 0x00000000004011d9 in _start () from /bin/sleep

And to look at frame specific information:

(fhpd) down
#1 0x0000000000402f2b in rpl_nanosleep(const struct timespec {
__time_t tv_sec;
long int tv_nsec;
} * requested_delay,struct timespec {
__time_t tv_sec;
long int tv_nsec;
} * remaining_delay) /usr/src/debug/coreutils-6.9/lib/nanosleep.c#71

Followed by a source listing:

(fhpd) list
[0.0]
61     /* nanosleep mishandles large sleeps due to internal overflow
62        problems, so check that the proper amount of time has actually
63        elapsed.  */
64
65     struct timespec delay = *requested_delay;
66     struct timespec t0;
67     getnow (&t0);
68
69     for (;;)
70       {
->  71         int r = nanosleep (&delay, remaining_delay);
72         if (r == 0)
73          {
74            time_t secs_sofar;
75            struct timespec now;
76            getnow (&now);
77
78            secs_sofar = now.tv_sec - t0.tv_sec;
79            if (requested_delay->tv_sec < secs_sofar)
80              return 0;

I’ll stop here. There are many, many commands. We did not even look at breakpoints, stepping, loading executables, or hundreds of other things. But this blog is not a tutorial, rather a taste, and I encourage you to experiment, find out for yourself, and play around with fhpd. And where things are broken (Frysk is in constant development), submit patches or bug reports, or come and let us know on IRC (irc.gimp.org, channel: #frysk).