Category Frysk


I’ve been working on Project Archer for some months now, and it has been pretty interesting. It has also been challenging. There are several deep, dark wells of technical knowledge that I’ve had to explore in detail: unwinding, DWARF, debuginfo, and exceptions (generation, handling and personality routines). So I’ve been reading about these areas, and stepping through a lot of them in GDB, this last week. When does a program grow so big that one mortal human cannot work on its entirety? I don’t know the metric, but I bet GDB surpasses it.

As I’ve worked on improved C++ exception handling in GDB, it occurred to me that the different bugs I’ve filed could ultimately be put in one: “Make GDB work better with the GCC unwinder.” As GCC has changed in some areas, GDB has not changed in tandem with it. The next and finish commands relying purely on longjmp breakpoints is an example. (If you “next” over a C++ “throw” statement in GDB you will lose control of the inferior. GDB sets a “longjmp” breakpoint via the “next” command code to re-establish control – but the unwinder for C++ does not use setjmp/longjmp semantics to switch context. Once resumed, the inferior either won’t stop at all, or won’t stop where expected.)

So this is a problem. It really irritates me when I lose control of an inferior when debugging. The pain is in proportion to the length of the debugging session. Sometimes I spend hours stepping a process. I’ve cursed a good line on several occasions where this has happened.

It’s easy to see this negatively, and even easier to write a negative thing about it. But it is a fact of life. So what’s the problem? Well, in most areas the longjmp trick will work. It won’t for C++ exceptions. But this grey area really bothers me. What if there are other areas where expectations do not match? Both GCC and GDB are highly complex programs. They change all the time, and where there is no direct transactional specification (in the way that debuginfo and ELF binaries are each written to one) the assumptions about how GCC generates code will eventually break. If they break in a big way, they will be fixed – and quickly. But if they break in minor little ways, then the user experience dies as a result of a thousand tiny paper cuts. Or a thousand tiny curses.

Hardware Watchpoints and Frysk 0.3

There is some beta/experimental hardware watchpoint code in Frysk 0.3. Give it a try and file bugs. Use the “watch” command from fhpd to access it. Also note these are purely hardware watchpoints, so sizes are 1, 2 and 4 bytes (and 8 bytes on x86_64). Teresa is working on some code for 0.4 that allows chaining watchpoints together to watch bigger spaces.
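
As a rough illustration of what chaining might involve, a region wider than one debug register can be covered by several naturally aligned watchpoints. The sketch below is not Frysk code: the class and method names are hypothetical, and it assumes x86-style rules, where each watchpoint covers 1, 2, 4 or 8 bytes and must start at an address aligned to its own width.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Frysk: covers [address, address + length)
// with naturally aligned chunks of 1, 2, 4 or 8 bytes.
final class WatchpointSplitter {

    // One hardware watchpoint: a start address and a width in bytes.
    static final class Chunk {
        final long address;
        final int size;

        Chunk(long address, int size) {
            this.address = address;
            this.size = size;
        }

        @Override
        public String toString() {
            return "0x" + Long.toHexString(address) + "/" + size;
        }
    }

    // Greedily pick the widest chunk that is aligned at the current address
    // and does not run past the end of the region. maxSize is assumed to be
    // a power of two (4 on 32-bit x86, 8 on x86_64).
    static List<Chunk> split(long address, long length, int maxSize) {
        List<Chunk> chunks = new ArrayList<>();
        long end = address + length;
        long cursor = address;
        while (cursor < end) {
            int size = maxSize;
            while (size > 1 && (cursor % size != 0 || cursor + size > end)) {
                size /= 2;
            }
            chunks.add(new Chunk(cursor, size));
            cursor += size;
        }
        return chunks;
    }

    public static void main(String[] args) {
        // An 11-byte region starting at an odd address, 8-byte maximum width.
        System.out.println(split(0x1003, 11, 8));
    }
}
```

x86 only has four debug address registers, so a region that splits into more than four chunks could not be watched purely in hardware under these assumptions.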

Frysk man pages

Mark commented in one of the Frysk utility posts that the Frysk man pages are now online at the Frysk website. Cool. I’ll back-date the posts to point to them.

Development Tools and Change

A new task I am working on is hardware watchpoints. This is with Frysk, and it is just one of those things a debugger must eventually have. But as I worked on this, and other parts of Frysk during the last two years, my view of just what a debugger is (or is not) has changed. And it continues to evolve. Something I’ve been thinking a lot about lately is the usefulness of a monolithic debugger. To be blunt, I think the days of one monolithic does-everything tool are gone. As software complexity increases – and so too the problems that software attempts to solve – so must the elegance and smartness of the tools. Smart engineers need smart tools to write smart programs. Or something to that effect; I’m not a wordsmith. But simply put, I don’t think the old model of a massive, monolithic command-line or IDE-based debugger is going to work anymore. Such debuggers are too complex themselves, and not nimble enough to hop, skip and dance to the ever-evolving and more complex demands of the development community. That being said, I’ve done a little exploring to see how things are evolving in this area. Recently I looked at Eclipse’s focus on the debugger/tool integration approach.

From that IDE-down approach, I investigated the DSDP Eclipse project. This initiative seems to be making headway by addressing the many-tools idea. To me it indicates that the authors of this project think that integrating a debugger (I use the term broadly) into an IDE should be a debugger-team side task. And because they have published interfaces for this, I would expect many tools to implement them. It avoids the use of wire protocols, and defines a set of interfaces to implement via API. They’ve basically opened the gates, and neatly put the solution to external debugger/tools integration into the hands of people who know their development tool best.

So beyond my individual musings, when I see good work like this – this kind of outreach – I have to ask myself some fundamental questions. Are we in the development-tools community listening to our users? What do they want? Do they want this idea of more stand-alone tools solving specific problems? Are we listening to our own community?

I think the first battlefield for platform adoption is tools.

I took a brief look on Linux at the broad area of development tools. And I was really impressed at the quality of fantastic tools like Systemtap, Valgrind, Oprofile, and many other problem-domain solvers out there. These tools are already helping programmers solve problems that conventional debuggers either handled imperfectly, or not at all. And because they are more problem-specific they have a different approach to the monumental scaling difficulties of the one monolithic debugger. This is good stuff, and I really like the range of services offered to the developer here.

So with all that in mind, and as I look out on the development tool horizon, I ask myself about the commonality of my work. Should I be attempting to write the hardware watchpoints to be as common and modular as possible, so that more tools can use them? Is this a domain-specific Frysk problem? Is such a generic rendering of the code foolhardy? And what about language barriers? If the problem space of software forensics, fault detection and remediation continues to trend this way, we should all be asking ourselves those questions.

A tour of Frysk’s utilities (part 3)

This final whistle-stop tour of Frysk’s utilities will cover: fstack, ftrace and fhpd.

fstack – Similar in functionality to pstack: this utility displays the process stack. It will work with a live process (i.e. specify fstack <pid>) or with a corefile (i.e. fstack <corefile> <exe>). In the corefile case, the stack will represent how it was when the coredump was taken. For example:

sleep 5000 &
[1] 695

fcore 695
fstack core.695 /bin/sleep 

Task #695
#0 0x00000038cba9ac30 in __nanosleep_nocancel () from .../
#1 0x0000000000402f2b in rpl_nanosleep() .../nanosleep.c#71
#2 0x0000000000402a54 in xnanosleep() .../xnanosleep.c#100
#3 0x000000000040158c in main () from .../sleep
#4 0x00000038cba1e074 in __libc_start_main () from .../
#5 0x00000000004011d9 in _start () from .../sleep

ftrace – this utility will either attach to a process, or run an executable specified on the command-line, and trace its system-calls. For example, trace the write system-call in /bin/ls:

ftrace -sys="write" /bin/ls foo
2803.2803 attached /bin/ls

/bin/ls: 2803.2803 syscall write(2, "/bin/ls: ", 9) = -1 ERRNO=38
cannot access foo

2803.2803 syscall write(2, "cannot access foo", 17) = -1 ERRNO=38
: No such file or directory

2803.2803 syscall write(2, ": No such file or di...", 27) = -1 ERRNO=38

2803.2803 syscall write(2, " No such file or di...", 1) = -1 ERRNO=38
2803.2803 exited with status 2

In the above example, if we wanted to look at all system calls, we would specify -sys="*".

There are many useful actions and filters that ftrace can perform on the process. For example, -stack will print a stack back-trace whenever the matched system-call is detected in the process. A few more examples: specify “-p pid” to attach to an existing process, “-c” to trace a process’ children, “-m” to detect and print when a library is mapped/unmapped, and so on. The scope of use is beyond this little blog post, so I encourage you to use and experiment.

fhpd – a command-line debugger based on the Frysk core (engine, if you will). It is largely based on the HPD specification. A fair and consistent treatment of this debugger is well beyond this blog post, but we’ll look at a few brief examples. In this example, we’ll load up a core-file with its backing executable. You can, of course, also attach to a pid, or load an executable from the command-line.

fhpd core.695 /bin/sleep

Attached to core file: core.695

To look at the stack back-trace you would type:

(fhpd) where
#0 0x00000038cba9ac30 in __nanosleep_nocancel () from /lib64/
#1 0x0000000000402f2b in rpl_nanosleep () from /bin/sleep
#2 0x0000000000402a54 in xnanosleep () from /bin/sleep
#3 0x000000000040158c in main () from /bin/sleep
#4 0x00000038cba1e074 in __libc_start_main () from /lib64/
#5 0x00000000004011d9 in _start () from /bin/sleep

And to look at frame specific information:

(fhpd) down
#1 0x0000000000402f2b in rpl_nanosleep(const struct timespec {
__time_t tv_sec;
long int tv_nsec;
} * requested_delay,struct timespec {
__time_t tv_sec;
long int tv_nsec;
} * remaining_delay) /usr/src/debug/coreutils-6.9/lib/nanosleep.c#71

Followed by a source listing:

(fhpd) list
61     /* nanosleep mishandles large sleeps due to internal overflow
62        problems, so check that the proper amount of time has actually
63        elapsed.  */
65     struct timespec delay = *requested_delay;
66     struct timespec t0;
67     getnow (&t0);
69     for (;;)
70       {
->  71         int r = nanosleep (&delay, remaining_delay);
72         if (r == 0)
73          {
74            time_t secs_sofar;
75            struct timespec now;
76            getnow (&now);
78            secs_sofar = now.tv_sec - t0.tv_sec;
79            if (requested_delay->tv_sec < secs_sofar)
80              return 0;

I’ll stop here. There are many, many commands. We did not even look at breakpoints, or stepping, or loading executables, or hundreds of other things. But this blog is not a tutorial, rather a taste, and I encourage you to experiment, find out for yourself, and play around with fhpd. And where things are broken (Frysk is in constant development) submit patches or bug reports, or come and let us know on IRC (channel: #frysk).

A tour of Frysk’s utilities (part 2)

This round-up (part 2 of 3) will look at ferror, fexe and fmaps. The next (and final in this series) blog entry will cover: fstack, ftrace and fhpd.

ferror – This is a utility that will help you find the source of your programming errors. It does a lot of grepping for you, and allows you to work back from the error message it catches. It does this by watching for the write() system call to be executed in the process or executable specified. When ferror sees a write system call, it matches the write arguments against the search string you provided. When a match occurs, a back-trace is printed. I find my main use is to extract just where an error originates, and to work back from that origin. For example:

ferror -e "No such file or directory" /bin/ls foo/null

Tracing 22661.22661

/bin/ls: cannot access foo/null: No such file or directory

Process is trying to output No such file or directory

Stack trace:

Task #22661
#0 0x00000038cbac6e80 in __write_nocancel () from .../
#1 0x00000038cba6c343 in _IO_file_write@@GLIBC_2.2.5 () from .../
#2 0x00000038cba6d803 in _IO_file_xsputn@@GLIBC_2.2.5 () from .../
#3 0x00000038cba475e8 in buffered_vfprintf () from .../
#4 0x00000038cba430bf in _IO_vfprintf () from .../
#5 0x00000038cba6133b in __fxprintf () from .../
#6 0x00000038cbad3d07 in error_tail () from .../
#7 0x00000038cbad4083 in __error () from .../
#8 0x0000000000402f9b in [unknown] from .../ls
#9 0x0000000000403e0f in [unknown] from .../ls
#10 0x0000000000407542 in [unknown] from .../ls
#11 0x00000038cba1e074 in __libc_start_main () from .../
#12 0x0000000000402369 in [unknown] from .../ls

The above trace gives you a good indication of where to start (and finish).

fexe – This utility will print the backing executable behind a pid or corefile. For example:

 fexe 17305


fmaps – This prints out the process maps of a corefile (simulated), a process (actual, from /proc/<pid>/maps) or an on-disk executable (simulated from the executable’s own maps). Let’s take a corefile example:

sleep 5000 &
[2] 25681

fcore 25681

fmaps core.25681 /bin/sleep 

0x400000-0x405000 r-xp 0x0 -1:-1 -1 /bin/sleep
0x604000-0x605000 rw-p 0x0 -1:-1 -1 /bin/sleep
0x605000-0x626000 rw-p 0x0 -1:-1 -1 /bin/sleep
0x3235400000-0x3235408000 r-xp 0x0 -1:-1 -1 /lib64/
0x3235607000-0x3235608000 r--p 0x0 -1:-1 -1 /lib64/
0x3235608000-0x3235609000 rw-p 0x0 -1:-1 -1 /lib64/
0x38ca800000-0x38ca81b000 r-xp 0x0 -1:-1 -1 /lib64/
0x38caa1a000-0x38caa1b000 r--p 0x0 -1:-1 -1 /lib64/
0x38caa1b000-0x38caa1c000 rw-p 0x0 -1:-1 -1 /lib64/
0x38cba00000-0x38cbb4d000 r-xp 0x0 -1:-1 -1 /lib64/
0x38cbd4d000-0x38cbd51000 r--p 0x0 -1:-1 -1 /lib64/
0x38cbd51000-0x38cbd52000 rw-p 0x0 -1:-1 -1 /lib64/
0x38cbd52000-0x38cbd57000 rw-p 0x0 -1:-1 -1 /lib64/
0x38cc600000-0x38cc616000 r-xp 0x0 -1:-1 -1 /lib64/
0x38cc815000-0x38cc816000 r--p 0x0 -1:-1 -1 /lib64/
0x38cc816000-0x38cc817000 rw-p 0x0 -1:-1 -1 /lib64/
0x38cc817000-0x38cc81b000 rw-p 0x0 -1:-1 -1 /lib64/
0x2aaaaaaab000-0x2aaaaaaad000 rw-p 0x0 -1:-1 -1
0x2aaaaaad2000-0x2aaaaaad4000 rw-p 0x0 -1:-1 -1
0x2aaaaaad4000-0x2aaaaf52e000 r--p 0x0 -1:-1 -1
0x7ffff381c000-0x7ffff3831000 rw-p 0x0 -1:-1 -1
0x7ffff39fe000-0x7ffff3a00000 r-xp 0x0 -1:-1 -1 [vdso]
0xffffffffff600000-0xffffffffff601000 r-xp 0x0 -1:-1 -1

An important point to note in this example: the actual process maps are not stored in a corefile – they have to be reconstructed and “simulated”. This is done by decoding the linkmap table in the corefile, and querying each ELF file object noted in the linkmap. fmaps collates the ELF object file-maps and builds the map-table.

With the live process use case, fmaps 1234 would print out the actual maps as described in the /proc/1234/maps table.
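
If you want to post-process this output, each line splits cleanly on whitespace. The sketch below is a hypothetical parser, not something shipped with Frysk; it assumes the fields are address range, permissions, offset, device, inode and an optional name, in the same order as the kernel’s maps file.

```java
// Hypothetical parser for one line of the fmaps output shown above.
// Field order is assumed to mirror the kernel's maps file:
// range, permissions, offset, device, inode, optional name.
final class MapLine {
    final long start;
    final long end;
    final String permissions;
    final String name; // empty string when the map has no name

    MapLine(long start, long end, String permissions, String name) {
        this.start = start;
        this.end = end;
        this.permissions = permissions;
        this.name = name;
    }

    static MapLine parse(String line) {
        String[] fields = line.trim().split("\\s+", 6);
        String[] range = fields[0].split("-");
        // Strip the leading "0x"; parse as unsigned so that addresses such
        // as 0xffffffffff600000 fit into a Java long.
        long start = Long.parseUnsignedLong(range[0].substring(2), 16);
        long end = Long.parseUnsignedLong(range[1].substring(2), 16);
        String name = fields.length > 5 ? fields[5] : "";
        return new MapLine(start, end, fields[1], name);
    }

    // Size of the mapping in bytes.
    long size() {
        return end - start;
    }

    public static void main(String[] args) {
        MapLine map = parse("0x400000-0x405000 r-xp 0x0 -1:-1 -1 /bin/sleep");
        System.out.println(map.name + " " + map.permissions + " " + map.size() + " bytes");
    }
}
```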

A tour of Frysk’s utilities (part 1)

With Frysk, one of the things we have been pretty good at is exposing functionality to the user when we write it. Normally that means both a command-line utility, like fmaps, and a command in fhpd, like info maps.

Even though each one of these has a man page, and an actual help page in fhpd, it would probably be a good idea to do a round-up. We’ll take a look at fcore, fauxv, fcatch, fdebuginfo and fdebugrpm today, and others in the next blog entry.

fcore – This will capture a corefile from a running process. It attaches to a process, blocks all its threads, creates a corefile of the process and then unblocks all threads. For example:

sleep 5000 &
[1] 24325

fcore 24325

ls -lash core.24325
75M -rw-r--r-- 1 test test 77M 2008-03-03 08:37 core.24325

fauxv – This will print out the process auxiliary vector from either a process, executable or corefile:

 fauxv core.24325 /bin/sleep

AT_HWCAP (Machine dependent hints about) : 0x178bfbff
AT_PAGESZ (System page size) : 4096
AT_CLKTCK (Frequency of times()) : 100
AT_PHDR (Program headers for program) : 0x400040
AT_PHENT (Size of program header entry) : 56
AT_PHNUM (Number of program headers) : 8
AT_BASE (Base address of interpreter) : 0
AT_FLAGS (Flags) : 0
AT_ENTRY (Entry point of program) : [unknown] (0x4011b0)
AT_UID (Real uid) : 500
AT_EUID (Effective uid) : 500
AT_GID (Real gid) : 500
AT_EGID (Effective gid) : 500
AT_0x17 (AT_0x17) : 0
AT_PLATFORM (String identifying platform.) : x86_64
AT_NULL (End of vector) : 0
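
The vector itself is just a run of (type, value) pairs ending in an AT_NULL entry. The sketch below shows one way such a listing could be decoded; it is not fauxv’s implementation. The class is hypothetical, it assumes a 64-bit little-endian vector, and only a handful of type numbers from elf.h are included.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not fauxv's code: decodes a 64-bit little-endian
// auxiliary vector, i.e. (type, value) pairs of 8-byte words that end
// with an AT_NULL entry.
final class AuxvDecoder {

    static final long AT_NULL = 0;

    // A few type numbers from <elf.h>; the table is deliberately incomplete.
    private static final Map<Long, String> NAMES = new HashMap<>();
    static {
        NAMES.put(0L, "AT_NULL");
        NAMES.put(3L, "AT_PHDR");
        NAMES.put(4L, "AT_PHENT");
        NAMES.put(5L, "AT_PHNUM");
        NAMES.put(6L, "AT_PAGESZ");
        NAMES.put(7L, "AT_BASE");
        NAMES.put(8L, "AT_FLAGS");
        NAMES.put(9L, "AT_ENTRY");
        NAMES.put(17L, "AT_CLKTCK");
    }

    // Types missing from the table are rendered as AT_0x<hex>.
    static String name(long type) {
        return NAMES.getOrDefault(type, "AT_0x" + Long.toHexString(type));
    }

    // Reads pairs until AT_NULL (inclusive) or until the data runs out.
    static Map<String, Long> decode(byte[] raw) {
        ByteBuffer buffer = ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN);
        Map<String, Long> entries = new LinkedHashMap<>();
        while (buffer.remaining() >= 16) {
            long type = buffer.getLong();
            long value = buffer.getLong();
            entries.put(name(type), value);
            if (type == AT_NULL) {
                break;
            }
        }
        return entries;
    }

    // Builds a small made-up vector so the decoder can be tried out.
    static byte[] sample() {
        ByteBuffer buffer = ByteBuffer.allocate(4 * 16).order(ByteOrder.LITTLE_ENDIAN);
        buffer.putLong(6).putLong(4096);  // AT_PAGESZ
        buffer.putLong(17).putLong(100);  // AT_CLKTCK
        buffer.putLong(0x17).putLong(0);  // type 23, not in the table above
        buffer.putLong(0).putLong(0);     // AT_NULL terminator
        return buffer.array();
    }

    public static void main(String[] args) {
        System.out.println(decode(sample()));
    }
}
```

Type 23 (0x17) is AT_SECURE in glibc’s elf.h, which is presumably why it shows up as AT_0x17 in the listing above: a decoder can only name the types its table knows about.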

fcatch – If given a pid, it will attach to a running process. If given an executable, it will run it and then attach to that process. When attached, fcatch will monitor the process for error conditions (e.g. SIGSEGV) and, when it detects one, will print a stack back-trace:

 fcatch ../pkglibdir/funit-hello
fcatch: from PID 24348 TID 24348:
SIGSEGV(11) detected - dumping stack trace for TID 24348
#0 0x00000000004004bc in print () from /test/frysk_bin/frysk-core/frysk/pkglibdir/funit-hello
#1 0x00000000004004eb in main () from /test/frysk_bin/frysk-core/frysk/pkglibdir/funit-hello
#2 0x0000003eb641e074 in __libc_start_main () from /lib64/
#3 0x0000000000400409 in _start () from /test/frysk_bin/frysk-core/frysk/pkglibdir/funit-hello

fdebuginfo and fdebugrpm – These utilities are so close in functionality we’ll look at them together. fdebuginfo will print out the debuginfo known to be installed on a system for a pid, executable or corefile:

 fdebuginfo 24325
/bin/sleep ---
/lib64/ ---
/lib64/ ---
/lib64/ ---
/lib64/ ---

And fdebugrpm will install that debuginfo and dependencies:

 fdebugrpm 24325

Missing Debuginfo package(s)

Do you wish to install the above packages? [y/n]: y

--> Running transaction check
---> Package glibc-debuginfo.x86_64 0:2.7-2 set to be updated
---> Package coreutils-debuginfo.x86_64 0:6.9-13.fc8 set to be updated
--> Finished Dependency Resolution

Dependencies Resolved

Package                 Arch       Version          Repository        Size
coreutils-debuginfo     x86_64     6.9-13.fc8       updates-debuginfo  3.5 M
glibc-debuginfo         x86_64     2.7-2            fedora-debuginfo   20 M

Transaction Summary
Install      2 Package(s)
Update       0 Package(s)
Remove       0 Package(s)

Total download size: 24 M
Is this ok [y/N]: y

Running fdebuginfo again after fdebugrpm:

fdebuginfo 24325
/bin/sleep /usr/lib/debug/bin/sleep.debug
/lib64/ /usr/lib/debug/lib64/
/lib64/ /usr/lib/debug/lib64/
/lib64/ /usr/lib/debug/lib64/
/lib64/ /usr/lib/debug/lib64/

As I hinted at earlier, both these commands work with executables, pids, and corefiles, so:

fdebuginfo core.24325 /bin/sleep

is just as valid.

Hope you enjoyed part 1.

Frysk and selective coredumps

Been working hard on Frysk over the last year, and hardware watchpoints are coming along quite nicely. I stopped back in core file land again recently, to update some code and add a few features. There was a bit of discussion some months ago on creating partial or selective user-land core dumps. As a core dump can be very large (especially if the process’s heap is large), constructing the resulting file can be time consuming. So I added a feature to fcore (the Frysk core dump command-line utility) to allow selective segment inclusion and exclusion.

So for example:

sleep 100 &

[1] 7480

fcore -o partialcore -segments="stack|heap|vdso|($^)" 7480

This will dump the segments that match the program’s stack, heap, virtual dynamic shared object (vdso) and any maps without a name (maps not backed by a file). As core files are sparse, we have to use the -s option in ls to look at the size difference:

X86_64, Fedora 8

frysk-core/frysk/bindir/fcore -o partialcore -segments="stack|heap|vdso|($^)" 7480
frysk-core/frysk/bindir/fcore -o fullcore -allmaps 7480
frysk-core/frysk/bindir/fcore -o normalcore 7480

ls -lash partialcore.7480 fullcore.7480 normalcore.7480
77M -rw-r--r-- 1 test test 77M 2008-02-29 10:01 fullcore.7480
75M -rw-r--r-- 1 test test 77M 2008-02-29 10:02 normalcore.7480
312K -rw-r--r-- 1 test test 77M 2008-02-29 10:01 partialcore.7480
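
To see why the ($^) alternative picks out the unnamed maps, here is a small sketch of the matching. It is not fcore’s actual code: the class is hypothetical, and it assumes the -segments value is treated as a java.util.regex pattern searched for inside each map name, with an unnamed map having the empty string as its name.

```java
import java.util.regex.Pattern;

// Hypothetical illustration, not fcore's code. Assumptions: the -segments
// value is a java.util.regex pattern, it is searched for anywhere inside
// the map name, and an unnamed map has the empty string as its name.
final class SegmentFilter {

    private final Pattern pattern;

    SegmentFilter(String regex) {
        this.pattern = Pattern.compile(regex);
    }

    // True when the pattern occurs somewhere in the map name.
    boolean selects(String mapName) {
        return pattern.matcher(mapName).find();
    }

    public static void main(String[] args) {
        SegmentFilter filter = new SegmentFilter("stack|heap|vdso|($^)");
        String[] names = {"[stack]", "[heap]", "[vdso]", "", "/bin/sleep"};
        for (String name : names) {
            System.out.println("'" + name + "' -> " + filter.selects(name));
        }
    }
}
```

The $^ alternative can only match where the end of the input and the start of the input coincide, which is the empty name. Note that under the “search anywhere” assumption a file-backed map whose path happens to contain “stack” or “heap” would be selected too.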

These are small examples, but I’d like to see if this feature is useful to people. If it proves to be so, it can be expanded to other criteria beyond the map name. Right now this exists in Frysk’s Git repository, which can be cloned at:

git clone ssh://

And you can reach us on irc at: #frysk, at

It should find its way into the frysk rpm in Fedora 8 very soon.

Frysk and Core Dumps

Daylight! I managed to get some interesting initial code working last week. Finally managed to get the Frysk (Java based) API for representing core dumps into something usable (at least for .note data). It follows the Host = Machine, Proc = Process, Task = Thread abstract model that we already use to represent Linux Ptrace Hosts. So …

Host coreHost = new LinuxCoreFileHost(Manager.eventLoop, new File("your-core-file"));

This will read in the core file, decipher the .note segment, and build one Proc from the PRPSINFO header and data. After the Proc is built, it will add one Task for each PRSTATUS note it finds, and parent each to the Proc. It will also build the process auxv data from the AUXV note entry. FPREGSET is not quite done yet, but “soon”.

After the core file is loaded into a Host, you might want to look at the process represented there. This can be done in the usual ways, via observers, finders or simply like:

Proc proc = coreHost.getProc(new ProcId(31497));

Now you can look at the process’s data (including process auxiliary data), the command line the process was executed with, the pid, ppid, sid, etc the process was operating under when it was core dumped. And lots more.

If you want to look at the threads that were owned by that process when it was core dumped, you can get the main thread using something like:

Task task = proc.getMainTask();

and all tasks via the getTasks() api.

To look at a thread’s register in a core file:

Isa isa = task.getIsa();
long ebx = isa.getRegisterByName("ebx").get(task);
System.out.println("ebx register = " + ebx);

All of these interfaces are the same as the Frysk Ptrace interfaces, so slotting in a core file where a Ptrace process was previously used is just a matter of abstraction. My ultimate goal is that when you load a core file into the Frysk UI, it will open the source browser, navigate to the fault location, show the code (and highlight the fault), and provide an interactive view of the backtrace.

There is still a lot of work to do, especially constructing the memory interface so that it is indistinguishable from a “live process” memory access, and also reconstructing the segments of the process map that were not captured in the corefile.

On another note, there is a new Frysk planet. There are also other Frysk bloggers busily writing articles. Sami Wagiaalla and Mike Cvet have been working on, and writing about the Frysk UI.

Frysk Utilities

In the last couple of months, we’ve been working not only on Frysk and the Frysk UI, but also on a collection of standalone utilities. These use the Frysk core to do most of the system-level work. A few of the ones we are working on at the moment:

  • fcore – a utility to attach to a running process, extract and write a multi-threaded core dump (with notes), detach and let the process continue.
  • fstep – a utility to step over each instruction in a process, and print the corresponding asm to the console.
  • fstack – a utility to attach to a running process, extract a multi-threaded stack-trace, detach and let the process continue.
  • ftrace – a utility to trace system calls that a process is currently executing.
  • fcatch* – a utility to run a program until termination, then print out a stack-trace on exit (so one can see the back-trace of where it exited).

There are a lot of areas where these tools can be useful, especially in directed use rather than the more generic “watch and observe” behavior of the Frysk UI. There are also a lot of places where we need more. fauxv, for example, would print out a running process’s auxv data in human-readable form. Any more ideas for directed utilities?

* fcatch is a temporary name. We asked on the Frysk list for ideas for names.

Copyright © Phil Muldoon
