Mark commented in one of the Frysk utility posts that the Frysk man pages are now online at the Frysk website. Cool. I’ll back-date the posts to point to them.
Development Tools and Change
A new task I am working on is hardware watchpoints. This is with Frysk, and it is just one of those things a debugger must eventually have. But as I worked on this, and other parts of Frysk during the last two years, my view of just what a debugger is (or is not) has changed. And it continues to evolve. Something I’ve been thinking about a lot lately is the usefulness of a monolithic debugger. To be blunt, I think the days of one monolithic does-everything tool are gone. As software complexity increases – and so too the problems that software attempts to solve – so must the elegance and smartness of the tools. Smart engineers need smart tools to write smart programs. Or something to that effect, I’m not a wordsmith. But simply put, I don’t think the old model of a massive, monolithic command-line or IDE-based debugger is going to work anymore. Such debuggers are too complex themselves, and not nimble enough to hop, skip and dance to the ever-evolving and more complex demands of the development community. That being said, I’ve done a little exploring to see how things are evolving in this area. Recently I looked at Eclipse’s focus on the debugger/tool integration approach.
From that IDE-down approach, I investigated the DSDP Eclipse project. This initiative seems to be making headway by addressing the many-tools idea. To me it indicates that the authors of this project think that integrating a debugger (I use the term broadly) into an IDE should be a debugger-team side task. And because they have published interfaces for this, I would expect many tools to implement them. It avoids the use of wire protocols, and defines a set of interfaces to implement via API. They’ve basically opened the gates, and neatly put the solution to external debugger/tools integration into the hands of the people who know their development tools best.
So beyond my individual musings, when I see good work like this – this kind of outreach – I have to ask myself some fundamental questions. Are we in the development-tools community listening to our users? What do they want? Do they want this idea of more stand-alone tools solving specific problems? Are we listening to our own community?
I think the first battlefield for platform adoption is tools.
I took a brief look on Linux at the broad area of development tools. And I was really impressed at the quality of fantastic tools like Systemtap, Valgrind, Oprofile, and many other problem-domain solvers out there. These tools are already helping programmers solve problems that conventional debuggers either solved imperfectly, or not at all. And because they are more problem-specific, they take a different approach to the monumental scaling difficulties of the one monolithic debugger. This is good stuff, and I really like the range of services offered to the developer here.
So with all that in mind, and as I look out on the development tool horizon, I ask myself about the commonality of my work. Should I be attempting to write the hardware watchpoints to be as common and modular as possible, so that more tools can use them? Is this a domain-specific Frysk problem? Is such a generic rendering of the code foolhardy? And what about language barriers? If the problem space of software forensics, fault-detection and remediation continues to trend this way, we should all be asking ourselves those questions.
A tour of Frysk’s utilities (part 3)
This final whistle stop tour of Frysk’s utilities will cover: fstack, ftrace and fhpd.
fstack – Similar in functionality to pstack, this utility displays the process stack. It works with a live process (i.e. fstack <pid>) or with a corefile (i.e. fstack <corefile> <exe>). In the corefile case, the stack will represent how it was when the coredump was taken. For example:
sleep 5000 &
[1] 695
fcore 695
fstack core.695 /bin/sleep
Task #695
#0 0x00000038cba9ac30 in __nanosleep_nocancel () from .../libc.so.6
#1 0x0000000000402f2b in rpl_nanosleep() .../nanosleep.c#71
#2 0x0000000000402a54 in xnanosleep() .../xnanosleep.c#100
#3 0x000000000040158c in main () from .../sleep
#4 0x00000038cba1e074 in __libc_start_main () from .../libc.so.6
#5 0x00000000004011d9 in _start () from .../sleep
ftrace – This utility will either attach to a process, or run an executable specified on the command-line, and trace its system-calls. For example, trace the write system-call in /bin/ls:
ftrace -sys="write" /bin/ls foo
2803.2803 attached /bin/ls
/bin/ls: 2803.2803 syscall write(2, "/bin/ls: ", 9) = -1 ERRNO=38
cannot access foo 2803.2803 syscall write(2, "cannot access foo", 17) = -1 ERRNO=38
: No such file or directory 2803.2803 syscall write(2, ": No such file or di...", 27) = -1 ERRNO=38
2803.2803 syscall write(2, " No such file or di...", 1) = -1 ERRNO=38
2803.2803 exited with status 2
In the above example, if we wanted to look at all system calls, we would specify -sys="*".
There are many useful actions and filters that ftrace can perform on the process. For example, -stack will print a stack back-trace whenever the matched system-call is detected in the process. A few more examples: specify “-p pid” to attach to an existing process, “-c” to trace a process’ children, “-m” to detect and print when a library is mapped/unmapped, and so on. The scope of use is beyond this little blog post, so I encourage you to use and experiment.
fhpd – This is a command-line debugger based on the Frysk core (engine, if you will). It is largely based on the HPD specification. Covering this debugger fairly and thoroughly is well beyond this blog post, but we’ll look at a few brief examples. In this example, we’ll load up a core-file with its backing executable. You can, of course, also attach to a pid, or load an executable from the command-line.
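The syscall lines in the ftrace sample above have a regular enough shape to post-process with a short script. The following is only a sketch in Python: the field layout (pid.tid, the word syscall, the call, the return value and an optional ERRNO) is inferred from that one sample rather than from any ftrace documentation, and the names SyscallEvent and parse_syscall_line are made up for illustration.

```python
import re
from typing import NamedTuple, Optional

# Layout assumed from the sample output in this post (not from ftrace docs):
#   <pid>.<tid> syscall <name>(<args>) = <ret> [ERRNO=<n>]
SYSCALL_LINE = re.compile(
    r"(?P<pid>\d+)\.(?P<tid>\d+) syscall (?P<name>\w+)"
    r"\((?P<args>.*)\) = (?P<ret>-?\d+)(?: ERRNO=(?P<errno>\d+))?"
)


class SyscallEvent(NamedTuple):
    """One traced system call (hypothetical record type)."""
    pid: int
    tid: int
    name: str
    args: str
    ret: int
    errno: Optional[int]


def parse_syscall_line(line: str) -> Optional[SyscallEvent]:
    """Return the syscall record found in a trace line, or None.

    search() is used instead of match() because the traced program's own
    output can appear on the same line before the trace record.
    """
    m = SYSCALL_LINE.search(line)
    if m is None:
        return None
    errno = m.group("errno")
    return SyscallEvent(
        pid=int(m.group("pid")),
        tid=int(m.group("tid")),
        name=m.group("name"),
        args=m.group("args"),
        ret=int(m.group("ret")),
        errno=int(errno) if errno is not None else None,
    )
```

Lines such as "2803.2803 attached /bin/ls" simply return None, so a caller can feed the whole trace through the function and keep only the records.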
fhpd core.695 /bin/sleep
Attached to core file: core.695
To look at the stack back-trace you would type:
(fhpd) where
#0 0x00000038cba9ac30 in __nanosleep_nocancel () from /lib64/libc.so.6
#1 0x0000000000402f2b in rpl_nanosleep () from /bin/sleep
#2 0x0000000000402a54 in xnanosleep () from /bin/sleep
#3 0x000000000040158c in main () from /bin/sleep
#4 0x00000038cba1e074 in __libc_start_main () from /lib64/libc.so.6
#5 0x00000000004011d9 in _start () from /bin/sleep
And to look at frame specific information:
(fhpd) down
#1 0x0000000000402f2b in rpl_nanosleep(const struct timespec {
__time_t tv_sec;
long int tv_nsec;
} * requested_delay,struct timespec {
__time_t tv_sec;
long int tv_nsec;
} * remaining_delay) /usr/src/debug/coreutils-6.9/lib/nanosleep.c#71
Followed by a source listing:
(fhpd) list
[0.0]
61 /* nanosleep mishandles large sleeps due to internal overflow
62 problems, so check that the proper amount of time has actually
63 elapsed. */
64
65 struct timespec delay = *requested_delay;
66 struct timespec t0;
67 getnow (&t0);
68
69 for (;;)
70 {
-> 71 int r = nanosleep (&delay, remaining_delay);
72 if (r == 0)
73 {
74 time_t secs_sofar;
75 struct timespec now;
76 getnow (&now);
77
78 secs_sofar = now.tv_sec - t0.tv_sec;
79 if (requested_delay->tv_sec < secs_sofar)
80 return 0;
I’ll stop here. There are many, many commands. We did not even look at breakpoints, stepping, loading executables or hundreds of other things. But this blog is not a tutorial, rather a taste, and I encourage you to experiment, find out for yourself, and play around with fhpd. And where things are broken (Frysk is in constant development), submit patches or bug reports, or come and let us know on IRC (irc.gimp.org, channel: #frysk).
A tour of Frysk’s utilities (part 2)
This round-up (part 2 of 3) will look at ferror, fexe and fmaps. The next (and final in this series) blog entry will cover: fstack, ftrace and fhpd.
ferror – This is a utility that will help you find the source of your programming errors. It does a lot of grepping for you, and allows you to work back from the error message it catches. It does this by watching for the write() system call to be executed in the process or executable specified. When ferror sees a write system call, it matches the write arguments against the search string you provided. When a match occurs, a back-trace is printed. I find my main use is to extract just where an error originates, and to work back from that origin. For example:
ferror -e "No such file or directory" /bin/ls foo/null
Tracing 22661.22661
/bin/ls: cannot access foo/null: No such file or directory
Process is trying to output No such file or directory
Stack trace:
Task #22661
#0 0x00000038cbac6e80 in __write_nocancel () from .../libc-2.7.so
#1 0x00000038cba6c343 in _IO_file_write@@GLIBC_2.2.5 () from .../libc-2.7.so
#2 0x00000038cba6d803 in _IO_file_xsputn@@GLIBC_2.2.5 () from .../libc-2.7.so
#3 0x00000038cba475e8 in buffered_vfprintf () from .../libc-2.7.so
#4 0x00000038cba430bf in _IO_vfprintf () from .../libc-2.7.so
#5 0x00000038cba6133b in __fxprintf () from .../libc-2.7.so
#6 0x00000038cbad3d07 in error_tail () from .../libc-2.7.so
#7 0x00000038cbad4083 in __error () from .../libc-2.7.so
#8 0x0000000000402f9b in [unknown] from .../ls
#9 0x0000000000403e0f in [unknown] from .../ls
#10 0x0000000000407542 in [unknown] from .../ls
#11 0x00000038cba1e074 in __libc_start_main () from .../libc-2.7.so
#12 0x0000000000402369 in [unknown] from .../ls
The above trace gives you a good indication of where to start (and finish).
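To make the matching step concrete, here is a toy model of it in Python. It is not how ferror is implemented: it assumes a plain substring test against each write payload (ferror may well match differently), and the event shape and the name find_matching_writes are invented for this sketch. The on_match callback stands in for "print a back-trace".

```python
from typing import Callable, Iterable, Tuple

# Hypothetical event shape: (file descriptor, bytes passed to write()).
WriteEvent = Tuple[int, bytes]


def find_matching_writes(
    events: Iterable[WriteEvent],
    pattern: str,
    on_match: Callable[[int, bytes], None],
) -> int:
    """Call on_match for every write whose payload contains pattern.

    Returns the number of matching writes. Each write is tested on its
    own, so a message split across two writes would not be found.
    """
    needle = pattern.encode()
    hits = 0
    for fd, payload in events:
        if needle in payload:
            hits += 1
            on_match(fd, payload)  # a real tool would dump the stack here
    return hits
```

The per-write test matters in practice: the ftrace sample in part 3 shows ls emitting this error as several separate write calls, so a search string that straddles two of them would go unnoticed under this simple model.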
fexe – This utility will print the backing executable behind a pid or corefile. For example:
fexe 17305
/usr/lib64/firefox-2.0.0.12/firefox-bin
fmaps – This prints out the process maps of a corefile (simulated), a process (actual, from /proc/<pid>/maps) or an on-disk executable (simulated from the executable’s maps). Let’s take a corefile example:
sleep 5000 &
[2] 25681
fcore 25681
fmaps core.25681 /bin/sleep
0x400000-0x405000 r-xp 0x0 -1:-1 -1 /bin/sleep
0x604000-0x605000 rw-p 0x0 -1:-1 -1 /bin/sleep
0x605000-0x626000 rw-p 0x0 -1:-1 -1 /bin/sleep
0x3235400000-0x3235408000 r-xp 0x0 -1:-1 -1 /lib64/librt.so.1
0x3235607000-0x3235608000 r--p 0x0 -1:-1 -1 /lib64/librt.so.1
0x3235608000-0x3235609000 rw-p 0x0 -1:-1 -1 /lib64/librt.so.1
0x38ca800000-0x38ca81b000 r-xp 0x0 -1:-1 -1 /lib64/ld-linux-x86-64.so.2
0x38caa1a000-0x38caa1b000 r--p 0x0 -1:-1 -1 /lib64/ld-linux-x86-64.so.2
0x38caa1b000-0x38caa1c000 rw-p 0x0 -1:-1 -1 /lib64/ld-linux-x86-64.so.2
0x38cba00000-0x38cbb4d000 r-xp 0x0 -1:-1 -1 /lib64/libc.so.6
0x38cbd4d000-0x38cbd51000 r--p 0x0 -1:-1 -1 /lib64/libc.so.6
0x38cbd51000-0x38cbd52000 rw-p 0x0 -1:-1 -1 /lib64/libc.so.6
0x38cbd52000-0x38cbd57000 rw-p 0x0 -1:-1 -1 /lib64/libc.so.6
0x38cc600000-0x38cc616000 r-xp 0x0 -1:-1 -1 /lib64/libpthread.so.0
0x38cc815000-0x38cc816000 r--p 0x0 -1:-1 -1 /lib64/libpthread.so.0
0x38cc816000-0x38cc817000 rw-p 0x0 -1:-1 -1 /lib64/libpthread.so.0
0x38cc817000-0x38cc81b000 rw-p 0x0 -1:-1 -1 /lib64/libpthread.so.0
0x2aaaaaaab000-0x2aaaaaaad000 rw-p 0x0 -1:-1 -1
0x2aaaaaad2000-0x2aaaaaad4000 rw-p 0x0 -1:-1 -1
0x2aaaaaad4000-0x2aaaaf52e000 r--p 0x0 -1:-1 -1
0x7ffff381c000-0x7ffff3831000 rw-p 0x0 -1:-1 -1
0x7ffff39fe000-0x7ffff3a00000 r-xp 0x0 -1:-1 -1 [vdso]
0xffffffffff600000-0xffffffffff601000 r-xp 0x0 -1:-1 -1
An important point to note in this example: the actual process maps are not stored in a corefile – they have to be reconstructed and “simulated”. This is done by disassembling the linkmap table in the corefile, and querying each ELF file object noted in the linkmap. Frysk then collates the ELF object file maps and builds the map table.
With the live process use case, fmaps 1234 would print out the actual maps as described in the /proc/1234/maps table.
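For the live-process case, the kernel's maps table is a plain text file, one mapping per line: address range, permissions, offset, device, inode and an optional path. A minimal Python sketch of reading it follows. This is an illustration rather than anything Frysk does; MapEntry, parse_maps_line and read_maps are names invented here.

```python
from typing import List, NamedTuple, Union


class MapEntry(NamedTuple):
    """One line of a maps table (hypothetical record type)."""
    start: int
    end: int
    perms: str
    offset: int
    dev: str
    inode: int
    path: str


def parse_maps_line(line: str) -> MapEntry:
    """Parse 'start-end perms offset dev inode [path]' into a MapEntry."""
    fields = line.split(None, 5)
    addr, perms, offset, dev, inode = fields[:5]
    # The path column is optional (anonymous mappings have none).
    path = fields[5].strip() if len(fields) > 5 else ""
    # int(x, 16) accepts both bare hex (kernel style) and 0x-prefixed hex.
    start, end = (int(part, 16) for part in addr.split("-"))
    return MapEntry(start, end, perms, int(offset, 16), dev, int(inode), path)


def read_maps(pid: Union[int, str] = "self") -> List[MapEntry]:
    """Read /proc/<pid>/maps on Linux; the default is the current process."""
    with open(f"/proc/{pid}/maps") as maps_file:
        return [parse_maps_line(line) for line in maps_file if line.strip()]
```

Because the hex parsing tolerates a 0x prefix and the inode is read as a signed integer, the same function also happens to accept the lines printed in the fmaps corefile example above, where the device and inode are shown as -1.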
A tour of Frysk’s utilities (part 1)
With Frysk, one of the things we have been pretty good at is exposing functionality to the user when we write it. Normally that is a command-line utility like fmaps, as well as a command function in fhpd: info maps.
Even though each one of these has a man page, and an actual help page in fhpd, it would probably be a good idea to do a round-up. We’ll take a look at fcore, fauxv, fcatch, fdebuginfo and fdebugrpm today, and others in the next blog entry.
fcore – This will capture a corefile from a running process. It attaches to a process, blocks all its threads, creates a corefile of the process and then unblocks all threads. For example:
sleep 5000 &
[1] 24325
fcore 24325
ls -lash core.24325
75M -rw-r--r-- 1 test test 77M 2008-03-03 08:37 core.24325
fauxv – This will print out the process auxiliary vector from either a process, executable or corefile:
fauxv core.24325 /bin/sleep
AT_SYSINFO_EHDR (SYSINFO EHDR) : 0x7fff0d1fe000
AT_HWCAP (Machine dependent hints about) : 0x178bfbff
AT_PAGESZ (System page size) : 4096
AT_CLKTCK (Frequency of times()) : 100
AT_PHDR (Program headers for program) : 0x400040
AT_PHENT (Size of program header entry) : 56
AT_PHNUM (Number of program headers) : 8
AT_BASE (Base address of interpreter) : 0
AT_FLAGS (Flags) : 0
AT_ENTRY (Entry point of program) : [unknown] (0x4011b0)
AT_UID (Real uid) : 500
AT_EUID (Effective uid) : 500
AT_GID (Real gid) : 500
AT_EGID (Effective gid) : 500
AT_0x17 (AT_0x17) : 0
AT_PLATFORM (String identifying platform.) : x86_64
AT_NULL (End of vector) : 0
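For a feel of what sits underneath output like this: on Linux the raw vector of a live process can be read from /proc/<pid>/auxv as a sequence of (type, value) word pairs ending in AT_NULL. The Python sketch below decodes such a buffer. It is an illustration only, not Frysk code; it assumes little-endian words (as on x86_64), the function name parse_auxv is made up, and the name table covers just the types that appear in the sample, using the numbering from the Linux ELF headers, where type 0x17 is AT_SECURE.

```python
import struct
from typing import Dict

# Subset of the AT_* type numbers from the Linux ELF headers.
AT_NAMES = {
    0: "AT_NULL", 3: "AT_PHDR", 4: "AT_PHENT", 5: "AT_PHNUM",
    6: "AT_PAGESZ", 7: "AT_BASE", 8: "AT_FLAGS", 9: "AT_ENTRY",
    11: "AT_UID", 12: "AT_EUID", 13: "AT_GID", 14: "AT_EGID",
    15: "AT_PLATFORM", 16: "AT_HWCAP", 17: "AT_CLKTCK",
    23: "AT_SECURE", 33: "AT_SYSINFO_EHDR",
}


def parse_auxv(raw: bytes, word_size: int = 8) -> Dict[str, int]:
    """Decode a raw auxiliary vector into {name: value}.

    Assumes little-endian words of word_size bytes (8 or 4). Values are
    returned as integers; entries such as AT_PLATFORM hold a pointer
    into the process, not the string itself.
    """
    fmt = "<QQ" if word_size == 8 else "<II"
    pair_size = struct.calcsize(fmt)
    usable = len(raw) - (len(raw) % pair_size)  # ignore a trailing partial pair
    entries: Dict[str, int] = {}
    for a_type, a_val in struct.iter_unpack(fmt, raw[:usable]):
        if a_type == 0:  # AT_NULL terminates the vector
            break
        entries[AT_NAMES.get(a_type, f"AT_{a_type:#x}")] = a_val
    return entries
```

On a Linux box, parse_auxv(open("/proc/self/auxv", "rb").read()) should give the vector of the Python interpreter itself. Turning AT_PLATFORM into a string, as fauxv does, would need a further read of the target's memory, which this sketch leaves out.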
fcatch – If given a pid, it will attach to a running process. If given an executable, it will run it and then attach to that process. Once attached, fcatch monitors the process for error conditions (e.g. SIGSEGV) and, when it detects one, prints a stack back-trace:
fcatch ../pkglibdir/funit-hello
fcatch: from PID 24348 TID 24348: SIGSEGV(11) detected - dumping stack trace for TID 24348
#0 0x00000000004004bc in print () from /test/frysk_bin/frysk-core/frysk/pkglibdir/funit-hello
#1 0x00000000004004eb in main () from /test/frysk_bin/frysk-core/frysk/pkglibdir/funit-hello
#2 0x0000003eb641e074 in __libc_start_main () from /lib64/libc-2.7.so
#3 0x0000000000400409 in _start () from /test/frysk_bin/frysk-core/frysk/pkglibdir/funit-hello
fdebuginfo and fdebugrpm – These utilities are so close in functionality we’ll look at them together. fdebuginfo will print out the debuginfo known to be installed on a system for a pid, executable or corefile:
fdebuginfo 24325
/bin/sleep ---
/lib64/ld-2.7.so ---
/lib64/librt-2.7.so ---
/lib64/libc-2.7.so ---
/lib64/libpthread-2.7.so ---
And fdebugrpm will install that debuginfo and dependencies:
fdebugrpm 24325
Missing Debuginfo package(s)
============================
coreutils-debuginfo-6.9-13.fc8
glibc-debuginfo-2.7-2
Do you wish to install the above packages? [y/n]: y
--> Running transaction check
---> Package glibc-debuginfo.x86_64 0:2.7-2 set to be updated
---> Package coreutils-debuginfo.x86_64 0:6.9-13.fc8 set to be updated
--> Finished Dependency Resolution
Dependencies Resolved
=============================================================================
Package Arch Version Repository Size
=============================================================================
Installing:
coreutils-debuginfo x86_64 6.9-13.fc8 updates-debuginfo 3.5 M
glibc-debuginfo x86_64 2.7-2 fedora-debuginfo 20 M
Transaction Summary
=============================================================================
Install 2 Package(s)
Update 0 Package(s)
Remove 0 Package(s)
Total download size: 24 M
Is this ok [y/N]: y
Running fdebuginfo again after fdebugrpm:
fdebuginfo 24325
/bin/sleep /usr/lib/debug/bin/sleep.debug
/lib64/ld-2.7.so /usr/lib/debug/lib64/ld-2.7.so.debug
/lib64/librt-2.7.so /usr/lib/debug/lib64/librt-2.7.so.debug
/lib64/libc-2.7.so /usr/lib/debug/lib64/libc-2.7.so.debug
/lib64/libpthread-2.7.so /usr/lib/debug/lib64/libpthread-2.7.so.debug
As I hinted at earlier, both these commands work with executables, pids, and corefiles. So:
fdebuginfo core.24325 /bin/sleep
is just as valid.
Hope you enjoyed part 1.