A look at ftrace, or how to write your own Frysk utility

It’s true that Frysk will have a UI that wraps and encompasses its functionality. But there is no reason that a set of other Frysk-based utilities cannot exist as well. A set of tightly focused utilities would complement the UI, and expand the functionality of Frysk. Personally, I’ve always envisioned Frysk as a suite of tools, rather than one monolithic tool.

This suite concept is important, because all Frysk tools will share a common architecture. Utility Foo will have the same architecture as Utility Bar, which in turn uses the same architecture that the Frysk UI is built on. Because of this core-based approach, the resulting tools will be lightweight, and have a much smaller code-base than “from-scratch” utilities. This was a very conscious design decision by the architects of Frysk, primarily so we don’t reinvent the world each time we write a tool. So on that subject, I’d like to explore a Frysk-based utility that was recently written.

Recently, there has been a lot of work done on the System Call Observer in Frysk. A System Call Observer is, as its name implies, a tool for watching a task enter and exit system calls. While that observer code was being written, another standalone tool called ftrace was also written. This tool uses the functionality of the system call observer in a standalone environment, for which there is a clear and existing need in the system-tool world.

ftrace allows you to pass a binary file as an argument. When run, ftrace loads and executes that binary, and watches the system calls that the binary makes. When it sees a call, it prints it to the console. Sound familiar? In many ways it is the same as strace. It’s a new tool and sure to grow, but the entire codebase for the ftrace utility itself is 107 lines. Why so small? Because most of the heavy lifting is done in the Frysk Core.

Before we dive in and have a look: a lot of this post builds on material explored during the last week, so have a quick read back if you missed the other posts. You can find the full ftrace code at these paths:

/frysk-core/frysk/bindir/ftrace.java
and
/frysk-core/frysk/util/Ftrace.java

or here are links to the project webcvs view: ftrace.java and FTrace.java

How does ftrace start? It’s a pretty standard Java program, with a main function, just like you have seen many times before. It also sets up a PrintWriter, and does a brief argument sanity check:


public static void main (String[] args)
{
    final PrintWriter out = new PrintWriter(System.out, true);
    if (args.length == 0)
      {
        out.println("Usage: ftrace pid...>");
        return;
      }

Pretty vanilla so far. The really interesting things here are the two Frysk calls that comprise the rest of the main function:


   Manager.host.requestCreateAttachedProc(args, new AttachedObserver());
   Manager.eventLoop.start();
}

And that is the main function. Looking back at those Frysk calls, the first call takes the arguments from main (in this case, the name of the program that was passed to ftrace, stored in args). It then calls the requestCreateAttachedProc function which asks Frysk to: create the process specified, execute the process, and then attach to that process. The second argument to the requestCreateAttachedProc function attaches the given observer (in this case, AttachedObserver which is defined later on in ftrace) to that process.

The second function call, Manager.eventLoop.start(), simply starts the eventLoop (as discussed in previous posts).
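To make the ordering of those two calls concrete, here is a minimal, self-contained sketch of the request-then-run pattern. It is not Frysk code: SketchEventLoop, request() and EventLoopDemo are hypothetical names, and the sketch assumes that a "request" call only queues work that the event loop later services, which the naming suggests but this post does not spell out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for an event loop; not the Frysk implementation.
class SketchEventLoop {
    private final Queue<Runnable> pending = new ArrayDeque<>();

    // Queue a request; nothing is executed until start() is called.
    void request(Runnable work) {
        pending.add(work);
    }

    // Service queued requests in order until none are left.
    // (A real event loop would normally block and wait for more events.)
    void start() {
        while (!pending.isEmpty()) {
            pending.remove().run();
        }
    }
}

class EventLoopDemo {
    // Returns the order in which things happened, so it can be inspected.
    static List<String> run() {
        List<String> log = new ArrayList<>();
        SketchEventLoop loop = new SketchEventLoop();
        loop.request(() -> log.add("process created and attached"));
        loop.request(() -> log.add("observer notified"));
        log.add("requests queued");
        loop.start();
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

In this sketch "requests queued" is logged first, because nothing runs until start() is called. That is the reason the two calls in main appear in the order they do.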

Let’s take a look at the AttachedObserver that was passed as an argument to the requestCreateAttachedProc function call. It is defined later on in ftrace.java:


private static class AttachedObserver implements TaskObserver.Attached{
    public Action updateAttached (Task task)
    {
      task.requestAddSyscallObserver(new SyscallObserver());
      setProc(task.getProc());
      task.requestUnblock(this);
      return Action.BLOCK;
    }

    public void addFailed (Object observable, Throwable w){
      throw new RuntimeException("Failed to attach to created process", w);
    }

  }

I’ve deleted addedTo and deletedFrom for the sake of brevity (as they were not implemented). We can see that AttachedObserver implements the TaskObserver.Attached interface. The important function here is updateAttached. This function is called when the Task (in our case the mainTask) has been attached by Frysk. As requestCreateAttachedProc() actually attaches to the mainTask of the given process, this observer is called when requestCreateAttachedProc() has completed its work.

If we look in updateAttached(), the first thing it does is add a System Call observer called SyscallObserver. This is defined later on in ftrace.java. For now, we’ll stay with AttachedObserver. It then calls the setProc function, which adds a rudimentary ProcDestroyed observer. This observer, when notified of a destruction event, will stop the event loop and exit. Why do this? Because if the process we have attached to has been destroyed, then it has completed, and our syscall tracing is done.

Time to take stock, and recap so far:

  • args is passed to Manager.host.requestCreateAttachedProc() which denotes the process to be run. So for example: ftrace /bin/uname, would create, attach and run /bin/uname.
  • We add the AttachedObserver as a parameter to requestCreateAttachedProc(), so that when the process is loaded, prepared, and attached to by requestCreateAttachedProc(), this observer will be notified.
  • In AttachedObserver, we add a new System Call observer to the main task of the given process. We also set up a Proc Destroyed Observer that kills the event loop and exits when the given process exits.
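The wiring in that recap can be sketched without Frysk at all. In the sketch below every type (SketchAttached, SketchSyscall, SketchDestroyed, SketchTask) is a hypothetical stand-in I made up for illustration, not a Frysk class. It only shows the shape of the idea: one observer fires on attach and registers the two observers that do the real work.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins; none of these are the Frysk interfaces.
interface SketchAttached { void updateAttached(SketchTask task); }
interface SketchSyscall { void updateSyscallEnter(SketchTask task, int number); }
interface SketchDestroyed { void updateDestroyed(); }

class SketchTask {
    final List<SketchSyscall> syscallObservers = new ArrayList<>();
    final List<SketchDestroyed> destroyedObservers = new ArrayList<>();

    // Simulate the traced program entering one system call.
    void enterSyscall(int number) {
        for (SketchSyscall o : syscallObservers) {
            o.updateSyscallEnter(this, number);
        }
    }

    // Simulate the traced program going away.
    void destroy() {
        for (SketchDestroyed o : destroyedObservers) {
            o.updateDestroyed();
        }
    }
}

class ObserverWiringDemo {
    static List<String> run() {
        List<String> log = new ArrayList<>();

        // On attach: register a syscall observer and a "destroyed" observer.
        SketchAttached attached = task -> {
            task.syscallObservers.add((t, n) -> log.add("syscall " + n));
            task.destroyedObservers.add(() -> log.add("stop event loop"));
        };

        SketchTask mainTask = new SketchTask();
        attached.updateAttached(mainTask); // the process has been attached
        mainTask.enterSyscall(1);          // the program makes some calls
        mainTask.enterSyscall(60);
        mainTask.destroy();                // the program exits
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The design point is that main never touches system calls itself. It hands over one observer, and everything else is registered from inside that observer's callback.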

In essence, all of the code this far into the article was just setup, observer registration, and teardown code. Now that the process is running, and we know when to exit, let’s take a look at the actual System Call Observer. This forms the core of ftrace, and is notified whenever the task it is attached to enters or exits a system call.


   private static class SyscallObserver implements TaskObserver.Syscall{

    public Action updateSyscallEnter (Task task)
    {
      SyscallEventInfo syscallEventInfo;
      try {
          syscallEventInfo = task.getSyscallEventInfo ();
      } catch (Task.TaskException e) {
          // Rethrow as unchecked so the failure is not silently dropped.
          throw new RuntimeException("Failed with task exception: ", e);
      }
      frysk.proc.Syscall syscall = frysk.proc.Syscall.syscallByNum(syscallEventInfo.number(task));
      PrintWriter printWriter = new PrintWriter(System.out);
      printWriter.print(task.getProc().getPid() + "." + task.getTid() + " ");
      syscall.printCall(printWriter, task, syscallEventInfo);
      printWriter.flush();
      return Action.CONTINUE;
    }

    public Action updateSyscallExit (Task task)
    {
      SyscallEventInfo syscallEventInfo;
      try {
          syscallEventInfo = task.getSyscallEventInfo ();
      } catch (Task.TaskException e) {
          // Rethrow as unchecked so the failure is not silently dropped.
          throw new RuntimeException("Failed with task exception: ", e);
      }
      frysk.proc.Syscall syscall = frysk.proc.Syscall.syscallByNum(syscallEventInfo.number(task));
      PrintWriter printWriter = new PrintWriter(System.out);
      syscall.printReturn(printWriter, task, syscallEventInfo);
      printWriter.flush();
      return Action.CONTINUE;
    }
  }

I deleted addedTo, addFailed, and deletedFrom for the sake of brevity. When we add this observer to a given process, all system calls that happen inside that process (or rather, the main task of that process) are captured here. The system call observer implements the TaskObserver.Syscall interface. The really interesting functions here are:

  • updateSyscallEnter – called when a task is entering a system call.
  • updateSyscallExit – called when a task is exiting a system call.

The actual implementations of these two functions in ftrace are very similar. Each can be summed up as: get the system call, interrogate the system call, and print the results to the console.
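Stripped of the Frysk types, the enter/exit pairing looks roughly like the sketch below. SketchSyscallPrinter and its method signatures are hypothetical, and the "name(...) = value" output format is my own assumption for illustration rather than what Frysk’s printCall and printReturn actually emit.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical observer; a stand-in, not the Frysk SyscallObserver.
class SketchSyscallPrinter {
    private final PrintWriter out;

    SketchSyscallPrinter(PrintWriter out) {
        this.out = out;
    }

    // On entry: print who is making the call and which call it is.
    void updateSyscallEnter(int pid, int tid, String name) {
        out.print(pid + "." + tid + " " + name + "(...)");
        out.flush();
    }

    // On exit: finish the same line with the return value.
    void updateSyscallExit(long returnValue) {
        out.println(" = " + returnValue);
        out.flush();
    }
}

class SyscallPrinterDemo {
    // Drives one enter/exit pair and returns what was printed.
    static String run() {
        StringWriter buffer = new StringWriter();
        SketchSyscallPrinter printer =
            new SketchSyscallPrinter(new PrintWriter(buffer));
        printer.updateSyscallEnter(4242, 4242, "write");
        printer.updateSyscallExit(12);
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.print(run());
    }
}
```

Flushing after each half matters here: the entry is printed before the call completes, so without the flush a long-running call might not show up on the console until it returns.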

Let’s take a look at the first function: updateSyscallEnter. The first thing we do is get the System Call info:

 syscallEventInfo = task.getSyscallEventInfo ();

and store it in a special class (SyscallEventInfo) the Frysk Core has defined for holding such information.

The task-based API (task.getSyscallEventInfo()) populates the syscallEventInfo instance. Remember that we get the task the system call occurs in via the observer. Also, this is the entry to the system call. Now that we have the system call data from the task, we can look at the next line of code:

frysk.proc.Syscall syscall = frysk.proc.Syscall.syscallByNum(syscallEventInfo.number(task));

This function finds the actual system call. The syscall variable is declared as another special-use class in Frysk Core (frysk.proc.Syscall), and it holds static system call information. Because syscallByNum is a static method, we call the lookup directly on the class with:

 frysk.proc.Syscall.syscallByNum(syscallEventInfo.number(task)); 

And this extracts the system call in question. The next three lines format the system call for console output:


      PrintWriter printWriter = new PrintWriter(System.out);
      printWriter.print(task.getProc().getPid() + "." + task.getTid() + " ");
      syscall.printCall(printWriter, task, syscallEventInfo);

and syscall.printCall does the actual hard work of formatting the system call output.
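The lookup-by-number idea is easy to picture with a small table. The sketch below is an assumption-laden stand-in, not frysk.proc.Syscall: SketchSyscallTable, nameByNum and formatCall are hypothetical names, and the three entries are the x86-64 Linux numbers for read, write and exit.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical static lookup table; not the Frysk Syscall class.
class SketchSyscallTable {
    private static final Map<Integer, String> NAMES = new HashMap<>();

    static {
        // A few x86-64 Linux system call numbers.
        NAMES.put(0, "read");
        NAMES.put(1, "write");
        NAMES.put(60, "exit");
    }

    // Static lookup: no instance needed, just ask the class.
    static String nameByNum(int number) {
        String name = NAMES.get(number);
        return name != null ? name : "syscall_" + number;
    }

    // Prefix the call with "pid.tid", as the snippet above does.
    static String formatCall(int pid, int tid, int number) {
        return pid + "." + tid + " " + nameByNum(number);
    }
}

class SyscallTableDemo {
    public static void main(String[] args) {
        System.out.println(SketchSyscallTable.formatCall(4242, 4242, 1));
        System.out.println(SketchSyscallTable.nameByNum(9999));
    }
}
```

Falling back to a generated name for unknown numbers is a choice I made for the sketch. It keeps the tracer printing something useful even when the table is incomplete.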

If we look at the updateSyscallExit function implementation, it is almost exactly the same as updateSyscallEnter. The only difference is that we are capturing the return of that system call, not the entry.

And so that was a brief look at a very short, yet useful, Frysk utility. There is only so much we can cover in a blog post, and I’ve not gone into a huge amount of detail. But what I wanted to highlight was the common architecture, and how it allows us to abstract away the details using Frysk Core’s rich API, leaving us free to worry about implementing our utility. What ftrace does to capture system calls, the UI will do in pretty much the same way, as will any other utility in the future.

Many thanks to Sami Wagiaalla, who wrote ftrace and large amounts of the system call observer, for discussing this article with me. And to Andrew Cagney for pointing out the common architecture advantages.

Happy Hacking!

Copyright © Phil Muldoon

Built on Notes Blog Core
Powered by WordPress