dtruss_example.txt   [plain text]


The following demonstrates the dtruss command - a DTrace version of truss.
This version is designed to be less intrusive and safer than running truss.

dtruss has many options. Here is the help for version 0.70,
 
   USAGE: dtruss [-acdefholL] [-t syscall] { -p PID | -n name | command }

             -p PID          # examine this PID
             -n name         # examine this process name
             -t syscall      # examine this syscall only
             -a              # print all details
             -c              # print syscall counts
             -d              # print relative times (us)
             -e              # print elapsed times (us)
             -f              # follow children
             -l              # force printing pid/lwpid
             -o              # print on cpu times
             -L              # don't print pid/lwpid
             -b bufsize      # dynamic variable buf size
      eg,
          dtruss df -h       # run and examine "df -h"
          dtruss -p 1871     # examine PID 1871
          dtruss -n tar      # examine all processes called "tar"
          dtruss -f test.sh  # run test.sh and follow children



For example, here we dtruss any process with the name "ksh" - the Korn shell,
 
 # dtruss -n ksh
 PID/LWP   SYSCALL(args)                  = return
 27547/1:  llseek(0x3F, 0xE4E, 0x0)               = 3662 0
 27547/1:  read(0x3F, "\0", 0x400)                = 0 0
 27547/1:  llseek(0x3F, 0x0, 0x0)                 = 3662 0
 27547/1:  write(0x3F, "ls -l\n\0", 0x8)          = 8 0
 27547/1:  fdsync(0x3F, 0x10, 0xFEC1D444)                 = 0 0
 27547/1:  lwp_sigmask(0x3, 0x20000, 0x0)                 = 0xFFBFFEFF 0
 27547/1:  stat64("/usr/bin/ls\0", 0x8047A00, 0xFEC1D444)                 = 0 0
 27547/1:  lwp_sigmask(0x3, 0x0, 0x0)             = 0xFFBFFEFF 0
 [...]

The output for each system call does not yet evaluate as much as truss does.



In the following example, syscall elapsed and overhead times are measured.
Elapsed times represent the time from syscall start to finish; overhead
times measure the time spent on the CPU,

 # dtruss -eon bash
 PID/LWP    ELAPSD    CPU SYSCALL(args)           = return
  3911/1:       41     26 write(0x2, "l\0", 0x1)          = 1 0
  3911/1:  1001579     43 read(0x0, "s\0", 0x1)           = 1 0
  3911/1:       38     26 write(0x2, "s\0", 0x1)          = 1 0
  3911/1:  1019129     43 read(0x0, " \001\0", 0x1)               = 1 0
  3911/1:       38     26 write(0x2, " \0", 0x1)          = 1 0
  3911/1:   998533     43 read(0x0, "-\0", 0x1)           = 1 0
  3911/1:       38     26 write(0x2, "-\001\0", 0x1)              = 1 0
  3911/1:  1094323     42 read(0x0, "l\0", 0x1)           = 1 0
  3911/1:       39     27 write(0x2, "l\001\0", 0x1)              = 1 0
  3911/1:  1210496     44 read(0x0, "\r\0", 0x1)          = 1 0
  3911/1:       40     28 write(0x2, "\n\001\0", 0x1)             = 1 0
  3911/1:        9      1 lwp_sigmask(0x3, 0x2, 0x0)              = 0xFFBFFEFF 0
  3911/1:       70     63 ioctl(0x0, 0x540F, 0x80F6D00)           = 0 0

A bash command was in another window, where the "ls -l" command was being
typed. The keystrokes can be seen above, along with the long elapsed times
(keystroke delays), and short overhead times (as the bash process blocks
on the read and leaves the CPU). 



Now dtruss is put to the test. Here we truss a test program that runs several
hundred smaller programs, which in turn generate thousands of system calls.

First, as a "control" we run the program without a truss or dtruss running,

 # time ./test
 real    0m38.508s
 user    0m5.299s
 sys     0m25.668s

Now we try truss,

 # time truss ./test 2> /dev/null
 real    0m41.281s
 user    0m0.558s
 sys     0m1.351s

Now we try dtruss,

 # time dtruss ./test 2> /dev/null
 real    0m46.226s
 user    0m6.771s
 sys     0m31.703s

In the above test, truss slowed the program from 38 seconds to 41. dtruss
slowed the program from 38 seconds to 46, slightly slower that truss...

Now we try follow mode "-f". The test program does run several hundred
smaller programs, so now there are plenty more system calls to track,

 # time truss -f ./test 2> /dev/null
 real    2m28.317s
 user    0m0.893s
 sys     0m3.527s

Now we try dtruss,

 # time dtruss -f ./test 2> /dev/null
 real    0m56.179s
 user    0m10.040s
 sys     0m38.185s

Wow, the difference is huge! truss slows the program from 38 to 148 seconds; 
but dtruss has only slowed the program from 38 to 56 seconds.