Creating Order files, i.e., Scatter Loading the Compiler -------------------------------------------------------- This is a brief description on how to generate the cc1*.order files. The order files are intended to minimize the number of page-ins of the compiler as it is loaded. If there is enough memory this benifits only the first load of the compiler since it will stay resident after that. Unfortunately it's a manual process since one of the tools requires an explicit interrupt from the terminal. You should only need to (re)do the order files if there's any major reorganizations or additions to the compilers. There are five steps involved to genrrate the order files. 1. Select test cases. These should be "average" compilations to exercise each of the cc1* compilers. They should be large enough to take enough time to generate acceptible results. As of this writing the following cases were chosen: For cc1 - gcc/c-decl.c (from the compiler sources) For cc1plus - Finder_FE/AboutWindow/AboutWindow.cp For cc1obj - MailViewer/Compose.subproj/MessageEditor.m For cc1objplus - devkit/cpp.subproj/Cpp-precomp.mm Of these four the cc1objplus test case is not very good. Unfortunately there are few .mm files of any significant size. If a better one can be found it probably should be used. 2. Capture the command lines needed to build the chosen files. For the selected projects built with with PB set PB's building preferences for detailed build logs. That way you can the full command lines you need. In non-PB projects like gcc the command lines are of course echoed on the terminal. 3. Run selected command lines with -### to get the cc1* lines. From the full command lines you need the cc1* lines generated by the driver. The easiest way to get these is to add -### to the full command lines captured in step 2. 4. Prepare to generate the order files If you don't already have it you should build a set of cc1* compilers with -O2 with symbols. The easiest way to do this is FSF-style but using buildit with build_gcc probably will also work. In the gcc objects directory you will of course have the cc1* compilers. You need to substitute these in each of the cc1* command lines captured in step 3. You also need to run these with ~perf/bin/pcsample to generate the order files. Thus, for each cc1* command line it should have the following in the beginning in place of the original cc1* of the step 3 lines. sudo ~perf/bin/pcsample -O -E $gcc3-obj/cc1* ... rest of line... Where $gcc3-obj represents whatever the path is to the gcc3 objects directory and cc1* is one of the cc1* compilers (is -B necessary here?). Note, you need sudo because pcsample will only run as root. Also, if you have a dual processor you need to reboot as in single processor mode. If you don't pcscample will tell you to do that by executing the command, nvram boot-args="cpus=1" 5. Generate the order files Run the lines created in step 4. The order files (cc1*.order) will be left in /tmp in the direectory indicated by the summary that pcsample displays when you hit cmd-c to stop the pcsample execution. Be sure to run pcsample long enough to compile the entire program. At this point you now have the order files created. You place them in the order-files directory at the top level of the gcc source directory. You can also use them to measure he effects of these order files on compiler page-ins. If you do this go to the next step (6). Otherwisw you are ready to go. 6. Creating the compilers with ther order files You will need two versions of the cc1* files; the ones from above and a set linked with their respective order file. From a gcc compiler build extract the command lines that link the cc1* files. Change the -o file to something else, for example, cc1 to cc1X. Then add the following options to the link line. Note, if you build using buildit and build_gcc the lines will already be there referencing the order-files directory. Otherwise you need to add, -sectorder __TEXT __text $order/cc1*.order -e start where $order is the directory containing the order file being used and cc1* is of course a reference to a specific order file. 7. Measuring the performance improvement You need to have two terminbal windows open; T1 and T2. The execute the follwoing commands on the indicated terminals: T2: sudo fs_usage -w > /tmp/fs.out1 (do NOT execute yet) T1: ~perf/bin/flushmem (note this can take a while) T2: fs_usage T1: use cc1* line originally used to build order file T2: ctl-c when cc1* compilation done T2: sudo fs_usage -w > /tmp/fs.out2 (do not execute yet) T1: ~perf/bin/flushmem T2: fs_usage T1: use cc1*X line originally used to build order file T2: ctl-c when cc1XXX compilation done In the first group of commands you use the original cc1* line with the command line used to build the order file. You also run fs_usage at the same time to measure the paging behavior. The second group of commands is similar but you use the cc1* linked with its order file (with the -sectorder stuff mentioned in step 6). For this discussion call this compiler cc1*X. In both cases you need to run ~perf/bin/flushmem to make sure compiler is flused from the cached. That way you are measuring the initial page-in bechavor as thee compiler is loaded. Warning, the flushmem's sometimes take quite a while. At this point you should have /tmp/fs.out1 and /tmp/fs.out2. You need to extract the page-ins times for the compilers in order to sum them up. The easies way to do this is make the data tab delimited for importing into Excel. T1: fgrep cc1* /tmp/fs.out1 | fgrep PAGE_IN | \ tr -s "[:blank:]" '\t' >/tmp/cc1*.pageins-1 T1: fgrep cc1*X /tmp/fs.out2 | fgrep PAGE_IN | \ tr -s "[:blank:]" '\t' >/tmp/cc1*.pageins-2 Load each of these into Excel and sum the pagin times to determine the percent change. Remember that the '*'s in the above illustrations are not really a '*'. It is just a short way to show the general command lines where in reality you explicitly specify cc1, cc1plus, cc1obj, or cc1objplus. Also remember that you are measuring the page-in performance improvement on the test cases used to generate the order files in the first place. Thus you should expect that these would probably show the greatest improvement. That is why it is important to try to choose representative test cases in the first place. You can try the measurements on other tests. But that requires you again extracting the cc1* lines using -###. The above procedure only uses the orignal files since the command lines are already handy.