Cycle Counts and Instruction Counts

CGProf provides two types of profiles: the cycle count profile and the instruction count profile. We suggest using the cycle count profile in most situations, because of the additional information it provides. However, there are important tradeoffs between the two types of profiles.

The advantage of instruction counts is that they are repeatable and are not influenced by the overhead of profiling or by other activity on the system. Their main limitation is that the time required for a single instruction varies widely, depending on the instruction and on memory system effects. Two procedures that execute the same number of instructions may therefore take very different amounts of time, which limits the conclusions that can be drawn from an instruction count profile and can make it difficult to interpret. Another limitation of the instruction count report is that it does not provide information for DLL calls into uninstrumented modules, since the instruction counts are based on basic block instrumentation points rather than a hardware event counter.
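
The point about equal instruction counts can be illustrated with a small cost model. The sketch below is not part of CGProf; the operation latencies, the cache miss penalty, and the two procedures (proc_a and proc_b) are made-up assumptions chosen only to show how two traces of the same length can differ widely in estimated cycles.

```python
# Illustrative cost model. All latencies below are made-up assumptions,
# not measurements from CGProf or from any particular processor.
CYCLES_PER_OP = {"add": 1, "mul": 3, "div": 20, "load": 1}
CACHE_MISS_PENALTY = 100  # extra cycles charged to a load that misses the cache


def instruction_count(trace):
    """Number of instructions executed, as an instruction count profile sees it."""
    return len(trace)


def cycle_count(trace):
    """Estimated cycles: base latency plus a penalty for each load that misses."""
    total = 0
    for op, missed_cache in trace:
        total += CYCLES_PER_OP[op]
        if op == "load" and missed_cache:
            total += CACHE_MISS_PENALTY
    return total


# Two hypothetical procedures, four instructions each.
# Each entry is (operation, whether the access missed the cache).
proc_a = [("load", False), ("add", False), ("add", False), ("add", False)]
proc_b = [("load", True), ("div", False), ("mul", False), ("add", False)]

for name, trace in (("proc_a", proc_a), ("proc_b", proc_b)):
    print(name, "instructions:", instruction_count(trace),
          "estimated cycles:", cycle_count(trace))
```

Under these assumed costs, both procedures report 4 instructions, but proc_a is estimated at 4 cycles and proc_b at 125. An instruction count profile would rank them as equal.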

Profiling with cycle counts has the advantage that it is not subject to the interpretation problems that arise when profiling with instruction counts. Cycle count profiles collected with CGProf incorporate instruction timings, memory system effects such as cache misses, and time spent in system DLLs, giving a complete and accurate view of performance. Cycle count profiles include all time on the system, including idle time and time spent waiting for I/O or network packets (see Limitations). If your application spends a substantial amount of time in the idle loop, you will probably get more useful information from an instruction count profile. (Note: Cycle count profiles can be generated only for processors that support the RDTSC instruction. This includes the Intel Pentium and all later Intel processors.)
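
To see why idle time matters, consider how a large idle component dilutes every other entry in a cycle count profile. The sketch below uses hypothetical cycle totals and procedure names ("idle", "parse", "render"); it does not read CGProf output, and the percentages helper is an assumption introduced only for this illustration.

```python
def percentages(cycles_by_procedure, exclude=()):
    """Share of total cycles per procedure, optionally ignoring some entries."""
    kept = {name: cycles for name, cycles in cycles_by_procedure.items()
            if name not in exclude}
    total = sum(kept.values())
    return {name: 100.0 * cycles / total for name, cycles in kept.items()}


# Hypothetical cycle totals; "idle" stands in for the system idle loop.
profile = {"idle": 9_000_000, "parse": 600_000, "render": 400_000}

with_idle = percentages(profile)
without_idle = percentages(profile, exclude=("idle",))

for name in ("parse", "render"):
    print(f"{name}: {with_idle[name]:.1f}% of all cycles, "
          f"{without_idle[name]:.1f}% of non-idle cycles")
```

With these made-up numbers, the two application procedures account for only 6% and 4% of all cycles, even though they split the non-idle work 60/40. An instruction count profile is not affected by cycles spent waiting, which is why it can be easier to read for an application that is mostly idle.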