CSE403 Software Engineering, Autumn 1999
Lecture #21 Notes
The requirements document might specify certain performance goals
Usually some qualitative or quantitative metric based on time or space
Does the system run in a given amount of time on a given amount of hardware?
Competing software companies use unique feature sets to differentiate themselves, but they also use various performance benchmarks to help sell their products
The most common performance metric is the time needed to complete a given task. But even then, time can be measured in different ways:
- Wall clock time. What was the overall time needed to run a particular program or solve a problem?
- CPU usage. How much user and/or kernel CPU time was spent running the program?
- Average response. For example, in a database application, what is the average response time for a query?
- Longest response. Again for a database application, what is the longest response time for a query?
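The distinction between wall-clock time and CPU time can be seen directly in code. A minimal sketch in Python (the helper name `measure` is illustrative, not a standard API):

```python
import time

def measure(fn, *args):
    """Run fn and report both wall-clock and CPU time.

    Wall-clock time includes time spent blocked (I/O, sleeping);
    process time counts only CPU time charged to this process.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    result = fn(*args)
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return result, wall, cpu

# A task that sleeps shows the difference: wall time grows,
# but CPU time stays near zero while the process is blocked.
_, wall, cpu = measure(time.sleep, 0.1)
```

For a CPU-bound task the two numbers converge; for an I/O-bound or blocked task they diverge, which is exactly why a benchmark should report the one it cares about.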
Time is not the only performance metric. Other metrics include:
- Memory usage or utilization. What is the application's working set size?
- Disk space utilization. How much disk space does the application need to use for its program and data?
- I/O usage. How much I/O is the application doing and is it distributed over various devices?
- Network throughput. How many packets are being sent between machines?
- Client capacity. What is the maximum number of clients or users that an application can support?
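Memory usage of a particular code path can also be sampled from inside the program. A sketch using Python's standard-library tracemalloc module (the helper name `peak_allocation` is illustrative):

```python
import tracemalloc

def peak_allocation(fn, *args):
    """Return fn's result and the peak bytes allocated while it ran."""
    tracemalloc.start()
    try:
        result = fn(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak

# Building a million-integer list produces a clearly measurable peak.
data, peak = peak_allocation(lambda: list(range(1_000_000)))
```

This measures heap allocation rather than the OS-level working set, but the idea is the same: attach the metric to the workload you actually care about.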
Commercial benchmark suites, often used by magazines, compare competing systems
- Ziff-Davis Benchmark Operation (ZDBOp) is a large developer of benchmark software, and also runs tests
- Benchmarks try to simulate actual loads, not always successfully
- With applications such as word processors, spreadsheets, and databases, they typically run scripts and measure the performance
- For network applications they set up entire labs of servers and clients on a private network.
- Software developers often focus on these benchmarks, sometimes to the detriment of the real product.
- Major software vendors are even invited to help ZDBOp tune the system to run the benchmark properly and fix bugs
- There is a funny story about Windows CopyFile performance and benchmarks
The academic and research community also has its own genre of benchmarks. For example, some old-timers favored solving the Towers of Hanoi problem, the Eight Queens chess problem, or compiling and running a set of Unix sources as a measure of performance.
Side note: many benchmarks miss the fact that performance can degrade over an extended period of time. Some try to account for degradation, but not always successfully. A good example is disk fragmentation, where we can allocate sectors really fast if we ignore long-term fragmentation issues. This is sort of like “pay me now or pay me later.”
In hardware there is CPU speed, memory size, network speed, cache size, disk size and speed, etc.
In software it is the basic algorithms, data layout, and the various things we can do in software to exploit the hardware. Think of the software goal as helping the hardware run more efficiently.
Buying faster and bigger machines should improve performance but this is not always practical
From a software viewpoint we always need to examine the algorithmic complexity of the system. Some examples are:
- When sorting data it is important to pick an algorithm appropriate for the data items and keys (sometimes insertion sort is the right choice, e.g., for small or nearly sorted inputs)
- Avoid redoing complicated calculations multiple times. This is where you weigh the cost of storing an answer versus redoing the calculation
- Avoid unnecessary overhead such as maintaining extra pointers in linked data structures. Note that this item can go against making a product extensible, maintainable, and easily debug-able.
- Are there other examples?
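The "avoid redoing complicated calculations" point is usually implemented by caching (memoizing) results. A sketch using Python's functools.lru_cache, with naive Fibonacci as the stand-in for an expensive calculation:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    """Naive recursion repeats the same subproblems exponentially
    many times; with the cache each n is computed exactly once,
    so the cost drops to linear in n."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)
```

Here the stored answer is a single integer, so caching is an easy win; for large intermediate results the trade-off in the bullet above (memory cost of storing the answer versus CPU cost of recomputing it) is the real decision.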
Picking the right implementation language and constructs within the language also has an impact on performance.
But beyond algorithmic complexity, the single most important thing you can do to increase performance is to properly lay out your data structures. Grouping commonly accessed data items close together typically increases both hardware and software cache efficiency (i.e., a smaller working set). Note, however, that this might go against the logical design of the system. Also, on an MP system we have to worry about cache-line tearing.
Other software things to consider include:
- Function call overhead
- Process and thread switching overhead
- Kernel call overhead
- Disk and other I/O bottlenecks
- Lock contention, especially on MP systems
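Lock contention in particular can often be reduced by restructuring the code rather than speeding up the lock. A sketch with Python's threading module (the function names are illustrative; the GIL limits the real speedup here, but the pattern is the same one used in C/C++ servers):

```python
import threading

total = 0
lock = threading.Lock()

def add_contended(n):
    """Take the shared lock on every increment - heavy contention
    when several threads run this at once."""
    global total
    for _ in range(n):
        with lock:
            total += 1

def add_batched(n):
    """Accumulate privately, then take the lock once to merge.
    Same result, but only one lock acquisition per thread."""
    global total
    local = 0
    for _ in range(n):
        local += 1
    with lock:
        total += local

threads = [threading.Thread(target=add_batched, args=(10_000,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# total is now 40_000, reached with only four lock acquisitions.
```

On an MP system the contended version also bounces the lock's cache line between processors on every increment, so batching helps twice: fewer acquisitions and less cache traffic.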
Percent of CPU usage and idle time is sometimes used as a measure of system efficiency. This metric pretty much ignores raw speed and is more concerned with why the processor is idling. Resolving page faults and lock contention are two of the big bugaboos here.