CSE 331 / Section 6

Contents:

Performance
Module Dependence Diagrams
Design Patterns
Exercises

Note: There is more content in these section notes than your TA has time to cover during section. These notes are intended to supplement the other course materials such as lecture notes and textbook readings.

Performance

Many novice programmers (or even moderately-good programmers who learned to program in the 1970's) are somehow obsessed with performance and subconsciously aim to write code that may be a bit more convoluted but is faster ... or at least they think it runs faster. Almost 100% of the time, though, the tweaks that novices make result in no visible performance gains, for these reasons (and more):

They don't realize that the compiler can already make fairly sophisticated optimizations from normal, clean-looking source code.
They don't actually know what parts of their program are the slowest; it makes no sense to optimize a part that isn't taking up much time.
They don't realize that small statement-level tweaks rarely result in visible performance improvements on modern computers.

Correctness and code clarity are much more important

The main focus of CSE 331 (and of most real-world software engineering) is on building reliable, maintainable, robust software systems, not squeezing out every ounce of performance possible. Correctness comes first. Remember, computers double in speed every 18 months, but humans don't ever double in speed. Programmer time is much more valuable than machine time. If you have to make a piece of code more complicated and thus more likely to be buggy just to (hopefully) get some performance gains, then it's not worth it. A general rule of thumb is to never make a design decision that makes your code more complicated for the sole purpose of improving run-time performance.

Only optimize after profiling

The only time you should think about optimizations is after you've already built up a working system along with a test suite. Then you can profile your system to see where it's spending most of its time and direct your optimization efforts to those areas. You will have the advantage of both having an already-working version and also a test suite to serve as your safety net.

It makes no sense to start optimizing a program without first profiling it to see where it's spending significant portions of time. You probably don't know what parts of your code are the slowest, because you are not a computer.

For example, if you are convinced that a certain method is really inefficient and spend many hours trying to optimize it, but it only accounts for 2% of your system's run time, then at most your system will speed up by 2%, a trivial amount, if you somehow magically get that method's runtime down to 0. The lesson is that, if something doesn't take up a significant fraction of time, there is probably no point in optimizing it.

Design for good asymptotic performance when it truly matters

The most noticeable performance slowdowns are ones where an algorithm has too large of an order of growth and the input is sufficiently large. For example, if you try to implement a mapping of keys to values using a pair of Lists, then it takes linear time to look up a key-value pair. For small lists, there will be no noticeable slowdown. However, when there are 1,000,000 key-value pairs in your map, then your program will slow to a crawl. A better choice of data structures would be a HashMap, which uses hashing to achieve constant-time lookup.

The lesson here is to design data structures and algorithms with good asymptotic performance whenever you expect to have sufficiently large inputs. This is a much better use of your time than tweaking lines of code or making other local optimizations, because your data structures and core algorithms will likely be used throughout many parts of your system, so if they are slow, then your system will slow down significantly. Imagine if iterating through a List took exponential time; then basically every Java program in existence would never run in a reasonable amount of time since list traversals occur almost everywhere.

Module Dependence Diagrams

At a high level, a module A is said to depend on module B if a change to the specification of B might cause a change in A.

MDDs:

Nodes represent modules or abstractions; nodes are labelled with the name of the abstraction.
Directed edge from A to B indicates A depends on B.
Weak dependencies are represented by a dotted-line edge between the modules.

A weak dependency arises when a module mentions another by name but doesn't actually make any assumptions about it (doesn't depend on the specification). An example is a class that has a method that takes an argument of another class, and just passes the object of that class to a method of a third class.

Exercise: Have students draw MDD for their PS2 system and talk about it. The most likely answer is probably:

  Route --> GeoFeature --> GeoSegment --> GeoPoint
    |                          ^
    \                          /
     --------------------------

Question: In general, why is it useful to have fewer edges between the modules of a software system?

Answer: There are many reasons, including robustness, clearer abstraction boundaries, etc.

Design Patterns

Design patterns are discussed at greater length in the book by the same title (Gamma, Helm, Johnson, Vlissides). A design pattern is:

a standard solution to a common programming problem
a high-level programming idiom
a shorthand for describing aspects of program organization

It is important to remember that design patterns should only be introduced into your design or implementation after you have noticed a problem. After you have determined that there is a problem, and investigated the source of the problem, looking at a design patterns reference will often help you to solve your problem without having to reinvent the wheel.

Here are examples of several design patterns, but is by no means an exhaustive list (your TA will probably not be able to cover all of these in section):

Flyweight

The Flyweight pattern uses sharing to support a large number of fine-grained objects efficiently. This is done by encapsulating the immutable and context-independent pieces of state in one class (called the flyweight). Since the state in this class is context-independent, the same instance can be shared in many places. Then only the context-dependent or mutable data needs to be stored separately for each real instance of the object. The part of the state stored within the flyweight is called intrinsic state. Extrinsic state varies with the flyweight's context, so it cannot be shared. The extrinsic state must be passed into the flyweight by the client when it is needed.

The Flyweight pattern should be used when:

a large number of objects are used
most object state can be made intrinsic
many groups of objects can be replaced by few shared objects once extrinsic state is removed
the application doesn't depend on object identity

One use of a flyweight pattern is to efficiently store character bitmaps in a word processor application. Without the use of a flyweight, a Character class will likely contain fields holding the location of the character on the screen, the kind of character (e.g., 'A' or 'k'), and a bitmap representation of the character. For a decent-sized document with 100,000 characters, 100,000 instances of the Character class would need to be instantiated, each holding a bitmap representation. However, there are only a small number of unique characters (i.g., A-Z, a-z, 0-9, special symbols, space), so each instance of Character could share this intrinsic state, which consists of the bitmap of the character in memory. Thus, each Character instance only needs to keep extrinsic state, which is its location on the screen, and can share intrinsic state with all other Character instances which represent the same character on-screen.

Observer

The Observer pattern was covered in lecture. It allows one or more objects (the observers) to watch another (the subject). The Observer pattern allows the subject and observer to form a publish-subscribe relationship. Through the Observer pattern, observers can register to receive events from the subject. When the subject needs to inform its observers of an event, it simply sends the event to each observer. The MDD of the Observer pattern is show below:

Question: Besides the example for Stocks given in lecture, what's another scenario where the Observer pattern makes sense?

One possible Answer: Consider a spreadsheet system that includes embedded graphs that view the data contained in the cells. In this system, as a user changes the data in the cells, the graph should be notified of the change and update its diagram appropriately. The subject is the data model and the observers are the embedded graphs.

Question: What is the benefit of using the Observer pattern?

Answer: It decouples the observer from the subject. The subject doesn't need to know anything special about its observers. Instead, the subject simply allows observers to subscribe. When the subject generates an event, it simply passes it to each of its observers.

Example interfaces for the Observer pattern follow:

public interface Subject {
      public void addObserver( Observer o );
      public void removeObserver( Observer o );
}

public interface Observer {
      public void update( Subject o );
}

In the code above, the Subject interface defines the methods that a Subject must implement in order for Observers to add and remove themselves from the Subject. The Observer interface (above) lists the methods that an Observer must implement so that a Subject can send an update notification to the Observer.

Examples of the Observer pattern are found in the Java event model and in the JMS (Java Messaging Service) framework. The Observer pattern is sometimes called "Pub-Sub" in industry.

Listeners also use this pattern.

Factory Method

A factory method is a method that manufactures objects of a particular type. The intent of this pattern is to define an interface for creating an object which lets subclasses decide which classes to instantiate.

Consider a document-processing application. We might want to define abstract classes Document and Application to represent documents and the applications that handle the documents. There may be some functionality common to all types of applications, except that they rely on the creation of new documents of the appropriate type. For instance, consider a method newDocument, which creates a new document, adds it to the application's list of documents, and opens the document. We could write this as:

public void newDocument() {
   Document doc = createDocument();
   docs.add(doc);
   doc.open();
}

where createDocument is a factory method that the concrete subclasses of Application must implement, that returns a new document of the correct type. Let's say we wanted to implement an application to create and edit resumes, called ResumeWizard, which handles documents of type Resume.

In Application, createDocument would be declared abstract as:

public Document createDocument();

and in ResumeWizard, it could be implemented as:

public Document createDocument() {
   return new Resume();
}

Using this factory method allows us to put the reusable code of newDocument in the superclass, even though it needs to create new objects whose types depend on the subclass.

In general, the Factory Method pattern can be used when a class can't anticipate what class it needs to create or when a class wants its subclasses to specify the types of objects to create.

In GUI programming, the Factory pattern can be very useful to easily change the "look-and-feel".

Adapter

The purpose of the Adapter pattern is to convert the interface of a class into another interface that clients expect. This can be necessary if you want to reuse code, but that code does not satisfy a necessary interface.

Example: Suppose you have written code that uses objects that satisfy the specification for Rectangle:

interface Rectangle {
  /** @effects grow or shrink this by the given factor **/
  void scale(float factor);

  // other operations
  float area();
  float getWidth();
  float getHeight();
  ...
}

Now, suppose that you have an already implemented class NonScaleableRectangle, which does not have the scale method, but has setHeight and setWidth methods, and has all of the other methods in Rectangle. If we want to use NonScaleableRectangle to satisfy the Rectangle interface, we can write an adapter. We can do this using either subclassing or composition/delegation. The solution using subclassing would look like:

class SubclassingScaleableRectangle extends NonScaleableRectangle 
    implements Rectangle {
  void scale(float factor) {
    setWidth(factor*getWidth());
    setHeight(factor*getHeight());
  }
}

A solution using delegation would look like:

class DelegatingScaleableRectangle implements Rectangle {
  NonScaleableRectangle r;

  DelegatingScaleableRectangle(width w, height h) {
    r = new NonScaleableRectangle(w,h);
  }

  void scale(float factor) {
    r.setWidth(factor*getWidth());
    r.setHeight(factor*getHeight());
  }

  float area() {
    return r.area();
  }

  float getHeight() {
    return r.getHeight();
  }

  ...
}

Question: What are the advantages and disadvantages of using delegation versus subclassing?

Answer: One feature of using subclassing is that all of the methods of the superclass will be inherited. So if the superclass contains additional methods beyond the spec that you need to meet, and these methods are useful, you get these methods for free. However, in some cases, you might not want the ADT to have these methods, which means that you would have to override the methods with exceptions in the subclass. In such a case, delegation is a cleaner solution.

Note: it's easy to break things if the subclass is not a true subtype. In that case, any of the non-overridden methods in the superclass that make calls into the subclass may have unexpected results.

To summarize, the Adapter pattern is useful when you want to use an existing class, but its interface does not match the interface that you need.

Model-View-Controller

This is relevant given that they will be constructing a GUI soon and will want to separate the GUI aspect from the controller of their backend. In particular, things like input and output should be abstracted so that GUI/commandline switches can be made. This can be shown in a MDD.

Exercises

Problem 1: Student-Tracking Database

You have been enlisted to help the CSE 331 TAs design a student-tracking database. Unfortunately, the TAs don't practice what they preach, and they've written no specs, only giving you a skeletal outline. You're just going to have to make educated guesses about how these classes should work and what these methods do.

class Roster {
    Roster();
    void addSection(Section s);
    Section lookupSection(int sectionNumber);
    Student lookupStudent(String username);
}

class Section {
    Section(int sectionNumber);
    int getSectionNumber();
    void addStudent(Student s);
    Student lookupStudent(String username);
}

class Student {
    Student(String name);
    String getName();
}

Not trusting the TAs, you decide to analyze the quality of this design by constructing a module dependency diagram.

Construct a minimal MDD involving the 4 classes referenced above: Roster, Section, Student, and String. Ignore all self-dependencies and dependencies based on inheritance. Think about how the methods would be implemented and consider only those dependencies which are necessary.

Problem 2: GUI Factories

Often using the Factory pattern can simplify GUI programming. Consider the following code that creates several buttons and places them on a screen.

JPanel panel = new JPanel();
JButton button1 = new JButton("OK");
JButton button2 = new JButton("Cancel");

button1.setBorderPainted(true);
button1.setBackground(Color.blue);
button1.setBorder(BorderFactory.createLineBorder(Color.black));
button1.setFont(new Font("Serif", Font.PLAIN, 14));

button2.setBorderPainted(true);
button2.setBackground(Color.blue);
button2.setBorder(BorderFactory.createLineBorder(Color.black));
button2.setFont(new Font("Serif", Font.PLAIN, 14));

panel.add(button1);
panel.add(button2);

Suggest how a Factory pattern can be used to make this UI easier to modify.

ANSWER:

Imagine that you need to change the format of both buttons. You'd need to change the code in multiple places (BAD!). We can create a ButtonFactory instead:

JPanel panel = new JPanel();
JButton button1 = ButtonFactory.createButton("OK");
JButton button2 = ButtonFactory.createButton("Cancel");
panel.add(button1);
panel.add(button2);

The code for the ButtonFactory method is given below:

public static JButton createButton(String label) {
  JButton button = new JButton(label);
  button.setBorderPainted(true);
  button.setBackground(Color.blue);
  button.setBorder(BorderFactory.createLineBorder(Color.black));
  button.setFont(new Font("Serif", Font.PLAIN, 14));
}

Problem 3: The UW Card Office

This problem deals with a small system to represent the UW Card Office. The fundamental role of the card office (at least in our system) is to produce new cards for people (if, for instance, a person loses his card). We have provided you with code/specs for a piece of this system, which is designed as follows.

A Card represents the ID card for an UWPerson, which has several attributes, such as the name and ID number of the holder, the color of the card, the label on the card for the type of holder, and the access rights that have been programmed into the card. Access rights are represented by AccessPrivilege objects, which might encapsulate a location and times of day that the person may access that location (although the actual spec of this class has not been included). The PersonnelOffice contains static methods to retrieve information about people at UW, such as their addresses and what buildings they may access. The CardOffice class represents the card office, which can create new cards for people. It currently can handle card creation for students and faculty members. The FactoryCardOffice is another class to represent the card office, but which uses uses CardFactory objects to create new cards for people. There are two types of CardFactorys in the current system — StudentCardFactory, which creates cards for students; and FacultyCardFactory, which creates cards for faculty members.

Draw an MDD for each of the card office implementations. What are the advantages of using the Factory pattern in this system?

We want to extend the system to generate cards for administrative staff, whose cards are grey and are labelled "Staff". Administrative staff has access to only the building where they work, which can be obtained from the personnel office. Extend the system that uses the Factory pattern to handle card creation for administrative staff. Discuss how you would modify the system that does not use the Factory pattern, and how the modifications to the two systems compare.

Section 6: Performance, MDDs, and Design Patterns