In class, we talked about the idea of the binary search algorithm, but did not end up implementing it. This reading will look at the implementation for the algorithm since it’s not uncommon to be asked to implement the code for this algorithm on an interview for a job involving programming. The end of the reading is a reference of the complexities of the operations we have learned so far

Review: Algorithm Idea

In class, we discussed how to search a sorted array for a value. The naive algorithm that just starts at the beginning and traverses to the end or until it finds the element. With the language we discussed in class, we would say that worst case this “linear search” would take O(n) time where n is the number of elements in the array.

We then discussed a better idea which divides the array in half repeatedly. Start by looking at the middle element of the array and compare it to the target value, there are three options

search(20)

[2, 7, 9, 9, 13, 14, 16, 20, 27, 36, 51]
|--------------|     |-----------------|
 less than mid    ^    greater than mid
                 mid
  • The middle element is the same as the target: you can return the index you found
  • The middle element is larger than the target: you can eliminate the right half of the array because the array is sorted and all elements to the right of the middle will be larger than the target
  • The middle element is smaller than the target: you can eliminate the left half of the array because the array is sorted and all elements to the left of the middle will be smaller than the target

In the latter two cases, we now only have half the array to search and can repeat the same process using the start and endpoint of the remaining half as the start and end of the range to search. We repeat this process until we find the target value at the mid-point or once the search range has no elements left.

Implementation

public static int binarySearch(int[] arr, int target) {
    // Start looking at whole array
    int low = 0;
    int high = arr.length - 1;

    // while low and high haven't met yet, there are still more indices to search
    while (low <= high) {
        // compute middle index and value
        int mid = low + (high - low) / 2;
        int midValue = arr[mid];

        if (midValue < target) {
            low = mid + 1;
        } else if (midValue > target) {
            high = mid - 1;
        } else {
            return mid; // value found!
        }
    }
    return -1; // value not found
}

Java’s implementation of binarySearch has the added benefit of not returning -1 if the value is not found, but rather indicates where the element “should go” if inserted. To do this it returns the negative of the index that the value should be inserted at.

return -(low + 1);

Trace of the algorithm running

Consider the example above

search(20)

values  [2, 7, 9, 9, 13, 14, 16, 20, 27, 36, 51]
indices  0, 1, 2, 3,  4,  5,  6,  7,  8,  9, 10

The following is a trace of all the decisions made by the algorithm

--------- Iteration 1 ---------
low = 0, high = 10
Is low <= high? Yes
    mid = 5
    midValue = 14

    Is midValue < target? Yes
        low = 6

--------- Iteration 2 ---------
low = 6, high = 10
Is low <= high? Yes
    mid = 8
    midValue = 27

    Is midValue < target? No
    Is midValue > target? Yes
        high = 7

--------- Iteration 3 ---------
low = 6, high = 7
Is low <= high? Yes
    mid = 6
    midValue = 16

    Is midValue < target? Yes
        low = 7

--------- Iteration 4 ---------
low = 7, high = 7
Is low <= high? Yes
    mid = 7
    midValue = 20

    Is midValue < target> No
    Is midValue > target? No
    Else, return 7

Which gets us the value we wanted. Another way to see this is to visualize the ranges being considered in the visualization of the same trace below

Complexity

Unlike the naive solution that searches every index, the binary search algorithm is much faster because it eliminates half the array on each iteration. As mentioned in class, this makes the complexity of the algorithm O(log(n)).

Data Structure Complexity

This section reviews the discussion of the complexities of the operations on ArrayList and LinkedList. You generally shouldn’t memorize this table, but rather know how the methods are implemented to figure out what the run-time is. For example for LinkedList.remove(index), we know we first have to traverse to the right index (worst case O(n)) and then change the references at that node to remove the value (worst case O(1)) for a total run-time of O(n).

Where the run-time complexities for the methods we implemented differs from Java’s for the table, we show the complexity of our implement / complexity for Java's

Operation ArrayList LinkedList
add(value) O(1)* O(n) / O(1)
add(index, value) O(n) O(n)
get(index) O(1) O(n)
indexOf(value) O(n) O(n)
contains(value) O(n) O(n)
remove(index) O(n) O(n)

* Most of the time!

Looking at this table, it seems like LinkedList is always the worse of the two structures, so why do we ever use LinkedList? In certain cases, like being treated as a Queue, it is much faster. For operations a Queue needs, we can see the LinkedList is superior (note these are just the built in Java run-times).

Operation ArrayList LinkedList
remove from front O(n) O(1)
add at end O(1)* O(1)