In class, we talked about the idea of the binary search algorithm, but did not end up implementing it. This reading will look at the implementation for the algorithm since it’s not uncommon to be asked to implement the code for this algorithm on an interview for a job involving programming. The end of the reading is a reference of the complexities of the operations we have learned so far

## Binary Search¶

### Review: Algorithm Idea¶

In class, we discussed how to search a sorted array for a value. The naive algorithm that just starts at the beginning and traverses to the end or until it finds the element. With the language we discussed in class, we would say that worst case this “linear search” would take `O(n)`

time where `n`

is the number of elements in the array.

We then discussed a better idea which divides the array in half repeatedly. Start by looking at the middle element of the array and compare it to the target value, there are three options

```
search(20)
[2, 7, 9, 9, 13, 14, 16, 20, 27, 36, 51]
|--------------| |-----------------|
less than mid ^ greater than mid
mid
```

- The middle element is the same as the target: you can return the index you found
- The middle element is larger than the target: you can eliminate the right half of the array because the array is sorted and all elements to the right of the middle will be larger than the target
- The middle element is smaller than the target: you can eliminate the left half of the array because the array is sorted and all elements to the left of the middle will be smaller than the target

In the latter two cases, we now only have half the array to search and can repeat the same process using the start and endpoint of the remaining half as the start and end of the range to search. We repeat this process until we find the target value at the mid-point or once the search range has no elements left.

### Implementation¶

```
public static int binarySearch(int[] arr, int target) {
// Start looking at whole array
int low = 0;
int high = arr.length - 1;
// while low and high haven't met yet, there are still more indices to search
while (low <= high) {
// compute middle index and value
int mid = low + (high - low) / 2;
int midValue = arr[mid];
if (midValue < target) {
low = mid + 1;
} else if (midValue > target) {
high = mid - 1;
} else {
return mid; // value found!
}
}
return -1; // value not found
}
```

Java’s implementation of `binarySearch`

has the added benefit of not returning `-1`

if the value is not found, but rather indicates where the element “should go” if inserted. To do this it returns the negative of the index that the value should be inserted at.

```
return -(low + 1);
```

#### Trace of the algorithm running¶

Consider the example above

```
search(20)
values [2, 7, 9, 9, 13, 14, 16, 20, 27, 36, 51]
indices 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
```

The following is a trace of all the decisions made by the algorithm

```
--------- Iteration 1 ---------
low = 0, high = 10
Is low <= high? Yes
mid = 5
midValue = 14
Is midValue < target? Yes
low = 6
--------- Iteration 2 ---------
low = 6, high = 10
Is low <= high? Yes
mid = 8
midValue = 27
Is midValue < target? No
Is midValue > target? Yes
high = 7
--------- Iteration 3 ---------
low = 6, high = 7
Is low <= high? Yes
mid = 6
midValue = 16
Is midValue < target? Yes
low = 7
--------- Iteration 4 ---------
low = 7, high = 7
Is low <= high? Yes
mid = 7
midValue = 20
Is midValue < target> No
Is midValue > target? No
Else, return 7
```

Which gets us the value we wanted. Another way to see this is to visualize the ranges being considered in the visualization of the same trace below

### Complexity¶

Unlike the naive solution that searches every index, the binary search algorithm is much faster because it eliminates half the array on each iteration. As mentioned in class, this makes the complexity of the algorithm `O(log(n))`

.

## Data Structure Complexity¶

This section reviews the discussion of the complexities of the operations on `ArrayList`

and `LinkedList`

. You generally shouldn’t memorize this table, but rather know how the methods are implemented to figure out what the run-time is. For example for `LinkedList.remove(index)`

, we know we first have to traverse to the right index (worst case `O(n)`

) and then change the references at that node to remove the value (worst case `O(1)`

) for a total run-time of `O(n)`

.

Where the run-time complexities for the methods we implemented differs from Java’s for the table, we show the `complexity of our implement / complexity for Java's`

Operation | ArrayList | LinkedList |
---|---|---|

add(value) | O(1)* | O(n) / O(1) |

add(index, value) | O(n) | O(n) |

get(index) | O(1) | O(n) |

indexOf(value) | O(n) | O(n) |

contains(value) | O(n) | O(n) |

remove(index) | O(n) | O(n) |

** Most of the time!*

Looking at this table, it seems like `LinkedList`

is always the worse of the two structures, so why do we ever use `LinkedList`

? In certain cases, like being treated as a `Queue`

, it is much faster. For operations a `Queue`

needs, we can see the `LinkedList`

is superior (note these are just the built in Java run-times).

Operation | ArrayList | LinkedList |
---|---|---|

remove from front | O(n) | O(1) |

add at end | O(1)* | O(1) |