Manipulating data structures
Using copy constructors
You may use the copy constructor for Sets and Maps. If you want to use the copy constructor for any other kind of data structure, you should ask permission first.
The copy constructor is a feature that many data structures implement that lets you create a copy of some other iterable data structure. For example, if we wanted to create a copy of a Set, we could do something like this:
Set<String> copy = new HashSet<String>();
for (String item : originalSet) {
    copy.add(item);
}
...or more concisely, use the copy constructor like so:
Set<String> copy = new HashSet<String>(originalSet);
This 1-argument constructor is known as the copy constructor, and will internally do something very similar to what we did in our first example up above.
Both versions are considered acceptable, since they do essentially the same thing. Using the copy constructor (in cases where you are allowed to use it) is slightly preferable, since it leads to shorter code.
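The same idea applies to Maps. The following is a minimal sketch, not part of the original guide: the class name MapCopyDemo and the helper copyOf are made up for illustration. It shows that a copy made with the copy constructor is independent of the original, so adding to one does not affect the other.

```java
import java.util.HashMap;
import java.util.Map;

class MapCopyDemo {
    // Hypothetical helper: returns a new map holding the same
    // key/value pairs as 'original', via HashMap's copy constructor.
    static Map<String, Integer> copyOf(Map<String, Integer> original) {
        return new HashMap<String, Integer>(original);
    }

    public static void main(String[] args) {
        Map<String, Integer> original = new HashMap<String, Integer>();
        original.put("apples", 3);

        Map<String, Integer> copy = copyOf(original);
        copy.put("pears", 7);  // changes only the copy

        System.out.println(original.containsKey("pears")); // false
        System.out.println(copy.size());                   // 2
    }
}
```

Note that the copy is shallow: the new map has its own set of entries, but the key and value objects themselves are shared with the original. That is harmless for immutable values such as Strings and Integers.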
Manipulating strings
When manipulating or "modifying" a String, you should design your code so that it "builds up" a new String rather than trying to "modify" an existing one. In particular, remember that Strings are immutable (unchanging), so "modifying" a String in place is not something you can actually do.
When working with Strings, it's important to remember that Strings are immutable in Java: once a String has been created, its contents can never change. If you recall, every String method returns either a new String or a value of some other type. There is no way to "set" or "change" a character.
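To make this concrete, here is a small sketch (the class name StringImmutabilityDemo and the method shout are invented for this example). It shows that calling a String method never changes the String it is called on; the result only matters if we store or return it.

```java
class StringImmutabilityDemo {
    // Hypothetical helper: returns a new, upper-cased String.
    // The String passed in is left untouched.
    static String shout(String text) {
        return text.toUpperCase() + "!";
    }

    public static void main(String[] args) {
        String greeting = "hello";
        greeting.toUpperCase();        // result discarded; 'greeting' is unchanged
        System.out.println(greeting);  // hello

        String loud = shout(greeting);
        System.out.println(loud);      // HELLO!
        System.out.println(greeting);  // hello
    }
}
```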
Some students seem to ignore this fact, and end up trying to manipulate Strings in the same way they manipulate arrays, leading to hacky code. For example, suppose we were asked to write a method that takes in a String and produces a new String in which every numeric character is replaced with the character 'X'.
Naively, we might try approaching this problem as if it were an array problem:
// Warning: this doesn't compile!
public static String vowelsToUppercase(String input) {
    for (int i = 0; i < input.length(); i++) {
        char ch = input.charAt(i);
        if (Character.isDigit(ch)) {
            input.setChar(i, 'X');  // Problem! String doesn't have a 'setChar' method
        }
    }
    return input;
}
However, Strings are not modifiable, so we might hack around this by trying to change the original String like this:
public static String vowelsToUppercase(String input) {
    for (int i = 0; i < input.length(); i++) {
        char ch = input.charAt(i);
        if (Character.isDigit(ch)) {
            // Try and 'modify' the String
            String before = input.substring(0, i);
            String after = input.substring(i + 1, input.length());
            input = before + 'X' + after;
        }
    }
    return input;
}
While this works, this code is ugly – the code needed to modify the original String is complex and hacky, and requires us to do a lot of unnecessary calls to the substring(...) method.
What we should do is stop fighting against Java and instead build up a new, clean String from scratch:
public static String vowelsToUppercase(String input) {
    String output = "";
    for (int i = 0; i < input.length(); i++) {
        char ch = input.charAt(i);
        if (Character.isDigit(ch)) {
            output += 'X';
        } else {
            output += ch;
        }
    }
    return output;
}
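This code is much cleaner and easier to reason about. In the worst case, where most characters need replacing, it also does a comparable amount of copying to the substring version, so we give up little or nothing in efficiency.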
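If efficiency on long inputs does matter, one possible variation is to build the result with a StringBuilder, since repeated += on a String copies everything built so far on each iteration. This is only a sketch and goes beyond what this guide prescribes: the class name DigitMasker and method name maskDigits are invented here, and you should check whether StringBuilder is permitted before using it.

```java
class DigitMasker {
    // Hypothetical variant of the build-up approach: replaces every
    // digit with 'X', appending to a StringBuilder instead of using +=.
    static String maskDigits(String input) {
        StringBuilder output = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char ch = input.charAt(i);
            if (Character.isDigit(ch)) {
                output.append('X');
            } else {
                output.append(ch);
            }
        }
        return output.toString();
    }

    public static void main(String[] args) {
        System.out.println(maskDigits("room 42b"));  // room XXb
    }
}
```

The structure is the same as the build-up version: we never touch the input, and we produce a fresh String at the end.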
Do not leak internal state
You should never return a value in such a way that a client is able to use that data to modify a private field.
Or, to rephrase, you should never return a reference to a field referring to some mutable data structure.
When designing a class, it's often the case that we want to write public methods that allow a client to obtain some information about the object. However, it's important that we implement those methods in such a way that we do not accidentally leak internal state.
This concept is best explained via an example.
Suppose we were trying to design a class that is similar to an ArrayList, except that it can only ever hold odd numbers. We might design our class like so:
import java.util.ArrayList;
import java.util.List;

public class OddNumberList {
    private List<Integer> numbers;

    public OddNumberList() {
        this.numbers = new ArrayList<Integer>();
    }

    public void add(int num) {
        // Reject even numbers so the list only ever holds odd ones
        if (num % 2 == 0) {
            throw new IllegalArgumentException("number cannot be even!");
        }
        this.numbers.add(num);
    }

    public int get(int index) {
        return this.numbers.get(index);
    }

    // Other methods
}
It does essentially the same thing as a regular ArrayList, except that all of its methods will only ever allow you to pass in numbers that are odd. (Yes, this is a very contrived example).
Now, let's suppose we wanted to implement a method that returns all of these numbers as a List. (We could of course use the OddNumberList itself if we made it implement the List interface, so this goal is also very contrived, but whatever.) One way to implement this method would be to do something like this:
public List<Integer> asList() {
    return this.numbers;
}
This appears to work: we return a List, just as promised. However, we've run into a problem. Because we're returning a reference to an internal field, the client can use that reference to modify the underlying list directly. This leads to the following bug:
OddNumberList odd = new OddNumberList();
odd.add(3);
odd.add(5);

List<Integer> numbers = odd.asList();
numbers.add(2);

System.out.println(odd.toString());  // [3, 5, 2]
We've suddenly circumvented our primary constraint (all numbers must be odd) and added an even number to our class! We want to allow the client to view this sort of information, but not if it means that the client could potentially screw up our entire class. After all, the point of doing object-oriented programming is so that we can safely encapsulate and aggregate related data and methods into a single unit -- we don't want people mucking around with our innards. It's not hygienic.
Instead, what we want to do is return a complete copy of the list, one that is entirely independent of the field, rather than a reference to the field itself.
So, to fix our code, we'd need to do something like this:
public List<Integer> asList() {
    // Build a fresh list so the client never sees the private field
    List<Integer> newList = new ArrayList<Integer>();
    for (int num : this.numbers) {
        newList.add(num);
    }
    return newList;
}
This ends up fixing our client code – if we try running the code again, we'll end up printing out [3, 5] instead of [3, 5, 2].
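For reference, the pieces above can be assembled into one runnable sketch. The toString method here is an assumption: the class shown earlier only says "Other methods", so we guess that it simply prints the underlying list. The OddNumberListDemo class is likewise added just for this example.

```java
import java.util.ArrayList;
import java.util.List;

class OddNumberList {
    private List<Integer> numbers;

    public OddNumberList() {
        this.numbers = new ArrayList<Integer>();
    }

    public void add(int num) {
        if (num % 2 == 0) {
            throw new IllegalArgumentException("number cannot be even!");
        }
        this.numbers.add(num);
    }

    // Returns an independent copy, so callers cannot reach the private field.
    public List<Integer> asList() {
        List<Integer> newList = new ArrayList<Integer>();
        for (int num : this.numbers) {
            newList.add(num);
        }
        return newList;
    }

    // Assumed for this sketch; not shown in the class above.
    @Override
    public String toString() {
        return this.numbers.toString();
    }
}

class OddNumberListDemo {
    public static void main(String[] args) {
        OddNumberList odd = new OddNumberList();
        odd.add(3);
        odd.add(5);

        List<Integer> numbers = odd.asList();
        numbers.add(2);  // modifies only the returned copy

        System.out.println(odd);      // [3, 5]
        System.out.println(numbers);  // [3, 5, 2]
    }
}
```

The client still gets a List it can freely change, but those changes can no longer break the "odd numbers only" rule inside our class.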
Now, it's also worth recognizing when we don't need to make a copy.
We do not need to make a copy when we are:
- Returning a primitive value (since primitive values will be copied by Java for us)
- Returning a reference to an immutable object such as a String (since Strings are immutable, there's no way for the client to change them, so there's no problem with us and the client sharing a reference to the same String).
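As a small illustration of these two cases, here is a sketch using a hypothetical Student class (the class, its fields, and its getters are all invented for this example). Both getters return their fields directly, and nothing the client does with the returned values can affect the object.

```java
class Student {
    private String name;  // immutable object
    private int credits;  // primitive

    public Student(String name, int credits) {
        this.name = name;
        this.credits = credits;
    }

    // Safe: the client gets a reference to a String, which cannot be changed.
    public String getName() {
        return this.name;
    }

    // Safe: Java hands back a copy of the int value.
    public int getCredits() {
        return this.credits;
    }
}

class StudentDemo {
    public static void main(String[] args) {
        Student s = new Student("Ada", 5);

        String name = s.getName();
        name = name.toUpperCase();  // rebinds the local variable only
        int credits = s.getCredits();
        credits += 10;              // changes the local copy only

        System.out.println(s.getName());     // Ada
        System.out.println(s.getCredits());  // 5
    }
}
```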