Java Runtime Array Type Checks

In lecture we discussed, among other things, the difference between static and dynamic type checks. Here's one classic example of this that holds true for Java and many other languages.

Consider the following Java code (also given in this file, which you can run).

String[] a = new String[1];
Object[] b = a;
b[0] = new Integer(1);
Will this work? It certainly shouldn't, since at runtime after the third line, a[0] and b[0] point to the same Integer object, but a[0] thinks it's a String. If this code succeeded, we could cause bad behavior by, for example, treating the Integer as a String and calling methods on it.

Indeed, Java doesn't allow this behavior. But where exactly will this code fail? The first line is clearly fine, so it must be either the second or the third. The second line seems reasonable. After all, String is a subclass of Object, so just like you can assign a String variable into an Object variable, it makes sense to assign an array of Strings into an array of Objects. Since the Java compiler allows this kind of subtyping, the code will compile. But it will have to raise an error on the third line at runtime. It throws a ArrayStoreException.

Java thus has to check all writes into arrays (e.g. expressions of the form a[i] = b). The check has to occur at runtime since the compiler might not know the dynamic type of b. It has to check the dynamic type of both a and b and ensure that the assignment is valid. Specifically, comparing the runtime type of b against the static type of a would not suffice, as it would allow the code above.

Exploiting a Buggy Virtual Machine (not covered in section, but worth reading about)

We saw above that Java virtual machines (and, indeed, any language that allows assigning an array of Bs into an array of As when B is a subtype of A) need to do runtime checks of the dynamic types for all array writes. But what would happen if they didn't? This is a subtle point, so it's easy to imagine a new language designer missing it (or perhaps writing a buggy VM). We could likely cause weird behavior or crash the VM, but could we do worse?

It turns out that we could actually use the lack of such a runtime check to read private information and in general hijack the machine. Consider running the following code (also here) in a Java without the runtime array type checks described above.

class A { int f; } 
class B extends A { int g; }
class C extends A { A g; }
    
void evil() {
  C[] c = new C[1];
  A[] a = c;
  B b = new B();
  b.g = 1000;  // Or the address of some arbitrary memory location
  a[0] = b;
  // Now c[0] == a[0] == b.
  // So b.g (an int) == c[0].g (a pointer)
  System.out.println(c[0].g.f);  // This reads the memory location (modulo Java's object layout)
}
This code does essentially the same thing as the previous one. Draw it out and trace through it if you don't fully understand it. After the penultimate line of code, c[0] and b point to the same location, but the former is statically typed as a C and the latter as a B. Both have a g pointer at the same location within the object, one of which is an int and the other an A. We can thus write to the int and then treat it like an A (i.e. as a pointer) to dereference it and read the value of an arbitrary location in memory, which could store something secret like a password. We could easy write arbitrary values to this memory location, which would let us take control of the code.

Luckily, Java doesn't allow this: it throws a ArrayStoreException on the evil line. The write fails, so even if we put the evil line inside of a try and then catch the ArrayStoreException, it throws a NullPointerException on the last line. So assuming a correct VM, we're safe from security problems if we use Java. Right?

Exploiting a Correct Virtual Machine (not covered in section, but worth reading about)

As it turns out, even a perfectly correct language, type system, and virtual machine can still be vulnerable to the above attacks. Computer security is hard.

The attack I will describe now was described in a paper by Govindavajhala and Appel. I will only give a brief overview of it; you can read (or skim) the paper for more details (or just read the Slashdot comments).

What if a bit of memory flips? It would be very hard to detect this with software, and it could happen due to cosmic rays. Now, the probability of this is extremely low, but it can be artificially increased. Figure 3 of the paper shows a desktop without a cover with a lamp shining right at the RAM. With this, the researchers were able to easily cause memory failures.

So such an attack isn't as unbelievable as it might seem. After all, once you have physical access to a machine, doing something like this would be pretty easy. So assume that we have the power to flip bits in memory. Unfortunately (for the attacker), we can only flip random bits; we can't control which bits will flip.

Ignoring this issue for the moment, say we can flip bits so that we have two pointers of different types that point to the same memory location (i.e. A a; B b; a == b). With this, we can exploit things as before:

class A { int i; }
class B { A a; }
// Assume A a == B b
a.i == address;
b.a.i = value;  // Writes value to memory location address (modulo Java object layout)

How do we set things up so that we get the two pointers equal? We can actually write the two classes like the following

class A {
  A a;
  B b;
  int i;
}
class B {
  A a1;
  A a2;
  A a3;
}
We can then run a program that creates one A object and a whole lot of B objects. If we then put a lamp next to the RAM and cause an error, chances are that it will be in a B object. If so, chances are that it will be in one of its fields and that the flipped bit will cause the pointer to actually point to a B object (since the heap is full of Bs) (note that the paper is more scientific about this step). The program can then run an infinite loop that checks all pointers to see if we are in the situation above and executes the attack if we are. The authors were able to achieve approximately 70% success rates with this approach.

How can we defend against such an attack? The best way is to use RAM with error-correcting code, which is special hardware.

And the lesson? Designing a correct language and implementation is hard, and getting true security is far harder.

Unsoundness Unnoticed for Years

Problems with the type system can be less obvious. To this day, people are finding new ways to prove Java unsound. Consider the following example (available here) from Ross Tate just this February:

    class Unsound {
      static class Bound<A, B extends A> {}
      static class Bind<A> {
        <B extends A> A bad(Bound<A,B> bound, B b) {return b;}
      }
      public static <T,U> U coerce(T t) {
        Bound<U,? super T> bound = null;
        Bind<U> bind = new Bind<U>();
        return bind.bad(bound, t);
      }
    }
    

This compiles without any issues. If instead of declaring bound as null, we write Bound<U,? super T> bound = new Bound<>();, Java complains:

    error: incompatible types: cannot infer type arguments for Bound<>
    reason: inference variable A has incompatible bounds
      equality constraints: U
      lower bounds: T,B
    where A,U,T,B are type-variables:
      A extends Object declared in class Bound
      U extends Object declared in method <T,U>coerce(T)
      T extends Object declared in method <T,U>coerce(T)
      B extends A declared in class Bound
    

That is, we need T to be a subclass of U. Similarly, we if try to write Bound<U,? super T> bound = new Bound<U, T>();, Java also complains:

    error: type argument T is not within bounds of type-variable B
    
    where T,U,B,A are type-variables:
      T extends Object declared in method <T,U>coerce(T)
      U extends Object declared in method <T,U>coerce(T)
      B extends A declared in class Bound
      A extends Object declared in class Bound
    

But since we declare bound as null, the compiler doesn't bother trying to figure out T and U. This is bad, because at runtime, in order for us to be able to call coerce, we need the type of the argument T to be a subclass of the type of the return value U.

The following code compiles fine:

    public static void main(String[] args) {
      String foo = coerce(1);
      System.out.println(foo);
    }
    

At runtime, however, it fails:

    Exception in thread "main" java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.String
	    at Unsound.main(Unsound.java:18)
    

This is less obvious than array subtyping, both for language designers (clearly, as it went unnoticed for years) and for programmers (the code at first glance might look like it should work, but will mysteriously fail at runtime for many types). Who knows what other unsoundness remains to be uncovered!