Java is a general purpose high-level programming language, initially
developed to address the problems of building software for networked
consumer devices. It was designed to support multiple host architectures,
in such a way that compiled Java code could be transported over a network,
from one machine to another. It is the ease with which compiled Java code
can be transferred over a network that has made Java a particularily
attractive programming language for Web browsers.
The Java Virtual Machine (JVM) is the cornerstone of the Java programming
language. It is the component of Java technology that is responsible for
the cross-platform delivery, and for the small size of compiled Java code.
The task of a Java compiler is to convert the high-level Java language into
JVM bytecodes, and these bytecodes are the compiled form of
the language that is then transferred over the network.
The JVM is an abstract computing machine. Like a real computing machine, it
has an instruction set and uses various regions of memory. However, the JVM
does not assume any particular implementation technology or host platform.
The particular implementation technology that we will investigate in this
assignment is direct interpretation (simulation). Just as SPIM provides an
interpreter for the MIPS instruction set, we will build an interpreter that
processes the Java Virtual Machine bytecodes (note that
bytecode is just the JVM word for instruction).
In order to make this assignment a reasonable size, we present a subset of
the JVM to you, since building an interpreter for the entire JVM is too
large a task. The following table presents a summary of the portion of the
JVM instruction set that your interpreter will need to handle.
Instruction |
Bytecode |
Byte Format |
Stack Ops |
Description |
IADD ISUB IMUL ILOAD ISTORE BIPUSH SIPUSH
POP IFEQ IFNE IFLT IFLE IFGT IFGE
GOTO IRETURN
|
0x60 0x64 0x68 0x15 0x36 0x10 0x17
0x57 0x99 0x9a 0x9b 0x9e 0x9d 0x9c
0xa7 0xac
|
IADD ISUB IMUL ILOAD index
ISTORE index BIPUSH immed
SIPUSH immed1 immed2
POP
IFEQ offset1 offset2
IFNE offset1 offset2
IFLT offset1 offset2
IFLE offset1 offset2
IFGT offset1 offset2
IFGE offset1 offset2
GOTO offset1 offset2
IRETURN
|
POP v1, POP v2, PUSH result
POP v1, POP v2, PUSH result
POP v1, POP v2, PUSH result
PUSH result
POP v1
PUSH result
PUSH result
POP v1
POP v1
POP v1
POP v1
POP v1
POP v1
POP v1
none
POP v1
|
Integer add
Integer subtract
Integer multiply
Load from local variable at index
Store to local variable at index
Byte immediate push
Short immediate push
Pop value, discard
Branch on v1 == 0
Branch v1 != 0
Branch on v1 < 0
Branch on v1 <= 0
Branch on v1 > 0
Branch on v1 >= 0
Unconditional branch
Integer return
|
Storage
The JVM is a stack machine. This means that instructions implicitly
use a stack to fetch their operands and to store their results.
The Stack Ops column in the above table describes the
implicit stack operations performed by each of the above instructions.
In addition to the stack, the JVM also provides local variable
storage. The local variable storage is accessed by the ILOAD and
ISTORE instructions described below.
Instruction Details
The arithmetic instructions (IADD,ISUB,IMUL) all operate on
signed 32-bit integers, but none of them should generate
overflow exceptions when overflow occurs. For ISUB, the result
is (v2 - v1), whereas for IADD and IMUL the operand order
does not matter.
The local variable instructions (ILOAD,ISTORE) are each followed by an
8-bit unsigned index value which determines the location within the
local variable array that should be read or written. You may assume
that there are no more than 256 local variables, and you do not have
to check that the index is valid (although a real JVM interpreter would
have to perform this check). Each entry in the local variable array
is a 32-bit word.
The BIPUSH instruction and SIPUSH instructions are followed by either
an 8-bit or a 16-bit signed immediate value, which is converted to
an integer on the stack. For SIPUSH, the immediate is calculated
as ( immed1 << 8 ) OR immed2 . For both BIPUSH and SIPUSH,
the immediate value is then sign extended to a 32-bit integer, and
then pushed on to the stack. The 32-bit result of the POP instruction
is simply discarded.
The branch instructions (GOTO,IFNE,IFEQ,IFLE,IFLT,IFGE,IFGT) are
followed by two bytes which specify a signed offset relative to
the start address of the branch instruction. The offset bytes
are used to construct the 16-bit offset by the following calculation:
( offset1 << 8 ) OR offset2 . All of the comparison operations
for the above conditional branch instructions are signed comparisons.
The IRETURN instruction signals the end of the computation (our
simplified model of the JVM only handles interpreting a single
Java method). The 32-bit value at the top of the stack is popped and
then used as the return value.
What we provide for you:
- We provide the file javamain.s which contains the main routine
that you will use to invoke your interpreter. You'll need to copy
this file from /cse/courses/cse378/97au/lib/spim.
- Our javamain.s program allocates the array where the program
is stored, as well as the storage for the java operand stack and
the storage for the local variable array. You can assume that
the JVM test programs will not exceed the resources that we've allocated
for you.
- Within javamain.s, we provide a sample test program for your
interpreter. When you create new JVM test programs, you'll need
to edit the javamain.s file, and you should use our sample
test program as a model. Also, if you want to rename the "java_interpret"
routine to have a different name, you'll need to edit javamain.s.
The interface between "main" and "java_interpret" is detailed in the
javamain.s file.
What you need to do:
- Build your JVM interpreter (an implementation of the "java_interpret"
routine) which handles all of the instructions
listed in the above table. The basic structure of your interpreter
should be a loop that looks something like the following:
- fetch the next bytecode
- decode the operation
- fetch the operands
- execute the operation
- store the results
- Implement your interpreter using MIPS subroutines, and follow
the MIPS calling convention that we covered in section. This
means that you should not implement your solution as one monstrous
subroutine. If you do, your solution will be more difficult for me
to read, and the total amount of code you have to write will be
more than necessary.
- Build some simple JVM programs to test your interpreter. You'll
have to turn in your test programs. Make sure that every instruction
listed above is used in at least one of your test programs. Don't
worry about making your tests compute anything useful (except for
the next item).
- Build an implementation of a factorial subroutine in JVM bytecodes,
and use this to test your interpreter. You should place N in local
variable 0 (which is where the caller would have placed it if our
interpreter supported method invocation), so the first 2 instructions
of your factorial routine should be:
BIPUSH N
ISTORE 0
where N is the integer argument to the factorial subroutine.
A Suggestion
Because this assignment is more substantial that the previous ones,
I suggest you get started early. A good goal for the end of the first
week would be to have the arithmetic and stack instructions working,
as well as having written out your JVM implementation of the factorial
subroutine.
Addendum
The following items arose during the assignment. These are things
that should be made clear if the assignment is used in the future.
Your interpreter needs to handle the following error conditions
gracefully:
- Stack underflow: when a bytecode needs to get its
arguments from the operand stack, but the operand stack does
not contain enough operands to perform the operation, then the
interpreter should print an error message and terminate the computation.
The return value from the java_interpret routine in this
case should be -1.
- Illegal Operation: when the operand decoded by the interpreter is
not one of the ones listed in the table above, then the interpreter
should print an error message and terminate the computation. The
return value from the java_interpret routine in this
case should be -1.