CSE 401 Final Answer Key


  1. What is the purpose of generating machine-independent intermediate code before generating machine code for the target machine?


(a) Make the task of generating machine code easier

(b) Make the code generator and the optimizer portable across several processors

(c) The intermediate language has more powerful operators than the machine language


Answer: (b). The answer (a) is not valid because machine code can be just as easily be generated straight from the AST (as our compiler does). The answer (c) is not always valid, and even if it were, that would not constitute much purpose anyway.


  1. How many temporary variables can the intermediate code generator create?


(a) As many as there are registers in the target processor

(b) A number that's arbitrarily chosen, but fixed

(c) Conceptually unlimited


Answer: (c). As we discussed, new temporaries are generated liberally.


  1. What is the format of intermediate code?


(a) Tree-structured: mimics the AST of the source program because it was generated from it.

(b) Linear list of instructions

(c) Both formats (a) and (b) have advantages and disadvantages


Answer: (b). We never discussed tree-structured intermediate code.


  1. What is the most accurate definition of an l-value?


(a) An expression referring to an entity that has an address in memory.

(b) A variable name

(c) An expression computing a temporary value

(d) A function call


Answer: (a). Don't get fooled by (b); an array element reference (such as x[5]) is an l-value alright, and it's at the same time an expression and not a simple variable name.


  1. In an optimized interpreter, could the activation record for procedures be merged with the symbol table of that procedure?


(a) No

(b) Yes, if the language disallows recursive calls

(c) Always

(d) Yes, but only for procedures that don't define local variables


Answer: (b). This is a subtle one. A procedure that's not recursive will always have one activation record (like in FORTRAN). In that case, it could be merged with the symbol table.


  1. When accessing variables defined in the enclosing scope in a compiled PL/0 program, what can you say about the runtime access speed?


(a) Constant

(b) Linear in the depth of scoping; the deeper the nesting, the longer the access time

(c) Almost constant in a typical implementation because variable names are stored in a hashtable

(d) Linear in the length of the identifier name


Answer: (b). We discussed in class how accessing variables in syntactically enclosing scopes involves fetching and dereferencing the frame pointer as many times as levels we go up. This fetching takes time that increases linearly with the depth of scoping. The variable names business is a red herring because you don't care to store variable names after compilation.


  1. When calling a function in a given language, the order of pushing the arguments onto the stack (left-to-right versus right-to-left) is decided by:


(a) The complexity of the expressions that form the actual arguments

(b) Whether the function has a fixed or variable number of arguments

(c) The types of the actual arguments, e.g., arrays versus integers versus strings etc.

(d) The language and the implementor


Answer: (d) It's indeed up to the language definition and the implementation on which convention is chosen. The answer (b) was a red herring; if you implement right-to-left passing you make it easier to implement variable arguments functions, but a language could choose to implement variable arguments in a myriad of ways, many of which don't prescribe right-to-left passing.


  1. Alignment issues are caused by:


(a) Different widths of the internal and external processor bus

(b) The quirks of transferring data between cache and main memory

(c) Portions of a word can only reside at specific addresses in memory

(d) The Virtual Memory Manager (VMM)


Answer: (c). As we discussed in class, portions of the memory can only be addressed by certain address lines, which means that, for example, the least significant bit of an integer can only sit in a specific subset of addresses – such as, those divisible by 4.


  1. Under what circumstances could an implementation of the C++ programming language store a variable of type bool in a single bit?


(a) That variable is a compile-time constant, i.e., it is initialized with a constant and then never written

(b) The address of that variable is never taken

(c) That variable is next to at least another bool variable so they can be stored together

(d) That variable is defined with the register qualifier


Answer: (b). If someone takes the address of the variable, the information about the datum must fit in a standard pointer, so there's no place to store the actual bit position. The answer (d) looks plausible; you'd think you wouldn't be allowed to take the address of a register, but, as we talked durring the raffles, in C++ you can.


  1. Consider the following C++ code:

void Fun()
{
    char* p = new char[sizeof(int) + 1];
    int* pInt1 = (int*)p;

++p;

int* pInt2 = (int*)p;

*pInt1 = 1;

*pInt2 = 2;

}


(a) The code likely has an alignment issue, and it depends on the system whether a run-time error/crash will occur

(b) The code does not have an alignment issue

(c) The code has an alignment issue, and it will surely crash on any system

(d) The code has an alignment issue and a buffer overrun issue

(e) The code only has a buffer overrun issue


Answer: (a). We discussed similar code in class. The two integers are one byte off from each other, and on most systems this engenders an alignment problem. There is no buffer overrun. Whether the code crashes or works (potentially slowly) depends on how the operating system handles the alignment exception generated by the processor.


  1. In C and C++, can a type require more alignment than its own size?


(a) Yes; the typical example is float numbers. Although they might be only 4 bytes in size, they sometimes need 8 bytes of alignment due to the way some floating-point units work.

(b) No, because that would mean you can't compute the position of data in an array by multiplying the index with sizeof(element), and this is how array offsets are supposed to be computed

(c) Yes, because for example floating point numbers can internally (in the processor registers) hold more bits of precision than exposed externally

(d) No, because the type couldn't be stored properly in structures and classes


Answer: (b). Fundamentally, this is how all pointer arithmetic works in C and C++. If a type does not need all bytes but has stringent alignment requirements, the compiler will pad that type.


  1. Which matrix format storage is better, row-major or column-major?


(a) Depends on the style of iteration that's most commonly used

(b) Column major because most scientific computations scan arrays columnwise

(c) Given that both styles of iteration are common, row-major has an edge because it allows the type system to naturally implement multidimensional arrays as arrays of arrays of arrays...

(d) Given that both styles of iterations are common, column-major has an edge because it allows the type system to naturally implement multidimensional arrays as arrays of arrays of arrays...


Answer: (c). We discussed that both column-wise and row-wise iterations are commonly used in ordinary operations such as matrix multiplication. Furthermore, compounding types naturally leads to a row-major storage scheme.


  1. What disadvantages do fixed-size arrays have?


(a) Make static checking of program harder

(b) Impose bounds checking at each access in the array

(c) Make writing reusable code harder, because the length of the array is part of the type; therefore functions written for arrays of size 1024 won't simply work with arrays of other sizes


Answer: (c). This answer comes mostly by eliminating (a), which is not the case (on the contrary, the compiler has richer type information so static checking is actually easier), and (b), which is bogus: bounds checking is needed for both kinds of arrays. It could be even said that bounds checking is more stringent for dynamically-allocated arrays; for statically-sized arrays, the compiler can eliminate some bound checks if it can prove the index won't run over the statically-known size.


  1. What does “static allocation” mean for a variable?


(a) It has statically-known size and is allocated on the program stack

(b) It is allocated in the static data area of the executable program

(c) It is invisible outside the module where it is defined

(d) It is invisible outside the function where it is defined


Answer: (b). This is the definition we discussed in class. Choice (a) is bogus, and (c) and (d) are two effects of the overloading of the static keyword in C and C++. (Ok, that was a bit cunning.)


  1. Consider the following C++ function:

void Fun(int a, int b)
{
    if (a == b)
    {
        double d1 = a / 4;
    }
    double d2 = b / 4;
}

What can you say about the addresses of d1 and d2?


(a) They could be allocated at the same address, because their lifetimes are not overlapping

(b) They cannot be allocated at the same address

(c) They can be allocated at the same address because double is a primitive type. If in the code above you replace double with a class type that has constructors and destructor, then d1 and d2 cannot possibly be allocated at the same address.


Answer: (a). These variables have non-overlapping lifetimes, so the compiler could allocate them at the same address. There's no possible confusion about the identity of the value at that address. (This is not what most compilers do; most use a simple greedy algorithm that will allocate one double inside Fun's scope and one more double inside the if statement. In my tests, of three compilers – Metrowerks CodeWarrior 7.0, Microsoft Visual C++ .NET Everett Final Beta, and gcc 2.95.3-6 – only gcc saves stack space by allocating d1 and d2 at the same address.)

The issue on whether the type has cdtors or not is irrelevant.


  1. When accessing variables defined in the enclosing scope in a compiled C or C++ program, what can you say about the runtime access speed?

(a) Constant because in C and C++ there is only the global scope and the local scope

(b) Linear in the depth of scoping; the deeper the nesting, the longer the access time

(c) Almost constant in a typical implementation because variable names are stored in a hashtable

(d) Linear in the length of the identifier name


Answer: (a) In C and C++ you can't access variables in all hierarchically enclosing scopes. It's either your scope or the global scope. Again, identifier names will not be part of the compiled program.


  1. What inherent limitations imposes stack allocation on language power?


(a) Stack variables cannot have their address taken

(b) Stack variables cannot escape their context, and therefore supporting first-class functions is difficult

(c) Stack variables make pass-by-reference difficult to implement

(d) Stack variables make variable lookup rules overly complicated because of nested scopes


Answer: (b). Pretty much all other answers are fluff.


  1. In PL/0, the dynamic link is conceptually needed:


(a) Solely for knowing where to return when the function exits (the dynamic link is practically the return address)

(b) For the purpose mentioned at (a), plus to access variables defined in caller's scope

(c) Solely for accessing variables defined in caller's scope


Answer: (a). As we discussed again during the final review, PL/0 is just a syntactically scoped language; a procedure has only access to variables defined in its own scope and in the lexically-enclosing scopes. That's what the static link is used for. The dynamic link's only semantics is that of a return address.


  1. In class we discussed a non-standard C library function called alloca bearing the following prototype:

void* alloca(size_t size);

What does alloca do? Select the answer that best describes alloca.


(a) Allocates memory on the stack of the executing function. To free that memory, you need to call the pair function dealloca.

(b) Allocates memory on the stack of the caller of the executing function. The memory will last after the current function returns, but will be automatically reclaimed when the caller returns.

(c) Allocates memory on the stack of the executing function. When the executing function returns, memory will be automatically reclaimed.

(d) Allows the programmer to control local variable allocation

(e) Allows functions to return arrays by value


Answer: (c). Simply put, alloca subtracts size from your stack pointer. The return sequence will take care of restoring the stack pointer without any intervention from the programmer.


  1. In our PL/0 implementation, the static link is conceptually needed:


(a) For accessing variables defined in caller's scope

(b) For for knowing where to return when the function exits

(c) For accessing constants and variables defined in caller's scope


Answer: (a), as discussed in answer 18.


  1. What can you say about the relationship between the stack pointer and the frame pointer in a PL/0 implementation in which the stack grows downward?


(a) FP is always greater than SP

(b) FP is always smaller than SP

(c) FP is in no special relationship with SP


Answer: (a). FP always points to the current frame, which is within valid, allocated stack space.


  1. Consider the following program written in pseudo-C:

int a[] = { 1, 2, 3, 4, 5 };
int i = 0;

int Fun(int x)
{
    x += a[0];
    i += 2;
    x += a[0];
    return x;
}

int main()
{
    cout << Fun(a[i]);
}

We said pseudo-C because we can tweak the calling convention used by this language. Under which calling convention will the program print the largest number?


(a) Call-by-value

(b) Call-by-reference

(c) Call-by-name


Answer: (c) Call-by-value returns 3, call-by-reference returns 4, call-by-name returns 5.


  1. What is a closure as discussed in class?


(a) The set of variables accessible at a given point in a program

(b) The set of variables local to a given function

(c) A function together with the environment necessary for that function's execution

(d) The act of closing a deal with a used car salesman


Answer: (c).


  1. What can be said about the stack space needed to execute a function?


(a) It is statically known, unless the function uses tricks such as alloca

(b) It can't be known statically because the exact stack space needed depends on the control flow, recursion etc.

(c) It cannot be known statically because the size of the stack frame of the function is of variable size


Answer: (b). Only the stack frame size is known statically.


  1. What is the quintessential advantage of a compiler over an interpreter?


(a) Better type checking

(b) Faster execution

(c) Smaller memory requirements

(d) Adherence to binary standards


Answer: (b). An interpreter can do (a) and (d), and some interpreters beat compiled code when it comes about small size.


  1. In our PL/0 implementation, where are constants' values held?


(a) In the symbol table

(b) In the activation record

(c) In the stack frame

(d) In the AST after parsing


Answer: (a).


  1. In a compiled C++ program, is the symbol table loaded at runtime?


(a) Yes, because C++ allows definition of local classes which can define functions

(b) Only in programs that use non-local control flow, such as exceptions.

(c) No, because all variable names are not kept after compilation


Answer: (c).


  1. What is the structure of the symbol table of a program?


(a) Singly-linked list

(b) Doubly-linked list

(c) Tree identical with the AST of the program

(d) Tree with a different topology than the AST of the program

(e) Array


Answer: (d).


  1. Consider the types int and double as the archetypal integer and floating point types. If you were to establish a relationship between these types, what would you say?


(a) int could inherit double

(b) double could inherit int

(c) None of the above


Answer: (c). None of these types is substitutable for the other.


  1. Consider two types: Point (having as members two coordinates) and ColoredPoint (having as members two coordinates and a color). If you were to establish a relationship between these types, what would you say?


(a) Point could inherit ColoredPoint

(b) ColoredPoint could inherit Point

(c) None of the above

(d) Either of (a), (b)


Answer: (b).


  1. In our PL/0 implementation, what is the relationship between the hierarchy rooted in Type and that rooted in TypeAST?


(a) TypeAST is the type as parsed, and Type is the type used for typechecking purposes

(b) Type is the type as parsed, and TypeAST is the type as stored in the AST.

(c) Type inherits TypeAST

(d) TypeAST inherits Type


Answer: (a).


  1. How does PL/0 solve the dangling else problem?


(a) The if-else statement is defined in a manner that avoids the dangling else altogether

(b) The IfStmt class has a simple hack to take care of that

(c) The grammar uses an intermediate nonterminal to eliminate the ambiguity


Answer: (a). By the language definition, after the then clause, you either see end (case in which you're done) or else (case in which there's an else branch). There's no ambiguity.


  1. How is an assignment typechecked in the new (enhanced) PL/0?


(a) The left-hand-side and right-hand-side expressions must be of the same type

(b) The left-hand-side and right-hand-side expressions must be of the same type, and the left-hand-side expression must refer to an lvalue.

(c) The left-hand-side and right-hand-side expressions must be of the same type, and the left-hand-side expression must be a variable name.

(d) The right-hand-side type must be convertible to the left-hand side type, and the left-hand-side expression must refer to an lvalue.


Answer: (b). Arrays make answer (c) insufficient.


  1. What can you say about downcasts and upcasts in C++?


(a) Downcasts are always safe, upcasts are sometimes unsafe

(b) Upcasts are always safe, downcasts are sometimes unsafe

(c) Both can be unsafe

(d) Both are safe


Answer: (b). Upcasts are derived-to-base conversions and are always safe. Downcasts (base-to-derived) are ok only if the pointer or reference actually refers to what the cast claims it refers.


  1. A grammar is ambiguous when:


(a) A symbol name appears duplicated

(b) Two different textual inputs parse to the same AST

(c) The same textual input could parse to two different ASTs

(d) There are valid inputs that cannot be parsed into a valid AST


Answer: (c).


  1. In a recursive descent parser:


(a) Each terminal is a function

(b) Each nonterminal is a function

(c) Each terminal is a class

(d) Each nonterminal is a class


Answer: (b). In our compiler (d) is also true, but that's not the case in general.


  1. In an LL(k) grammar, k stands for:


(a) The maximum number of nonterminals that can build a valid production

(b) The maximum number of nonterminals that a parser must look ahead to decide which production to use

(c) The maximum number of terminals that a parser must look ahead to decide which production to use


Answer: (c), by definition.


  1. For a string µ of terminals and non-terminals, FIRST(µ) is:


(a) The set of terminals that begin strings derived from µ, together with ε, if µ can derive ε.

(b) The first k elements in µ, assuming the grammar is LL(k)

(c) The set of terminals that could appear before µ in a well-formed sentence


Answer: (a), easy to guess given that the others are utter bogus :o}.


  1. For an LL(1) grammar, the PREDICT function:


(a) Takes a terminal, returns a nonterminal

(b) Takes a nonterminal, returns a terminal

(c) Takes a nonterminal and a terminal, returns a production

(d) Takes a nonterminal and a terminal, returns a set of productions

(e) Takes a terminal, returns a production


Answer: (c). PREDICT takes a nonterminal (the sentential structure we're looking for, initially the start symbol) and a terminal (the incoming symbol), and returns the production to choose (only one because we're dealing with an LL(1) grammar).


  1. Which part of the build process does yacc automate?


(a) Scanning

(b) Parsing

(c) Semantic analysis

(d) Code generation


Answer: (b).


  1. In a compiler with a classic structure, would the code generator need to invoke the scanner?


(a) No

(b) Yes, but only when macro preprocessing is involved

(c) Always


Answer: (a). The code generator has no business invoking the scanner.


  1. Consider the following routine for scanning integers in our compiler:

Token* Scanner::GetInt()
{
    int integer = 0;

while (isdigit(CurrentCh)) { integer += CurrentCh - '0'; integer *= 10; GetCh(); } return new IntegerToken(integer); }

This routine has a bug, can you spot it?


(a) The subtraction of '0' from CurrentCh is not needed

(b) The integer must be multiplied by CurrentCh – '0' and must be added 10

(c) The multiplication by 10 must be performed before the addition


Answer: (c). All others are visibly fluff, easy to detect with a couple of simple tests.


  1. How would be class inheritance most accurately described?


(a) “Is-a”

(b) “Is substitutable for”

(c) “Contains”

(d) “Extends”


Answer: (b). A derived class must be transparently substitutable for its base class for all code that manipulates base class objects. Anything else is more vague or just wrong.


  1. When scanning a program, the scanner finding an integer will:


(a) Emit the same token for all integers

(b) Emit a distinct token for each integer

(c) Depends on context


Answer: (a). There is one INTEGER token.


  1. An “undefined identifier” error might be caught during:

(a) Scanning

(b) Parsing

(c) Semantic analysis

(d) Space allocation


Answer: (c). Scanning and parsing don't have enough power; Space allocation occurs long after the kin of each identifier is known.


  1. In C and C++, typedef introduces:


(a) A brand new type with the same behavior as the original type

(b) A type that's structurally equivalent with the original type

(c) A type that's a synonim of the original type


Answer: (c). This was a midterm question, too.


  1. When calling a function, the responsibility for clearing up the arguments off the stack after the callee has returned falls on:


(a) The caller

(b) The callee

(c) Depends on the language and implementation


Answer: (c). Traditionally, Pascal lets the callee do it, C has the caller do it.


  1. Can an implementation pass some arguments in registers (as opposed to the stack)?


(a) No, because the registers are destroyed during the issue of the call

(b) No, because that would mess up the stack frame of the function

(c) Yes, because the compiler has enough static information to manage such a convention

(d) No, because the compiler lacks information in the case of separate compilation


Answer: (c). The compiler really has everything at its fingertips (assuming it has fingers).


  1. In a program, in general variables should:


(a) Be as local as possible

(b) Be as global as possible

(c) Have a lifetime as long as possible


Answer: (a).


  1. What is the major problem with passing arguments by reference?


(a) Simple/standard functions are hard to implement

(b) Aliasing

(c) Inefficiency


Answer: (b).


Hopefully you are still in a good mood as you reached this point... Happy Holidays!