CSE 401 Winter 2013 Homework 1 - Languages, Grammars, and Regular Expressions

Due: Friday, January 18 by 11:59 pm.

Please use the dropbox linked from the CSE401 web page to submit your homework online. Any common format like pdf or doc/docx is fine, or you can submit a scanned copy of your work as long as it is legible. We suggest you show your work to help us award partial credit if appropriate, and for TA sanity.

For questions that ask for regular expressions, you should restrict yourself to the basic operations and abbreviations outlined in the slides: concatenation, |, *, +, ?, and character classes [...] or character classes containing all characters except for specific characters, e.g., [^abc]. You can also use abbreviations and be somewhat informal when the meaning is clear, e.g., allexceptx = a | b | c | ... | w | y | z. You should not use additional regular expression operators found in languages like perl, python, or ruby, or in unix tools like grep and sed. In particular, you cannot use not(re) as a a regular expression that matches all other regular expressions except for re.

Extra credit is completely optional, and should only be attempted after all other exercises are completed.

You should do this homework individually.

  1. This exercise asks you to work with the simple program grammar from Lecture 2. For reference, you can find it on slide 17 of this slide set.
    1. Expand the grammar to include subtraction, multiplication, and division. You only need to show any new or changed productons. We will assume that all of the others remain the same.
    2. Using your expanded grammar, show a derivation of a = 0; if (b) a = b*b; if (c) a = a - c*c;.
    3. Extra Credit: How would you modify your grammar to forbid division by constant 0? E.g., a = b / 0; would not be derivable. Note: you don't need to forbid division by variables that happen to be equal to zero. A grammar alone can not do that (except by completely removing 0 - and any arithmetic that could generate 0 - from the language, which is not what we are looking for here).

  2. For each of the following regular expressions, (i) give an example of two strings that can be generated by the regular expression and two that use the same alphabet but cannot be generated, and (ii) give an English description of the set of strings generated (for example, "all strings consisting of the word 'cow' followed by 1 or more 'x's and 'o's in any order"), not just a transliteration of the regular expression operations into English.
    1. (oink )*oink
    2. 1+0*1*

  3. This exercise deals with the regular expression 10*.
    1. Describe the strings generated by this RE.
    2. Construct an NFA that recognizes this RE.
    3. Convert your NFA to a DFA.

  4. Give regular expressions that generate the following sets of strings.
    1. All decimal numbers with either 1 or 2 digits after the decimal point. You should not allow extra leading 0s (e.g., 0.1 is okay, but 05.23 is not).
    2. All binary strings with at least three 1s.
    3. All strings of one or more 3-letter "words" separated by spaces (where a "word" is defined as any sequence of lower case letters).  

  5. A comment in C is a sequence of characters /* ... */. Write a set of regular expressions that generate C-style comments. You can restrict the alphabet to lower-case letters, digits, spaces, newlines (\n), carriage returns (\r) and the characters * and /. Also, remember that in C comments do not nest (i.e., a */ marks the end of a comment no matter how many times /* appears before it.)

    Hint: be careful about what is included in the ... between /* and */.