Due: Thursday, April 6 by 11 pm. Please
use Gradescope (gradescope.com
,
also linked from the CSE 401/M501 resources web page) to submit
your homework online. Gradescope's web site has several
instructional videos and help pages to outline the process,
and this guide has
specific information about scanning and uploading pdf files
containing assignments (although this guide is now somewhat old, so
you may have better methods of creating pdf files on your phone using
more recent software).
For questions that ask for regular expressions, you must restrict yourself
to the basic operations and abbreviations outlined in the lecture slides: concatenation,
|
, *
, +
, ?
,
and character classes [...]
.
You may also specify character classes containing
all characters except for specific characters, e.g., [^abc]
.
You can also use abbreviations and be somewhat informal when the meaning of the full regular expression
it abbreviates is clear and unambiguous,
e.g., allexceptx = a|b|c|...|w|y|z
.
Note that you cannot use something like "allexceptx" in a solution without first defining
it as a name for a particular regular expression, as in the above example.
You may not use additional "regular expression" operations found in libraries
or languages like perl, python, java, or ruby, or in unix tools like grep and
sed. In particular, you cannot use not(
re)
as
a a regular expression that matches all other regular expressions except for re.
When drawing DFAs, you do not need to include an explicit "error state" or show transitions to this state for situations where the DFA rejects its input. We will informally assume that if there is no transition defined from a state on a particular input character then it corresponds to a transition to an error state, and diagrams will be simpler if we do not include this extra state and its many, many incoming transitions.
You should do this homework individually.
a
|xy
)*b
(oz
)+o
0
)1
)*(a) Write a regular expression that generates C integer constants as described above. (Of course you can break the solution down into a regular expression with several named parts if it makes things easier to write and read.)An integer constant consisting of a sequence of digits is taken to be octal if it begins with
0
(digit zero), decimal otherwise. Octal constants do not contain the digits8
or9
. A sequence of digits preceded by0x
or0X
(digit zero) is taken to be a hexadecimal integer. The hexadecimal digits includea
orA
throughf
orF
with values 10 through 15.An integer constant may be suffixed with the letter
u
orU
, to specify that it is unsigned. It may also be suffixed by the letterl
orL
to specify that it is long.
/* ... */
. *
and /
. Also, remember that
in C comments do not nest (i.e., a */
marks the end of a comment
no matter how many times /*
appears before it.)...
between /*
and */
.
While there are many versions of this problem on the web and elsewhere,
they can differ in various subtle ways so you want to reason
carefully about the strings that make up proper comments according to this specification.
Also, as with the previous problem, you may find it helpful to alternate between the
regular expressions and DFA as you work on the problem.