To that end, here are some tricks that I hope will help you remember this. First, the actual reduction direction:
Suppose A_{TM} is a problem that's known to be undecidable, and you want to prove that EMPTY_{TM} is undecidable using mapping reducibility. Then you need to be able to take an instance of A_{TM} and create an equivalent instance of EMPTY_{TM}.
Using Turing reducibility, you need to make a TM that takes an instance of A_{TM} and solves it by (possibly more than once) asking an oracle to solve instances of EMPTY_{TM}, which the oracle will do in finite time.
Notice how, if someone gave you a subroutine for EMPTY_{TM}, you could use it to solve A_{TM}; this is what demonstrates that EMPTY_{TM} is at least as hard as A_{TM}.
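To make this concrete, here's a minimal sketch of the classic reduction, with TMs modeled as Python predicates rather than machine descriptions (the function names build_m_prime and decide_atm are mine, and the hypothetical oracle is just a stand-in). Note that the map technically targets the complement of EMPTY_{TM}: <M, w> is in A_{TM} exactly when L(M') is NOT empty.

```python
def build_m_prime(M, w):
    """Map an A_TM instance <M, w> to a machine M' whose language
    is non-empty exactly when M accepts w."""
    def m_prime(x):
        # M' ignores its own input x and simulates M on w.
        return M(w)
    return m_prime

def decide_atm(M, w, is_empty):
    """The Turing-reducibility view: given a (hypothetical) oracle
    `is_empty` for EMPTY_TM, we could decide A_TM with one query."""
    return not is_empty(build_m_prime(M, w))
```

Of course, no such `is_empty` oracle exists - that's the whole point - but the shape of the reduction is exactly this.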
Now, suppose I told you that I could take instances of EVEN and produce equivalent instances of A_{TM}. Well, duh. All I do is solve EVEN, and if the number's even I return a TM that accepts everything, and otherwise I return a TM that rejects everything (or just immediately goes into a while(1) loop). This says nothing about A_{TM} (well, technically it requires that A_{TM} is neither the empty nor the full language, but it says nothing about it being undecidable).
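The trivial map just described can be sketched in a few lines (again with TMs modeled as Python predicates; the name even_to_atm and the choice of empty-string input are mine):

```python
def even_to_atm(n):
    """The trivial (and uninformative) map from EVEN to A_TM:
    decide EVEN ourselves, then emit a canned instance <M, w>."""
    if n % 2 == 0:
        M = lambda x: True    # accepts everything: <M, w> is a yes-instance
    else:
        M = lambda x: False   # rejects everything: <M, w> is a no-instance
    return M, ""              # pair the machine with an arbitrary input w
```

Notice that all the real work happens inside the mapping itself, which is exactly why the map tells us nothing about the hardness of A_{TM}.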
Now, suppose I told you that I could take instances of A_{TM} and produce equivalent instances of EVEN. Well, that would be really impressive, because we actually know how to solve EVEN. So, this would mean that A_{TM} _is_ decidable. Alternatively, if we assume that A_{TM} is not decidable, it would mean that EVEN is actually much harder than we thought (ouch - that's got to hurt for the architecture folks).
Phew, hope that helps...
Why? Consider the Turing machine (or algorithm) that computed the mapping from
A_{TM} to EMPTY_{2-DFA}. Once it makes the 2-DFA, its job is done. It's made
an instance of the EMPTY_{2-DFA} problem, and now it's all up to the 2-DFA to
either have an empty or non-empty language. The Turing machine can't fudge things
later by ensuring that the 2-DFA doesn't get input it can't deal with (cheating!).
And there's nothing that's validating the (alleged) computational histories to
make sure they conform to the format you picked... Nothing, that is, except for
the 2-DFA itself. If the 2-DFA accepts malformed histories, then that's it - its
language is non-empty.
(note: it is true that, in this case, the 2-DFA will presumably loop forever if
the # mark isn't there, and therefore it won't actually accept. So I think this
would work. But you have to point out that the 2-DFA is doing the right thing -
that really, it *is* validating its input, but just not explicitly)
Similarly, for those who did a reduction from a 2-headed FA, the 2-DFA *must* ensure that the columns and rows are consistent (since the columns and rows represent the positions of the two heads). Otherwise, nothing else is going to do the 2-DFA a favor and validate its input for it. And, in this case, I have no idea what the machines would actually do.
A final example is the intersecting CFGs in problem #2. Some people felt that they could write out the CFGs, and then a Turing machine testing for intersection of the CFGs would be able to apply the appropriate semantics of the dominoes (e.g. ab/a + a/ba = aba/aba). Again, this is invalid. Once your reduction algorithm says what the two CFGs are, its job is done. It's now all up to the CFGs - unaided - to either share a string or not share a string.
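The "domino semantics" people wanted a helper TM to apply is easy to state as code - which is exactly the computation that nobody is around to perform once the grammars are fixed. A sketch (the function name is mine):

```python
def is_pcp_match(dominoes):
    """A sequence of (top, bottom) dominoes is a PCP match iff the
    concatenated tops equal the concatenated bottoms."""
    top = "".join(t for t, _ in dominoes)
    bottom = "".join(b for _, b in dominoes)
    return top == bottom
```

For instance, [("ab", "a"), ("a", "ba")] concatenates to aba over aba - a match. The flawed reductions implicitly assumed some machine would run this check after the fact; none does.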
Now I'd like to show why you're not allowed to do this kind of thing from another angle. Suppose I could. I will now show (1) that the _acceptance_ problem for regular, vanilla DFAs is undecidable and (2) that the intersection problem for _regular_ grammars is undecidable. Of course, both proofs are wrong (and both languages are decidable). Both proofs are intentionally silly, to show what would happen if you took these things to their logical extreme.
The fallacy here, of course, is that the mapping algorithm doesn't _get_ to decide what strings the DFA sees. Similarly, there's no "input formatter" that helps the DFA out either.
Now we need to see if the regular grammars contain a common string. To do this, we simply generate all possible strings that could come out of the regular grammars, using nondeterminism. Once we've generated a particular string out of G_1 and one from G_2, we compare them.
For example, one string we'll generate for G_1 is ab/a a/ba. A string from G_2 is a/a b/b a/a. Clearly these strings have equivalent tops and bottoms, and so the grammars have a string in common. We also note that this obviously corresponds to a solution to the PCP problem.
...and a more formal iff argument would follow - if this proof weren't flawed.
The flaw in the argument is, once again, that there are no TMs checking the strings - and doing extra computations that regexes can't. Once you make the regular grammars G_1 and G_2, it's all up to them to be intersecting or not. No more TMs are allowed to help them, and no TMs are allowed to only generate strings that they can deal with.
Another tricky issue that I hope didn't trip you up too much: in the regular grammar above, ab/a, a/ba, a/a and b/b are _atomic_ symbols. Regular grammars have no notion of symbols having tops and bottoms, of course. This is why I had to have a TM doing funky things in order to make it work - regexes aren't really that powerful.
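The gap between the atomic-symbol view and the domino semantics can be seen by comparing two notions of "the same string" (a sketch; the whitespace tokenization and function names are mine):

```python
def literally_equal(s1, s2):
    """What equality of grammar-generated strings actually means:
    the sequences of atomic symbols like 'ab/a' must match exactly."""
    return s1.split() == s2.split()

def semantically_equal(s1, s2):
    """The 'funky TM' check: split each atomic symbol into its top
    and bottom halves, then compare the concatenations."""
    def tops_and_bottoms(s):
        pairs = [sym.split("/") for sym in s.split()]
        return "".join(t for t, _ in pairs), "".join(b for _, b in pairs)
    return tops_and_bottoms(s1) == tops_and_bottoms(s2)
```

On the example above, "ab/a a/ba" and "a/a b/b a/a" are literally different strings of atomic symbols, yet semantically equal (both read aba over aba). Only the second notion needs a TM; the grammars themselves only ever get the first.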
In fact, the logic of proofs using computational histories applies to computational models other than Turing machines. For example, we've already proven that the SMM model can do precisely the same things as Turing machines can. So, you could (if you were masochistic) define a computational history syntax for SMMs, and use that for your proof.
Practical issue: of course, using SMM computational histories is likely to lead to a really messy proof. Also, it's probably a good idea to stick with the computational histories like in lecture or in the book, because then you don't have to prove their generality or explain as much.