To that end, here are some tricks that I hope will help you remember this. First, the actual reduction direction:
Suppose A_{TM} is a problem that's known to be undecidable, and you want to prove that EMPTY_{TM} is undecidable using mapping reducibility. Then you need to be able to take an instance of A_{TM} and create an equivalent instance of EMPTY_{TM}.
Using Turing reducibility, you need to make a TM that takes an instance of A_{TM} and solves it by (possibly more than once) asking an oracle to solve instances of EMPTY_{TM}, which the oracle will do in finite time.
Notice how, if someone gave you a subroutine for EMPTY_{TM}, you could use it to solve A_{TM}; this is what demonstrates that EMPTY_{TM} is at least as hard as A_{TM}.
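To make this concrete, here's a minimal sketch of the classic reduction, with TMs modeled as Python predicates rather than machine descriptions (the function names build_m_prime and decide_atm are mine, and the hypothetical oracle is just a stand-in). Note that the map technically targets the complement of EMPTY_{TM}: <M, w> is in A_{TM} exactly when L(M') is NOT empty.

```python
def build_m_prime(M, w):
    """Map an A_TM instance <M, w> to a machine M' whose language
    is non-empty exactly when M accepts w."""
    def m_prime(x):
        # M' ignores its own input x and simulates M on w.
        return M(w)
    return m_prime

def decide_atm(M, w, is_empty):
    """The Turing-reducibility view: given a (hypothetical) oracle
    `is_empty` for EMPTY_TM, we could decide A_TM with one query."""
    return not is_empty(build_m_prime(M, w))
```

Of course, no such `is_empty` oracle exists - that's the whole point - but the shape of the reduction is exactly this.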
Now, suppose I told you that I could take instances of EVEN and produce equivalent instances of A_{TM}. Well, duh. All I do is solve EVEN, and if the number's even I return a TM that accepts everything, and otherwise I return a TM that rejects everything (or just immediately goes into a while(1) loop). This says nothing about A_{TM} (well, technically it requires that A_{TM} is neither the empty nor the full language, but it says nothing about it being undecidable).
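The trivial map just described can be sketched in a few lines (again with TMs modeled as Python predicates; the name even_to_atm and the choice of empty-string input are mine):

```python
def even_to_atm(n):
    """The trivial (and uninformative) map from EVEN to A_TM:
    decide EVEN ourselves, then emit a canned instance <M, w>."""
    if n % 2 == 0:
        M = lambda x: True    # accepts everything: <M, w> is a yes-instance
    else:
        M = lambda x: False   # rejects everything: <M, w> is a no-instance
    return M, ""              # pair the machine with an arbitrary input w
```

Notice that all the real work happens inside the mapping itself, which is exactly why the map tells us nothing about the hardness of A_{TM}.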
Now, suppose I told you that I could take instances of A_{TM} and produce equivalent instances of EVEN. Well, that would be really impressive, because we actually know how to solve EVEN. So, this would mean that A_{TM} _is_ decidable. Alternatively, if we assume that A_{TM} is not decidable, it would mean that EVEN is actually much harder than we thought (ouch - that's got to hurt for the architecture folks).
Phew, hope that helps...
Why? Consider the Turing machine (or algorithm) that computed the mapping from
A_{TM} to EMPTY_{2-DFA}. Once it makes the 2-DFA, its job is done. It's made
an instance of the EMPTY_{2-DFA} problem, and now it's all up to the 2-DFA to
either have an empty or non-empty language. The Turing machine can't fudge things
later by ensuring that the 2-DFA doesn't get input it can't deal with (cheating!).
And there's nothing that's validating the (alleged) computational histories to
make sure they conform to the format you picked... Nothing, that is, except for
the 2-DFA itself. If the 2-DFA accepts malformed histories, then that's it - its
language is non-empty.
(note: it is true that, in this case, the 2-DFA will presumably loop forever if
the # mark isn't there, and therefore it won't actually accept. So I think this
would work. But you have to point out that the 2-DFA is doing the right thing -
that really, it *is* validating its input, but just not explicitly)
Similarly, for those who did a reduction from a 2-headed FA, the 2-DFA *must* ensure that the columns and rows are consistent (since the columns and rows represent the positions of the two heads). Otherwise, nothing else is going to do the 2-DFA a favor and validate its input for it. And, in this case, I have no idea what the machines would actually do.
A final example is the intersecting CFGs in problem #2. Some people felt that they could write out the CFGs, and then a Turing machine testing for intersection of the CFGs would be able to apply the appropriate semantics of the dominoes (e.g. ab/a + a/ba = aba/aba). Again, this is invalid. Once your reduction algorithm says what the two CFGs are, its job is done. It's now all up to the CFGs - unaided - to either share a string or not share a string.
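The "domino semantics" people wanted a helper TM to apply is easy to state as code - which is exactly the computation that nobody is around to perform once the grammars are fixed. A sketch (the function name is mine):

```python
def is_pcp_match(dominoes):
    """A sequence of (top, bottom) dominoes is a PCP match iff the
    concatenated tops equal the concatenated bottoms."""
    top = "".join(t for t, _ in dominoes)
    bottom = "".join(b for _, b in dominoes)
    return top == bottom
```

For instance, [("ab", "a"), ("a", "ba")] concatenates to aba over aba - a match. The flawed reductions implicitly assumed some machine would run this check after the fact; none does.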
Now I'd like to show why you're not allowed to do this kind of thing from another angle. Suppose I could. I will now show (1) that the _acceptance_ problem for regular, vanilla DFAs is undecidable and (2) that the intersection problem for _regular_ grammars is undecidable. Of course, both proofs are wrong (and both languages are decidable). Both proofs are intentionally silly, to show what would happen if you took these things to their logical extreme.
The fallacy here, of course, is that the mapping algorithm doesn't _get_ to decide what strings the DFA sees. Similarly, there's no "input formatter" that helps the DFA out either.
Now we need to see if the regular grammars contain a common string. To do this, we simply generate all possible strings that could come out of the regular grammars, using nondeterminism. Once we've generated a particular string out of G_1 and one from G_2, we compare them.
For example, one string we'll generate for G_1 is ab/a a/ba. A string from G_2 is a/a b/b a/a. Clearly these strings have equivalent tops and bottoms, and so the grammars have a string in common. We also note that this obviously corresponds to a solution to the PCP problem.
...and a more formal iff argument would follow - if this proof weren't flawed.
The flaw in the argument is, once again, that there are no TMs checking the strings - and doing extra computations that regexes can't. Once you make the regular grammars G_1 and G_2, it's all up to them to be intersecting or not. No more TMs are allowed to help them, and no TMs are allowed to only generate strings that they can deal with.
Another tricky issue that I hope didn't trip you up too much: in the regular grammar above, ab/a, a/ba, a/a and b/b are _atomic_ symbols. Regular grammars have no notion of symbols having tops and bottoms, of course. This is why I had to have a TM doing funky things in order to make it work - regexes aren't really that powerful.
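The gap between the atomic-symbol view and the domino semantics can be seen by comparing two notions of "the same string" (a sketch; the whitespace tokenization and function names are mine):

```python
def literally_equal(s1, s2):
    """What equality of grammar-generated strings actually means:
    the sequences of atomic symbols like 'ab/a' must match exactly."""
    return s1.split() == s2.split()

def semantically_equal(s1, s2):
    """The 'funky TM' check: split each atomic symbol into its top
    and bottom halves, then compare the concatenations."""
    def tops_and_bottoms(s):
        pairs = [sym.split("/") for sym in s.split()]
        return "".join(t for t, _ in pairs), "".join(b for _, b in pairs)
    return tops_and_bottoms(s1) == tops_and_bottoms(s2)
```

On the example above, "ab/a a/ba" and "a/a b/b a/a" are literally different strings of atomic symbols, yet semantically equal (both read aba over aba). Only the second notion needs a TM; the grammars themselves only ever get the first.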
In fact, the logic of proofs using computational histories applies to computational models other than Turing machines. For example, we've already proven that the SMM model can do precisely the same things as Turing machines can. So, you could (if you were masochistic) define a computational history syntax for SMMs, and use that for your proof.
Practical issue: of course, using SMM computational histories is likely to lead to a really messy proof. Also, it's probably a good idea to stick with the computational histories like in lecture or in the book, because then you don't have to prove their generality or explain as much.