This is a bit of a long story, but fortunately it took me only a few minutes to debug, and it may help you understand the slightly shaky moments in lecture 6. At the end, I'll discuss what I was _actually_ trying to teach with the loop.c example. Feel free to respond with questions.

The file loop.c looked like this:

    #include <stdio.h>
    #include <unistd.h>
    int main(int argc, char**argv) {
      for(;;) { // an infinite loop
        sleep(1);
        printf("%s\n",argv[0]); // do the printing
      }
      return 0;
    }

Our sed substitution was supposed to replace the // through the end of the line with /* through the end of the line */. In other words, switch the style of the comments. That is easy enough:

    sed -e 's+//\(.*\)$+/* \1 */+g' loop.c

We spent time in class explaining why this crazy sequence of characters should do what we want. But no matter how hard I tried, I kept getting this output:

    #include <stdio.h>
    #include <unistd.h>
    int main(int argc, char**argv) {
     */or(;;) { /*  an infinite loop
        sleep(1);
     */ printf("%s\n",argv[0]); /*  do the printing
      }
      return 0;
    }

And now I understand why. I wrote loop.c on a Windows machine, and my emacs there is set to end each line with a "carriage return" (\r) and then a newline (\n), as is the norm on Windows/DOS systems. Recall that the default on Linux is just the newline (\n). Little did I realize that our discussion about this today was the source of my problem!

Now a digression on what these characters 'mean'. They are anachronisms from typewriters -- carriage return moves the 'carriage' (the thing that types) all the way to the left _without_ moving it down one row. So it's like Windows thinks \n is 'just move down a row' (so the \r is needed to also get back 'to the left') whereas Linux thinks \n is 'move down a row and back to the left'.

Now back to loop.c and sed. It read in the line with the for loop, which, because loop.c had Windows-style line endings, had a \r at the end. attu's sed _does_ read that \r in as part of the line, since it is not the \n that gets skipped.
So the line initially read was:

      for(;;) { // an infinite loop\r

The whole pattern matched

    // an infinite loop\r

with \(.*\) capturing everything after the //, including the leading space and the \r. So after sed's substitution we have

      for(;;) { /*  an infinite loop\r */

Hmm, that looks scary -- a \r in the middle of the line. So the terminal, when printing this line, moved back to the left with three characters still to print (a space, a *, and a /) and happily printed them over the start of the line:

     */or(;;) { /*  an infinite loop

Sure enough, running dos2unix on loop.c fixed the problem, as did

    sed -e 's+//\(.*\)$+/* \1 */+g' -e 's/\r//' loop.c

which just adds a second substitution that takes out the first \r it finds on each line (this relies on sed understanding \r in a pattern, as GNU sed does).

Now, what I never got a chance to say, because I couldn't get the script working, is that this replacement is broken if your file has a line with some "other" kind of //, such as within a string, like:

    printf("here are two slashes // -- neat huh");

since sed would substitute to make

    printf("here are two slashes /*  -- neat huh"); */

You could say not to do the substitution if there is a " between the // and the end of the line (replace the .* with [^"]*), but that is wrong for

    // this is a "comment" with quotes

Things like this come up often and can be difficult and/or mathematically impossible to solve with regular expressions. So sed substitutions are often "usually" right but can be "not what you meant" for unexpected lines of input.
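
If you want to poke at both problems without touching loop.c, here is a small sketch (not something we ran in lecture). It assumes a POSIX shell with the standard od and tr tools; the variable names subst, crlf_line, broken, fixed, and in_string are just ones I made up for illustration. It feeds sed a single line with a \r on the end, makes the \r visible, tries one way of stripping it first, and then runs the same substitution on the string-literal line.

```shell
#!/bin/sh
# The comment-style substitution from the message, kept in one place.
subst='s+//\(.*\)$+/* \1 */+g'

# One line of loop.c as a Windows editor would save it: the \r sits just
# before the newline. (Command substitution strips only the trailing \n.)
crlf_line=$(printf '  for(;;) { // an infinite loop\r')

# 1. Reproduce the problem. od -c prints the \r explicitly instead of
#    letting the terminal act on it.
broken=$(printf '%s\n' "$crlf_line" | sed -e "$subst")
printf '%s\n' "$broken" | od -c

# 2. Drop carriage returns before substituting. tr -d is POSIX, so this
#    does not rely on sed understanding \r.
fixed=$(printf '%s\n' "$crlf_line" | tr -d '\r' | sed -e "$subst")
printf '%s\n' "$fixed"

# 3. The string-literal case: the // inside the quotes is not a comment,
#    but the regular expression cannot tell the difference.
in_string=$(printf '%s\n' 'printf("two slashes // -- neat huh");' | sed -e "$subst")
printf '%s\n' "$in_string"
```

In the od -c dump, the \r should show up just before the closing " */", which is exactly the piece the terminal was drawing over the start of the line. Step 3 still produces the wrong answer on purpose: stripping \r fixes the display problem but does nothing for the // inside a string.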