Shell

The terminal is a text-based user interface for interacting with your computer. Inside the terminal runs the shell, whose prompt is often shown as $. The shell is a program that lets the user interact with the operating system and its applications.

The default way we’ll access Linux is by using ssh, which lets you remotely log in to other computers such as attu. Rather than having to physically go to them, we can use ssh to connect to them remotely.

Basic Shell Commands

command | description
pwd     | Print current working directory
cd      | Change working directory
ls      | List files in working directory
man     | Bring up manual for a command
exit    | Log out of shell

Flags and Arguments

Flags start with - and change a program’s behavior slightly. For example,

ls -l

-l is a flag for ls. In this case, the -l flag for ls lists the content of the current directory in “long listing format”.

Commands can also take arguments. For example,

ls dir1

dir1 is an argument. In this case, this lists the contents of the directory named dir1.

Commands can take multiple arguments and flags. For example,

ls -a -l dir1 dir2

lists the contents of dir1 and dir2 in long listing format, showing hidden files (-a).

Some programs, like ls, let you combine flags into one -:

ls -al dir1 dir2
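As a quick check that the separate and combined spellings behave the same, the sketch below runs both forms on a scratch directory and compares the listings. The scratch directory and the file names (visible.txt, .hidden) are made up for illustration, and mktemp and touch are helpers not covered in this section.

```shell
# Scratch directory so the demo does not touch real files (names are made up).
demo=$(mktemp -d)
mkdir "$demo/dir1"
touch "$demo/dir1/visible.txt" "$demo/dir1/.hidden"

separate=$(ls -a -l "$demo/dir1")   # two separate flags
combined=$(ls -al "$demo/dir1")     # the same flags sharing one -

if [ "$separate" = "$combined" ]; then
    echo "ls -a -l and ls -al printed the same listing"
fi

rm -r "$demo"
```

Because of -a, both listings include the hidden file .hidden.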

man pages (documentation)

Documentation for Linux is built-in and can be accessed using the man (manual) command.

For example,

man ls

provides the documentation for ls.

Some helpful notes for reading man pages:

  • most man pages have specific components:
    • a summary or synopsis that shows the structure of the command and important flags and arguments
    • a longer description that explains each flag and argument
    • examples of how to use the command in common use-cases
  • you can search for words with /: e.g. typing /reverse searches for "reverse" exactly
  • if you type in h, you can see all the commands you can use to navigate a man page

The synopsis has some important syntax. For example, man ls has:

SYNOPSIS
       ls [OPTION]... [FILE]...
  • items within [] are optional; in this case, this means that ls can optionally take an OPTION or a FILE, but neither is required.
  • the ... means that the preceding item can be repeated; in this case, ls can take multiple options and multiple files

Compare this synopsis with the next few commands; how are they the same, and how are they different?

Directory Commands

command | description
ls      | List files in working directory
pwd     | Print current working directory
cd      | Change working directory
mkdir   | Make a new directory
rmdir   | Remove the given directory (must be empty)

Relative Directories

directory | description
.         | References the working directory
..        | References the parent of the working directory
~         | Refers to the home directory
~/Desktop | Your desktop
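To see how the directory commands and the relative names . and .. fit together, here is a minimal sketch that walks into a new directory and back out again. It assumes a scratch directory created with mktemp (not covered above), and the directory name notes is made up.

```shell
orig=$(pwd)            # remember where we started
demo=$(mktemp -d)      # scratch directory; the name "notes" below is made up

cd "$demo"
mkdir notes
cd notes
inside=$(pwd)          # path ends in /notes
cd ..                  # .. is the parent, so we are back in the scratch directory
parent=$(pwd)
cd .                   # . is the working directory, so this goes nowhere
same=$(pwd)
rmdir notes            # works because notes is empty

cd "$orig"
rm -r "$demo"
echo "$inside"
echo "$parent"
```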

File Examination Commands

command | description | example(s)
cat  | “Print” out files to the console | cat file.txt
less | Browse a file, with search, scroll, and other features | less file.txt
more | An alternative to less with different keybinds and features | more file.txt
head | “Print” out the first 10 lines of a file; use flags to change this behaviour | head file.txt, head -n 5 file.txt
tail | “Print” out the last 10 lines of a file; use flags to change this behaviour | tail file.txt, tail -n 5 file.txt
wc   | “Print” out the number of lines, words, and characters in a file; use flags to change this behaviour | wc file.txt, wc -l file.txt

Searching and Sorting Commands

command | description | example(s)
grep | “Print” out the lines of the input file(s) that contain a specific string. | grep "berry" fruits.txt veggies.txt shows the lines in both fruits.txt and veggies.txt that contain the string "berry".
sort | “Print” out the contents of the input file, but with lines sorted lexicographically. | sort file.txt
uniq | “Print” out the contents of the input file, but remove (adjacent) repeated lines. Often used with sort. | uniq file.txt
find | Searches the filesystem for file(s) that match a pattern. | find -name "*.java" finds all files in the current directory and its subdirectories that end in .java. Note that this is more powerful than ls *.java!

All of these commands have many options. You’ll have to look at the man pages for sort, uniq, and find for your homework!

Compiling and Running Java programs

  • javac HelloWorld.java compiles the contents in HelloWorld.java (can replace this with any other .java file)
  • java HelloWorld runs the compiled HelloWorld class (the HelloWorld.class file that javac produced)
  • java HelloWorld.java compiles and runs HelloWorld.java in one step (Java 11 and later)

Standard Streams

Processes (~ programs) in Unix have 3 standard streams, which are “abstract” locations that tell a process where to read input from and write output to. They are:

stream | Java equivalent | description
Standard Input (stdin)   | System.in  | where the program gets user input; defaults to terminal input
Standard Output (stdout) | System.out | where the program sends “normal” output; defaults to printing to terminal
Standard Error (stderr)  | System.err | where the program sends error output; defaults to printing to terminal

Many commands will default to using stdin for input when some arguments aren’t provided; for example, try running grep "a" and then typing many sentences into your terminal.

As a programmer, you don’t have to worry about exactly how these work; the shell and operating system manage them for you! However, we’ll often “redirect” these streams elsewhere.

Input and Output Redirection

Standard Output Redirection

The > operator allows you to execute a command and redirect its standard output to the given file, instead of printing it to the console.

For example,

grep "berry" fruits.txt > berries.txt

finds all lines which contain "berry" in fruits.txt, and writes them to berries.txt instead of printing them to the console.

The > operator overwrites the file. In contrast, the >> operator appends to a file.

grep "berry" fruits.txt >> berries.txt

The left-hand side of > and >> should be a command. The right-hand side of > and >> should be the name of a file.
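The difference between overwriting and appending can be checked by counting lines after each step. This is a minimal sketch on a scratch copy of fruits.txt; the file contents are made up, and printf and mktemp are helpers not covered above.

```shell
demo=$(mktemp -d)   # scratch directory; file contents are made up
printf 'strawberry\napple\nblueberry\n' > "$demo/fruits.txt"

grep "berry" "$demo/fruits.txt" >  "$demo/berries.txt"   # creates the file: 2 lines
grep "berry" "$demo/fruits.txt" >  "$demo/berries.txt"   # overwrites it: still 2 lines
after_overwrite=$(grep -c "berry" "$demo/berries.txt")

grep "berry" "$demo/fruits.txt" >> "$demo/berries.txt"   # appends: now 4 lines
after_append=$(grep -c "berry" "$demo/berries.txt")

echo "$after_overwrite then $after_append"
rm -r "$demo"
```

Here grep -c counts matching lines instead of printing them, which makes it a convenient way to measure the file.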

Standard Input Redirection

The < operator allows you to use the contents of a file as the contents of standard input, instead of what the user types in to the console.

For example, the following command finds lines containing "berry" within fruits.txt:

grep "berry" < fruits.txt

This looks similar to using grep "berry" fruits.txt, but there is a slight nuance; when using <, grep is now reading from standard input (not directly from a file). This difference becomes more important in later sections!

The left-hand side of < should be a command. The right-hand side of < should be the name of a file.
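One visible consequence of that nuance shows up with wc: given a file argument it knows the file's name, but given redirected standard input it sees only a stream. The sketch below assumes a scratch fruits.txt with made-up contents.

```shell
demo=$(mktemp -d)   # scratch directory; file contents are made up
printf 'strawberry\napple\nblueberry\n' > "$demo/fruits.txt"

as_argument=$(wc -l "$demo/fruits.txt")    # wc opened the file, so it prints the name
as_stdin=$(wc -l < "$demo/fruits.txt")     # wc only saw a stream, so there is no name to print

echo "argument: $as_argument"
echo "stdin:    $as_stdin"
rm -r "$demo"
```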

Standard Error Redirection

The 2> operator allows you to execute a command and redirect its standard error to the given file, instead of printing it to the console.

For example,

javac HasSomeErrors.java 2> errors.txt

would compile HasSomeErrors.java, and write any error output to errors.txt instead of printing it to the console.

Note that > and 2> can point to different files; this is helpful in splitting up logs and debugging.

javac *.java > output.txt 2> errors.txt

The left-hand side of 2> should be a command. The right-hand side of 2> should be the name of a file.
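If javac is not handy, the same split can be seen with ls, which prints files it finds to standard output and complaints about missing files to standard error. This is a sketch with made-up file names (real.txt, missing.txt) in a scratch directory.

```shell
demo=$(mktemp -d)   # scratch directory; file names are made up
touch "$demo/real.txt"

# ls lists real.txt on stdout and complains about missing.txt on stderr.
# The trailing "|| true" only keeps that failure from stopping a script.
ls "$demo/real.txt" "$demo/missing.txt" > "$demo/output.txt" 2> "$demo/errors.txt" || true

out=$(cat "$demo/output.txt")
err=$(cat "$demo/errors.txt")
echo "stdout file: $out"
echo "stderr file: $err"
rm -r "$demo"
```

Nothing from ls itself reaches the console; each stream ends up in its own file.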

Pipes

The | operator is called a pipe. You use pipes to “link” two commands together. Consider:

command1 | command2

In order, the |:

  1. executes command1
  2. then, executes command2, using the standard output of command1 as the standard input to command2

Conceptually, it is shorthand for the following sequence of commands:

  1. command1 > filename
  2. command2 < filename
  3. rm filename
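To make the equivalence concrete, the sketch below removes duplicate lines from a file twice: once with a pipe, and once with an explicit temporary file. The file contents and the name sorted.tmp are made up for illustration.

```shell
demo=$(mktemp -d)   # scratch directory; file contents are made up
printf 'pear\napple\npear\nfig\napple\n' > "$demo/fruits.txt"

# With a pipe: the standard output of sort becomes the standard input of uniq.
piped=$(sort "$demo/fruits.txt" | uniq)

# The same thing spelled out with a temporary file.
sort "$demo/fruits.txt" > "$demo/sorted.tmp"
by_hand=$(uniq < "$demo/sorted.tmp")
rm "$demo/sorted.tmp"

echo "$piped"
rm -r "$demo"
```

Both versions print apple, fig, and pear on separate lines; the pipe just avoids naming and cleaning up the intermediate file.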

More command line operators

And operator

The and operator (double ampersand) is put between two commands, e.g. command1 && command2:

  • if command1 succeeds, it then runs command2
  • if command1 fails (e.g. when running javac CompilerErrors.java and getting a compilation error), then command2 is not run
  • useful when command2 depends on command1 succeeding

(this behaviour comes from short-circuiting in Boolean expressions)

Or operator

The or operator (double pipe) is put between two commands, e.g. command1 || command2:

  • if command1 succeeds, then command2 is not run
  • if command1 fails (e.g. when running javac CompilerErrors.java and getting a compilation error), it then runs command2
  • useful when command2 is a fallback for command1

(this behaviour comes from short-circuiting in Boolean expressions)

Then operator

The then operator (semicolon) is put between two commands, e.g. command1 ; command2. It runs command1 and then command2, regardless of whether or not command1 succeeded or failed.
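The three operators can be compared side by side using true (a command that always succeeds) and false (a command that always fails). This is a minimal sketch; the messages are made up, and the output is collected into a variable only so it can be printed in one go.

```shell
log=$(
    true  && echo "and: ran after success"
    false && echo "and: never printed"
    true  || echo "or: never printed"
    false || echo "or: ran as fallback"
    false ;  echo "then: ran regardless"
)
echo "$log"
```

Only three of the five echo commands run: the one after a successful &&, the one after a failed ||, and the one after the semicolon.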

echo: print arguments

echo is a command that prints out its argument(s) to standard output, followed by a newline. It’s the shell equivalent of System.out.println() in Java or print() in Python.

echo "Hello, world"

xargs: convert stdin to arguments

In Week 2, we talked about standard input (stdin) and command-line arguments being different concepts. Frequently, you want to convert standard input to an argument (often as part of chaining many | commands). xargs is a command that lets you do just that!

For example, consider the following command:

ls *.java | xargs javac
  • ls *.java outputs lines to standard output
  • | takes the standard output of ls *.java and sends it to the command to its right as standard input
  • but, javac only takes in files to compile as arguments - not standard input
  • so, we use xargs to convert the output of ls *.java to arguments for javac
  • you can think of this as a short form for these three commands:
    1. ls *.java > toCompile.txt
    2. xargs javac < toCompile.txt
    3. rm toCompile.txt
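The effect of xargs is easiest to see with a command that, like javac, ignores standard input. The sketch below uses echo for that role, on made-up file names in a scratch directory, so it runs even where Java is not installed.

```shell
orig=$(pwd)
demo=$(mktemp -d)   # scratch directory; file names are made up
cd "$demo"
touch A.java B.java notes.txt

# echo ignores standard input, so the file names are lost here:
without_xargs=$(ls *.java | echo "files:")

# xargs reads the names from standard input and appends them as arguments:
with_xargs=$(ls *.java | xargs echo "files:")

echo "$without_xargs"
echo "$with_xargs"
cd "$orig"
rm -r "$demo"
```

The first line printed is just files:, while the second is files: A.java B.java.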

find: recursively search directories

Using ls *.java only gives you the files in the current directory that have the .java extension. The find command lets you search within subdirectories as well.

For example, the following command finds all Java source files in the current directory and its subdirectories.

find -name "*.java"

You will often pair find with xargs (to apply an operation to all files that match a pattern). For example, the following compiles all Java files in the current directory and its subdirectories.

find -name "*.java" | xargs javac
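The sketch below contrasts ls *.java with find on a small made-up directory tree, then uses xargs to apply cat to every match. It writes find . -name with an explicit starting directory, which also works on systems whose find requires one; the file names and contents are assumptions for illustration.

```shell
orig=$(pwd)
demo=$(mktemp -d)   # scratch directory; file names and contents are made up
cd "$demo"
mkdir sub
printf 'class Top {}\n'    > Top.java
printf 'class Nested {}\n' > sub/Nested.java
printf 'class notes\n'     > readme.txt

shallow=$(ls *.java)                                            # only Top.java
found=$(find . -name "*.java" | wc -l)                          # 2: find also looks inside sub/
classes=$(find . -name "*.java" | xargs cat | grep -c "class")  # 2: readme.txt is not matched

echo "ls saw: $shallow"
cd "$orig"
rm -r "$demo"
```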

cut: simplify complex strings

The cut command lets you manipulate strings from standard input.

The -c (character) flag lets you get characters at certain (1-indexed) indices or ranges. Here are a few examples:

  1. echo "abcdef" | cut -c2 will produce b
  2. echo "abcdef" | cut -c2-5 will produce bcde
  3. echo "abcdef" | cut -c2,1,4 will produce abd (cut keeps characters in their input order, no matter how the list is ordered)

The -d (delimiter) flag lets you split up input into (1-indexed) fields that are separated by the delimiter, which you can then access with the -f (field) flag. A common example is to parse CSVs (comma-separated values): echo "a,b,c,d,e,f" | cut -d, -f1 will produce a
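A few more field selections on a single CSV record, as a sketch. The record itself is made up; the -f list accepts the same single values, comma lists, and ranges as -c.

```shell
line="alice,cse391,2024,pass"    # hypothetical CSV record

first=$(echo "$line" | cut -d, -f1)        # alice
pair=$(echo "$line" | cut -d, -f2,4)       # cse391,pass
range=$(echo "$line" | cut -d, -f2-3)      # cse391,2024
swapped=$(echo "$line" | cut -d, -f4,1)    # alice,pass (cut keeps input order)

echo "$first"
echo "$pair"
echo "$range"
echo "$swapped"
```

Note that selected fields are printed with the delimiter between them, and always in the order they appear in the input.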

Application: parsing logs

In the video, we showed one application of cut: parsing complicated log files. On attu, /cse/web/courses/logs/common_log contains a log of all requests the CSE courses websites get. However, just using tail on this gives us too much information - it’s hard to focus!

(try running tail -f /cse/web/courses/logs/common_log yourself first)

Instead, we can use the power of pipes and cut to pull out a specific field of the log, like the requested page.

tail -f /cse/web/courses/logs/common_log | cut -d\" -f4

Here, we’re escaping the " as it has a special meaning in the shell, and we’re grabbing the 4th field.

We can also use the stdbuf command to make cut flush its output after every line (rather than buffering it), and look for requests just to 391:

tail -f /cse/web/courses/logs/common_log | stdbuf -oL cut -d\" -f4 | grep "391"

tee: log output to console and a file

The tee command lets you send stdout to a file and to the console at the same time. This is helpful when you want to see the output in the shell in real time, but also log what happens for future reference.

Type a command like java Mystery.java | tee tee_out.txt, which runs Mystery.java and sends the output to the shell and to the file “tee_out.txt”.

Note that this sends only stdout to the file; stderr still goes straight to the console, so it may appear out of order relative to stdout. If you want to send both to the file, use “2>&1”. The command should end up looking like this example: java Mystery.java 2>&1 | tee tee_out.txt.
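The effect of 2>&1 can be checked without Java by using a small stand-in program. In the sketch below, noisy is a hypothetical shell function (not something from this course) that writes one line to each stream; the file names are made up too.

```shell
demo=$(mktemp -d)   # scratch directory; file names are made up

# Hypothetical stand-in for a program that writes one line to each stream.
noisy() {
    echo "normal output"
    echo "error output" >&2
}

noisy | tee "$demo/stdout_only.txt"      # stderr skips the pipe and goes to the console
noisy 2>&1 | tee "$demo/both.txt"        # 2>&1 merges stderr into stdout before the pipe

stdout_only=$(cat "$demo/stdout_only.txt")
both_count=$(grep -c "output" "$demo/both.txt")
rm -r "$demo"
```

Both lines appear on the console each time, but stdout_only.txt ends up with one line while both.txt ends up with two.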