```
wget https://courses.cs.washington.edu/courses/cse391/23au/lectures/6/questions6.zip
unzip questions6.zip
```

1. Suppose that we have a file named `words.txt` with the following contents:

```
These are some words
a11 0f the5e w0rds c0n7ain number5
S0me 0f these w0rds do
```

• Write a command that identifies all words which are exactly four characters long.

• Write a command that identifies all words which are exactly four characters long and contain only letters (both upper and lowercase).

• Write a command that identifies all words which are at least four characters long and contain only letters (both upper and lowercase).

Solutions

```
# \w matches only word characters (letters, digits, underscore), so a match cannot span a space
grep -E "\<\w{4}\>" words.txt
```
```
grep -E "\<[a-zA-Z]{4}\>" words.txt
```
```
grep -E "\<[a-zA-Z]{4,}\>" words.txt
```
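Without further options, `grep` prints every line that contains a match. As a minimal sketch, the `-o` option (available in GNU and BSD `grep`) prints only the matched words, which makes the letter-only patterns easier to check. The scratch directory and the variable names `exactly_four` and `at_least_four` are placeholders chosen for this illustration.

```shell
# Recreate words.txt in a scratch directory (the location is an assumption for this sketch).
tmpdir=$(mktemp -d)
cat > "$tmpdir/words.txt" <<'EOF'
These are some words
a11 0f the5e w0rds c0n7ain number5
S0me 0f these w0rds do
EOF

# -o prints each match on its own line instead of the whole matching line.
exactly_four=$(grep -oE "\<[a-zA-Z]{4}\>" "$tmpdir/words.txt")
at_least_four=$(grep -oE "\<[a-zA-Z]{4,}\>" "$tmpdir/words.txt")

echo "exactly four letters:"
echo "$exactly_four"
echo "at least four letters:"
echo "$at_least_four"
```

With the sample file, the first pattern reports only `some`, while the second reports `These`, `some`, `words`, and `these`.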

2. Suppose that we have the following file named `vegetables.txt` 🥦:

```
broccoli
asparagus
potato
lettuce
zucchini
brocccccccoli
```

• Come up with a `grep` command that correctly identifies all vegetables that have two or more consecutive `c`s in their name.

• Come up with a `grep` command that correctly identifies all vegetables that have at least two `c`s anywhere in their name (not necessarily next to each other).

Solutions

```
grep -E "cc+" vegetables.txt
# or
grep -E "c{2,}" vegetables.txt
```
```
grep -E ".*c.*c.*" vegetables.txt
```
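On the sample file both patterns happen to select the same three lines, so the difference between them is easy to miss. The sketch below appends one extra entry, `cucumber`, which is not part of the original file and is added here only as an assumption to separate the two cases. The variable names are likewise illustrative.

```shell
# Recreate vegetables.txt in a scratch directory, plus one hypothetical entry (cucumber).
tmpdir=$(mktemp -d)
cat > "$tmpdir/vegetables.txt" <<'EOF'
broccoli
asparagus
potato
lettuce
zucchini
brocccccccoli
cucumber
EOF

# Two or more c's in a row.
consecutive=$(grep -E "cc+" "$tmpdir/vegetables.txt")
# Two c's anywhere on the line, adjacent or not.
anywhere=$(grep -E ".*c.*c.*" "$tmpdir/vegetables.txt")

echo "consecutive:"
echo "$consecutive"
echo "anywhere:"
echo "$anywhere"
```

Only the second pattern selects `cucumber`, because its two `c`s are separated by a `u`.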

3. Using the file from Q2: Come up with a `grep` command that correctly identifies all vegetables that have two or more consecutive repeated letters in their name.

Solutions
```
grep -E "([a-z])\1+" vegetables.txt
```
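To see which letters the backreference actually caught, one option is to add `-o` so that only the repeated run is printed. This is a small sketch that assumes GNU `grep`, which supports backreferences with `-E`. The scratch directory and the variable `runs` are illustrative names.

```shell
# Recreate vegetables.txt in a scratch directory (the location is an assumption for this sketch).
tmpdir=$(mktemp -d)
cat > "$tmpdir/vegetables.txt" <<'EOF'
broccoli
asparagus
potato
lettuce
zucchini
brocccccccoli
EOF

# ([a-z]) captures one letter; \1+ requires that same letter to repeat at least once more.
# -o prints only the repeated run, one match per line.
runs=$(grep -oE "([a-z])\1+" "$tmpdir/vegetables.txt")
echo "$runs"
```

The output starts with `cc`, `tt`, and `cc` (from `broccoli`, `lettuce`, and `zucchini`), followed by the long run of `c`s from the last entry.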
4. Suppose we have a file `kitkats.txt` with the following contents:

```
kit kat
kat kit
my favorite part of the kit is the kat
cats do not like kit kats
this line only has kit
this line only has kat
```
Write a command that finds all lines which contain `kit` and `kat` in any order.

Solutions
```
grep -E "kit" kitkats.txt | grep -E "kat"
```
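The same result can also be reached with a single `grep` by spelling out both orders with alternation. The sketch below compares the two approaches on the sample file. The scratch directory and the variable names `piped` and `single` are assumptions made for this illustration.

```shell
# Recreate kitkats.txt in a scratch directory (the location is an assumption for this sketch).
tmpdir=$(mktemp -d)
cat > "$tmpdir/kitkats.txt" <<'EOF'
kit kat
kat kit
my favorite part of the kit is the kat
cats do not like kit kats
this line only has kit
this line only has kat
EOF

# Approach 1: filter for kit, then filter the survivors for kat.
piped=$(grep -E "kit" "$tmpdir/kitkats.txt" | grep -E "kat")
# Approach 2: one pattern that lists both possible orders.
single=$(grep -E "kit.*kat|kat.*kit" "$tmpdir/kitkats.txt")

echo "$single"
```

Both approaches select the first four lines. The pipeline tends to scale better: each additional required word is one more `grep`, whereas the alternation needs one branch per ordering.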
5. Suppose that we have the following file named `emails.txt`. Each line contains a user’s first and last name, followed by a comma, and then their email address. What is a `grep` command that determines which users have exactly their last name as the username (the part before the `@`) of their email address?

```
larry ruzzo, ruzzo@cs.washington.edu
zorah fung, zfung@yahoo.com
hunter schafer, hschafer@uw.edu
bennet goeckner, goeckner@math.uw.edu
ruth anderson, andersonr@gmail.com
```
In other words, our command should correctly identify that Larry Ruzzo and Bennet Goeckner use their last names as their email usernames.

Solutions
```
grep -E "[a-z]+ ([a-z]+), \1@[a-z]+\.[a-z]+" emails.txt
```
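As a variation, the pattern can be anchored at the start of the line and stopped at the `@`, on the assumption that only the username matters and the domain does not need to be checked. The sketch below uses that shorter pattern and then trims each matching line down to the name with `cut`. The scratch directory and the variables `matches` and `names` are illustrative.

```shell
# Recreate emails.txt in a scratch directory (the location is an assumption for this sketch).
tmpdir=$(mktemp -d)
cat > "$tmpdir/emails.txt" <<'EOF'
larry ruzzo, ruzzo@cs.washington.edu
zorah fung, zfung@yahoo.com
hunter schafer, hschafer@uw.edu
bennet goeckner, goeckner@math.uw.edu
ruth anderson, andersonr@gmail.com
EOF

# ([a-z]+) captures the last name; \1@ requires the same text right before the @.
matches=$(grep -E "^[a-z]+ ([a-z]+), \1@" "$tmpdir/emails.txt")
# Keep only the text before the comma, i.e. the user's name.
names=$(printf '%s\n' "$matches" | cut -d, -f1)

echo "$names"
```

This prints `larry ruzzo` and `bennet goeckner`. Entries such as `zfung` or `andersonr` are rejected because the captured last name must sit directly between the space after the comma and the `@`.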
6. The backend team at FAANG needs your help: we have lots of new products and they’re flying off the shelves like crazy (apparently you can sell happiness). To track all these transactions, each sale is assigned a unique ticket id. A ticket id is defined by the following properties:

• It must contain exactly 16 characters, each of which is a letter (upper or lowercase) or a digit. Dashes do not count toward the 16.
• To improve readability, the characters may optionally be grouped into segments whose lengths are multiples of four, delimited by dashes. However, the string may not end with a dash.

The following are valid ticket ids:

```
1234567891011112
1234-4567-8910-1112
aBcD-Ef79-8122-fd01
aBcDEf798122-fd01
```
The following are not valid ticket ids:
```
12345                            # too short
1233333333333333333333333333     # too long
1234-4567-8910-11?2              # illegal character
1234567891011112-                # ends with dash
```

• Come up with a `grep` command that identifies valid ticket ids in the file `tickets.txt`.

• Write a command that counts how many unique valid ticket ids are in the file `tickets.txt`.

• Challenge: Come up with a `grep` command that identifies valid ticket ids with the added constraint that if the id contains any dash, every group of four must be separated by a dash. (That is, `aBcDEf798122-fd01` is no longer a valid ticket id.)

Solutions

```
grep -E "^([a-zA-Z0-9]{4}-?){3}[a-zA-Z0-9]{4}\$" tickets.txt
```
```
grep -E "^([a-zA-Z0-9]{4}-?){3}[a-zA-Z0-9]{4}\$" tickets.txt | sort | uniq | wc -l
```
```
grep -E "^[a-zA-Z0-9]{4}(-?)[a-zA-Z0-9]{4}\1[a-zA-Z0-9]{4}\1[a-zA-Z0-9]{4}\$" tickets.txt
```
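One way to sanity-check these patterns is to run them against the valid and invalid examples listed in the question. The sketch below builds a hypothetical `tickets.txt` from those examples (the real file's contents are not shown here) and repeats the first valid id once so that the unique count differs from the line count. The patterns are stored in single-quoted variables, so `$` needs no backslash. `sort -u` is used as a shorthand for `sort | uniq`. All variable names are assumptions for this illustration.

```shell
# Build a hypothetical tickets.txt from the examples in the question.
tmpdir=$(mktemp -d)
cat > "$tmpdir/tickets.txt" <<'EOF'
1234567891011112
1234-4567-8910-1112
aBcD-Ef79-8122-fd01
aBcDEf798122-fd01
12345
1233333333333333333333333333
1234-4567-8910-11?2
1234567891011112-
1234567891011112
EOF

# Single quotes keep $ literal, so it reaches grep as the end-of-line anchor.
basic='^([a-zA-Z0-9]{4}-?){3}[a-zA-Z0-9]{4}$'
# (-?) captures either a dash or nothing; \1 forces every later separator to match it.
strict='^[a-zA-Z0-9]{4}(-?)[a-zA-Z0-9]{4}\1[a-zA-Z0-9]{4}\1[a-zA-Z0-9]{4}$'

valid_lines=$(grep -cE "$basic" "$tmpdir/tickets.txt")
unique_valid=$(grep -E "$basic" "$tmpdir/tickets.txt" | sort -u | wc -l)
unique_strict=$(grep -E "$strict" "$tmpdir/tickets.txt" | sort -u | wc -l)

echo "valid lines: $valid_lines, unique valid: $unique_valid, unique strict: $unique_strict"
```

On this file the basic pattern matches five lines, four of them unique, and the stricter pattern leaves three unique ids because `aBcDEf798122-fd01` mixes dashed and undashed groups.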