Lab 8 - Creating formulas in Excel
Introduction
This lab is an introduction to writing formulas and using functions
in Microsoft Excel. In
this lab you will be using Excel to manipulate ana analyze data from
the 2000 Census. All of the data used in this lab is from the
US Census Bureau's web site at www.census.gov.
These numbers are the populations counts for the state of
Washington; they are grouped by age.
For an introduction to Excel and to spreadsheets in general, see
Chapter 13.
1. Open the Lab 8 worksheet
Click on the lab08 link below to download the Excel worksheet.
Save the file to your Desktop, or somewhere
else that will be easy to find later. Make sure that you've opened the worksheet inside
Excel and not inside the web browser. If you have trouble opening the
worksheet file, ask your TA for help.
lab08.xls
A spreadsheet is a table that stores data and
calculations. Each "box" in the spreadsheet is a cell.
The cells are each in a row and a column. The rows are numbered (1,2,3,...).
The columns are lettered, (A,B,C,...). We can refer to a cell
by its column letter and row number. For example, the cell with the words "Age" is cell A1.
2. Formatting a spreadsheet
Right now the spreadsheet just looks like a jumble of numbers. We can make the
spreadsheet easier to read by changing the appearance of the cells.
The spreadsheet contains US Census data for the state of
Washington. The data is broken down by age group and gender. Currently
it is difficult to tell what the categories are and where the data
starts.
Let's make the column titles bold so they will be easier to
spot.
Click in cell A1 and drag the mouse to the right so that the top
three boxes (A1, B1, and C1) are selected. Press CTRL+B to make
the text in these cells bold.
Extra Notes:
- You can start dragging
from any corner of your cells to make a selection.
- Once these cells are highlighted, it
doesn't matter where your mouse pointer is when you push Ctrl-B.
- To make something bold, you could have selected the
B button from the toolbar or opened a menu at Format > Cells... and selecting the Font
tab.
The numbers in this sheet are still somewhat difficult to
read. We can fix this by telling Excel
to display them with commas (e.g. 1,234 instead of 1234).
By default, Excel just prints the numbers without commas or dollar signs. If your numbers were in units of time or money,
you can tell Excel to format them appropriately (e.g. 14:00, 2pm, or $14).
Important: Each cell has at least two parts: (1) the cell's content (e.g. 1234) and (2) the cell's presentation (e.g. US currency, time, real numbers, italicized, bolded, Times New Roman, etc...). It is important for you to remember that these two both affect how a cell is displayed.
Now we'll add the commas to change the presentation. Click and drag so that all of the values
in columns B and C are highlighted.
Then go to Format > Cells...
When the formatting window appears, select the Number
tab to change how numbers are displayed. Select Custom from the
menu and then the #,##0
option as shown below. Now, commas will be inserted
between the thousands and hundreds place. This does not change the value of the cell, just how it is displayed.
3. Using functions
Now we will use a function to calculate the total number of males and
females recorded by the census. We'll use one of Excel's built-in
functions to calculate the total of each column. To do this we will write a formula to put inside one of the cells.
Important: In addition to a cell's content (e.g. 1234) and presentation (e.g. US currency, time, real numbers, etc...). A cell can have a formula instead of a number. So a cell can have three parts: (1) A formula (e.g. =1+2+3), (2) a value (e.g. 6), and (3) a number format (e.g. $6).
We're going to put the total sum of the values in column B into cell B21. The SUM function is used to compute the sum of a range of
cells. Once you've entered the formula correctly, Excel will display
the result of the SUM function. These next instructions will show you how to use the SUM function.
You must start by typing
=
to indicate that you are typing in a function. Otherwise, Excel will not evaluate your input as a formula.
The sum function takes the form =SUM(cell1:cell2)
,
where cell1 is the first cell in the range and cell2 is the
last cell in the range. All of the
cells between cell1 and cell2 are calculated in the function. You can type in
the names of cell1 and cell2 to indicate the range. Or, you can select
the range with the mouse while you type in the formula. We will try it with the mouse.
Select cell B21.
In cell B21, type in the following text: =SUM(
. Next,
with your mouse, highlight all of the numbered cells in column B. Notice that Excel
automatically calculates the range based on your highlight.
Now, type a
closing parenthesis, )
. B21 should
now contain the text =SUM(B2:B20)
Press Enter to insert the formula into the cell. The formula will
be replaced by the calculated sum of column B. However, if you click on
the cell, the function will be shown in the function box above the
spreadsheet even though the data is shown in the cell. You can also
edit the formula by double-clicking the cell.
Repeat the steps above to find the total size of the female
population and place it in cell C21.
Extra Note: You can access the functions another way by clicking on the fx button or by selecting Insert
> Function from the menu.
Excel has a large number of built-in functions that perform many
different operations. Feel free to go in and explore them.
You can find a list of all the available functions here. In this pop-up window, the drop down menu lists the categories of functions. The functions
themselves are displayed
in a list. You can click on a function to see a description of what it does and
how to use it.
Let's label this new data: Put the words TOTAL in cell A21. Make this cell bold also (Ctrl-B).
4. Mathematical functions and Filling
Excel can also handle regular ("infix") arithmetic statements. For example, instead of SUM(1,2)
you could write 1+2
. You can
use +
to add numbers, -
to subtract, *
to multiply, and /
to divide just as you did in JavaScript. Excel uses the standard order-of-operations for math, so make sure you indicate mathematical precedence by using parentheses. For example, 1+2*3
is different than (1+2)*3
.
Now let's add up the female and male populations to get the total for
each age group. We will place this total in column D.
Click on cell D2. In it, we will put the total number of
people under age 5.
Use the +
operator in
our function instead of the SUM
function (although SUM would work, too).
Once again, enter =
to indicate we are using a function.
Then click cell B2 to add this cell to our formula. Type a +
sign and then click on cell C2. The formula should appear as =B2+C2
(you can also enter this purely with the keyboard)
Press Enter to make the total population under age 5 appear in this cell.
We've now created a formula to calculate the number of people within a
single age group. It would be nice to have totals for each age
group, but we rather not type the formula each time.
Fortunately, Excel provides a shortcut for creating sets of similar
formulas based on another formula. We will use this feature to fill in
the rest of column D with group totals.
Click on cell D2 and drag down to D20 to highlight the cells you
wish to copy your formula into.
Go to Edit>Fill>Down to fill the selected cells with
their correct formula.
Click on each cell to see how excel filled each one. Note that this feature is
very handy. This feature looks at your formula B2+C2
and changes the
number 2
for each new line that it fills, changing it to match the newly filled box. So, B2+C2
becomes B3+C3
, B4+C4
, B5+C5
, and so on... This might allow mistakes (maybe you really wanted it to change like this: B1+C1, B1+C2, B1+C3, B1+C4
.. where B1 stayed absolute or fixed). Remember to check that the cells were filled out with the right formula.
Extra, advanced stuff: If you want more control over references in formulas, click on Help->Microsoft Excel Help, Go to the Table of Contents>Working with Data>Formulas> Creating Formulas> About formulas. Visit the section on relative and absolute references to figure out how to control exactly how Excel automatically fills new cells.
Finally, put the text Totals (in bold) for cell D1.
5. Copying and pasting formulas
You can also copy and paste formulas (e.g. =SUM(B2,C2) or B2+C2, instead of the value: 192,241) in Excel. Copying and pasting acts
like the fill operation we performed above: The formulas in the original cells
make references. In the new cell, these references are rewritten relative to the
original cell. (e.g. If the original cell was adding the two cells to the left of it, the new cell will add the two cells to the left of the new cell.)
Notice that we have not yet calculated the GRAND total population of
Washington State in 2000.
Let's do so now and put that into D21 using copy and paste.
Click on cell D2 again. Copy the formula with Edit > Copy
or CTRL-C.
Now click on cell D21 and paste the formula with Edit > Paste
or CTRL-V The total population should now be in D21. Also, notice that the formula for D21 is different than the formula for C21 and B21
The rules that Excel uses to copy and paste formulas can initially seem
confusing. The best way to get a sense for these rules is to experiment
with copying and pasting formulas. Go ahead and try it out until you
have a good sense of how it works.
Sometimes you just want to copy the value of the cell and not the formula. One way to do this is to go to Edit>Paste Special and select the bullet for Values to just paste the calculated value. Paste Special is a powerful way to control what gets pasted, you conduct complicated types of pastes using this function.
6. Automatic updating of formula results
When you change data in cells that are used in the formula, the formula
is automatically recomputed and the displayed result is adjusted
accordingly.
Let's say we found out the census data for the 20 to 24 age group was
incorrect and that there were actually only 189,370 females in that age
group. We would want to change the spreadsheet and have all of the
totals reflect those changes.
Click on cell C6, the count for females aged 20 to 24, and type in
the number 189370
.
Watch cell D6 as you change C6. The value in D6 will change as you
update C6, and so will the grand total at the bottom of the sheet. This
is a huge advantage to using formulas in Excel. You don't have to go
back through your data and update everything that depends on the value
of C6. Excel does it for you!
7. Boolean operators and the IF function
Excel doesn't just operate on numbers. It can also perform logical
operations using Boolean values (true
and false
).
Suppose you are asking: Does a particular age group have more males or females?
and Which gender has the greater population? You could
do this arithmetically: Subtract the number of females from
the number of males, and use numbers
to represent the difference. But this could be unclear.
It would be much simpler if you had a column for the prominent gender: its cells displaying
"males" if there were more males or "females" if there were more females.
The IF function allows you perform a logical
evaluation of some expression and to perform an action based on the
result of the expression. You can tell Excel what to display if the
expression is true and what to display
if the answer is false. This is similar to the if statements we
encountered in Javascript.
The if function uses the following syntax:
=IF(
logical
expression,
result if the expression is true,
result if the expression is false)
- Look at the under 5 age group again. Let's ask Excel to
determine the gender that has a greater population, and to print the
result in the next available cell, E2.
- Start by typing in the first part of the function =IF(
- Now we need to determine our logical expression. Since we want
to know which gender has a greater population, let's test to see if the
value in the males column is greater than the value in the females
column. We can represent this test in Excel as
B2>C2
.
Enter this expression and add a comma. The function should now read
=IF(B2>C2,
- We need to specify what should be displayed if the test
condition turns out to be true. If our test is true it means that there
are more males than females, so we would want it to display the string "males"
in the cell. If you want the
IF
statement to print exactly
what you have entered, you need to put it in quotes. Add another comma.
Your function should now read =IF(B2>C2,"males",
- The last thing we need to do is specify what to display if the
test condition is false. If it is false there are more females than
males so it should be the string "females". For now,
let's ignore the possibility that there are an equal number of males
and females. Your function should now read
=IF(B2>C2,"males","females"
- Complete the function with a parenthesis
=IF(B2>C2,"males","females")
, and press the Enter key to place it in
the cell. Once you have pressed Enter, Excel will calculate the value of the formula and put the display the answer. The cell should now display: males
- Fill down the rest of the columns so that all age groups have
the same function.
- Give the column the label Prominent Gender. (The label is bigger than the column, we'll format it nicely in step 8)
8. Formatting the whole page
Let's clean up the spreadsheet and make it a little more presentable.
- Let's make the font a little bigger on the headlines. Select all of row 1 by clicking on the 1 in the
far left margin. Select the font size drop down menu from the
toolbar and change
the font size to 12. Notice that the cells automatically get taller to
accommodate the larger text.
- Notice how the text Prominent Gender in cell E1
overlaps the end of the cell. We can change the width of the cell by
clicking and dragging the line between the E and F
columns in the header. You can also double click on the dividing line and it
will automatically resize the cells in the column so that the longest
cell entry fits entirely
in the cell. Sometimes finding the exact spot to click and drag can be
difficult. You can tell that the cursor is in the proper position to
resize the cell when the cursor changes (see the picture).
- It's hard to differentiate between the original data and the
computed totals, so let's add visual lines to our sheet to make
that distinction clearer. Click and drag to select all cells from B2 to
E20.
Then go to Format > Cells and click on the Border
tab. We're going to specify a thick border along the bottom of the
cells. Click on the
options shown in red below:
(In Excel, you must choose the style of the line before you choose the type of border.)
- Now let's put a thick border around the part of the worksheet
containing data. Select all of the cells with data in them. Choose
Format > Cells and return to the Border tab.
This time click on the "Outline"
box at the top of the window. Also, make sure you specify what
thickness of border you want
on the right hand side of the window. Remember, you must choose the
line width before you choose the border type.
- Finally, let's add a margin to the top and left of our
spreadsheet by inserting blank rows into these areas. To insert a row
above the top of the data, click on the label 1 in
the far left margin. Then choose Insert > Row
from the menu. This will insert a new row above the current data, and
shift the other rows down. Do the same to add a column by clicking on
the label A on the header above and then selecting
Insert > Column from the menu.
Your final spreadsheet should look like this: (Don't worry if you made your borders look
different. Just change the borders to look the same by using Format>Cells, Borders)