Lab 13: Relational Database Managment Systems (RDMS)

Objectives


For this lab, you will create a small database to store information about music CDs. You will be using your work from this lab in the next two labs. So do it nice or do it twice!

Database Terminology

Refer to Chapter 13 in the book and Wikipedia for more detailed explanations of the database terminology.

A database management system, or DBMS, stores and helps users manipulate and find/view structured data called entities. A relational database management system, or RDBMS, stores different entities separately and creates relations between the related entities. Entities are items with certain attributes. For example an entity would be a person and the attributes would be the person's name, birth year, and shoe size. If a person has an attribute which is more complex, such as an address, then that attribute should be considered its own entity, as it also has its own attributes (street, city, state, etc.), different entities can share it (different people can live in the same address, etc.)

In a RDBMS, a person and the person's address would be a relationship between a person entity and an address entity. Once a database is created, you can enter data, modify the data, and view the data by combining different entities. This is done by sending queries to the RDBMS. Many databases are managed by using a standardized query language, SQL, but it is also common to have a graphical program that hides the SQL.

Microsoft Access is a RDBMS that lets you manage your database using a graphical user interface (GUI). You can also create graphical forms to manipulate your data and create reports to show results from a search/query.

In a RDBMS, entities of the same type are stored as rows (aka tuples) in a table. The columns (aka fields) of the table correspond to the entity's attributes, and each field is designed to hold a certain type of data, or data type, such as text or a number.

Example: Persons can be stored in a table called tPersons (where the "t" helps you remember that it is a table). The table may have the fields Name (data type: text), BirthYear (data type: number), and ShoeSize (data type: number) that will store a person's name, birth year, and shoe size, respectively. To uniquely identify a person it is good practice to add another field: PersonId (normally data type: number), because a person can have the same name, birth year, and shoe size, but still be different people! So, the PersonId will help us identify them. The table tAddresses is set up in a similiar manner, but with its own fields. To add the relationship that a person has an address, tPersons is extended with the field: AddressId, which is the unique identifier for an address in the table tAddresses.

There are several good reasons to split up the persons and the addresses into two tables. One is that the number of columns in tPersons is greatly reduced and easier to manage. Another is that we might have several persons living at the same address. By putting the persons and the addresses in separate tables, the address only need to be stored once in the table tAddresses instead of in several places in the table tPersons. If the same address was to be stored in several places, one could easily enter a slightly different address by mistake, which would cause the database to become inconsistent.

Create a Small Database for Music CDs using MS Access

  1. Start MS Access (usually under All Programs>Microsoft Office>...).
  2. Create a new database:
    1. Select File>New... or the new document icon.
    2. Select Blank database....
    3. Locate the desktop and give your database file the name lab13.mdb.
    4. Click on Create to create the database.
    You should now see the newly opened database. On the left hand side of the window, you should see Tables, Queries, Forms etc. If you click on them, you can view the tables, queries, forms, etc in your database.

    Newly opened database in MS Access

    Become familiar with this window! You will use it often. If you lose it, you can get it back by selecting View>Database Objects>Tables.

    You can right click almost anywhere in MS Access to get a popup menu. Sometimes you need to select an item before right clicking.

  3. You should now create a table that will contain a list of CDs that you own. It will be a first version/draft to get you started. Select Tables on the left side of the dialog window. Then double click on Create table in Design view. A big window should appear in which you can describe the attributes/fields of your CD table. Add the following fields: CDId with the data type AutoNumber, CDName, ArtistName and StyleName all with the data type Text. Choose the type form the drop down menue when you click on a white box under DataType. StyleName is for the style of music, e.g., Jazz, Rock, Classical, etc. If you like, you may put a description for each one, but since this table is (currently) pretty simple, you don't have to.

    Draft CD Table

  4. Before you are done with describing the CD table, you should make a field the primary key. The primary key is a field (or combinations of fields) that will have a unique value for each CD in the CD table. Place the cursor in the row for the CDId field. Right click and select Primary Key (or click directly on the primary key icon Primary Key

Icon in the main MS Access window). A key symbol should appear immediately to the left of the field name CDId.
  5. Save your table by using File>Save or Save Icon. Name your table tDraftCDs. After the table is saved you can close the table design window. Under Tables in your database dialog window you should now see your tDraftCDs table.
  6. Once a table is created, you can view it, add data to it, and modify it. Check the table design of your tDraftCDs by right clicking on the table and selecting Design View. Do not change anything. It should like before when you added attributes/fields to the table. Close the design view when you are done.
  7. Open your tDraftCDs to enter data by right clicking on the table and selecting Open. A window with your table data will appear. Each row corresponds to one CD.  Notice that you do not need to explicitly fill in the CDId field.  Since it is an AutoNumber field, it will be filled in automatically. The other columns represent the name, artist and musical style for the CD. The table should currently be empty. Add the following CDs:

    Draft CD Table Data

    Data that is entered into a database table is saved directly after you leave the field. So there is no need (or possibility) to save it manually, like you must do in the design view. However, if you modify the style of the table, such as the width of the columns, Access will ask if you want to save the layout (which you probably want to do).

  8. You have now created a database with information about music CDs! There are, however, a few undesirable things to fix before you will spend your nights entering information about all your CDs:
    • Each entry in tDraftCDs contains very little information about a CD. This problem will be ignored for the labs, since adding extra fields is simple thing to do.
    • There is no nice way to enter data or to display a list of the information in the database. This will be discussed and partly solved in upcoming labs.
    • The same information is written in several places in the current design. We have written the artist Abba twice and we have two CDs with the musical style Jazz. It would be much better to create separate tables for artists and styles and set up some relationships between the CDs, artists and styles. You will do this next!

Improve the Database Design: Normalize

Normalizing a database means to remove redundant data, that is, to remove multiple occurrencies of the same information. Instead of writing the same information several times in one table, the table is redesigned to have a relation to another table in which the information is stored only once.  The diagram below shows what we ultimately want. The boxes contain the different types of entities, the ovals contain the attributes of entities, and the diamond boxes describe the relationship between entities. The infinity signs (sideways 8) and the number one represent the type of relationship two entities have to each other--for instance, there is an infinity sign next to the CDs and a 1 next to the artist, along the PerformedBy relationship. This means that a CD can have only 1 artist, but an artist can have many (infinite) CDs.

Relationship Diagram

Above we did a draft for our CDs database. We're now going to create a better version (in the same file). We'll create tables for each of the types of entities that we have, artists, CDs, and styles, and then relate them to each other as shown in the diagram. To relate each CD in a CD table to an artist in an artist table, each CD must have a field with the identifier (aka foreign key) corresponding to a particular artist in the artist table. In the artist table, an artist is identified by its value in the primary key field.

In this section you will create a better designed CD table, move the artists and styles into their own tables, and setup relationships between the three tables. All of the data in the old CD table (tDraftCDs) will be used. By using some features of a RDBMS, you will not have to type in the data again!

  1. Create a new table in design view. It should look almost like the old tDraftCDs, but the artist and musical style should have slightly different names and use different data types. It is also nice to have a comment that explains what the IDs will be used for:

    Design of the CD Table

    Do not forget to specify the primary key and to save your new table! Save the new table as tCDs.

  2. Create a new table in design view that will contain a list of artists. Call it tArtists. It should have two fields: ArtistId (data type AutoNumber) and ArtistName (data type: Text). The primary key should be ArtistId. Do not forget to save the table!
  3. Create a new table in design view that will contain a list of musical styles. Call it tStyles. It should be very similiar to tArtists, but use slightly different names for the fields: StyleId (data type: AutoNumber) and StyleName (data type Text). The primary key should be StyleId. Do not forget to save the table!
  4. You now have three tables and it is time to "connect" them with relationships! Open the relationship window by either selecting Tools>Relationships... from the main Access menu, or by clicking on Relationships

Icon. You should first select what tables you want to create relationships between. Make sure to select tArtists, tCDs and tStyles. Do not include the old CD table. To select multiple items in a list, you can press and hold the Ctrl key while you click on the items you would like to select. End with clicking on Add. You should see the relationships window with three tables and their fields. Bold fields indicate primary keys. It is also possible to rearrange the tables if you do not like the automaticaly generated order. Example:

    Relationships, not connected

    If you forgot a table or picked the wrong table, you can add more tables by right clicking on background and selecting Show Table...

  5. The following relationships exist between the tables:
    • One artist can have many CDs.
    • One style of music includes many CDs.
    It is also nice if we can require the data in the database to obey these relationsships, that is, to never have a CD without a corresponding artist or style of music.

    Create one relationship at a time by dragging the primary key from one table and dropping it on a foreign key in another table, for example drag the ArtistId from the table tArtists to the ArtistId in the table tCDs. The following relationship window will appear:

    Artist-CD relationship

    Make sure "Enforce Referential Integrity" is checked. This is to enforce the relationship to always exist. Also note the Relationship Type at the bottom of the pop-up window: One-To-Many. Click on Create when you are ready to add the relationship.

  6. Add the second relationship for musical style on your own.
  7. The relationships between the tables should look like the following. The infinity symbol symbolizes "many".

    Relations, connected

    You have now successfully designed a relational database! The next step is to extract the data from the old CD table tDraftCDs and put it into the three new tables tArtists, tStyles and tCDs! Close the relationships window (answer yes on the "save layout" question).

Transfer the Old Data Into the New Improved Database

In your old table you had only four CDs. In reality, the number of rows would be many more. It would be a painstaking experience to retype the information. In this section you will extract and transfer the data without any unnecessary typing!

The key is to use a query to view (or extract) certain columns from a table, and to paste the results into the new tables.

  1. In your database window, select Queries (on the left). Then double click on Create query in Design view. When you get the option to add tables, only add the old table tDraftCDs, since that is where all data is located.
  2. The lower half of the query design window has a large number of columns. When you design a query, you should use one column for each field you want to do something with. In more complicated queries you can combine fields from different tables, restrict what rows to include, and much more. The query you will design here is quite simple. You only want to get the field ArtistName from the table tDraftCDs. In the first column, select the desired field.

    Query design for ArtistNames

    Save the query and close the query design window (or close the query design window and save the query upon request). Give this query the name qArtistNames. The "q" is a reminder that it is a query.

  3. Now open your newly created query! You do this in the very same way you would open a table to enter data. (Hint: right click...).

    Your query should give you the names of all of the artists in your database. When you open the query you should see the rows and columns that you requested in the query. A query is just another way of viewing your table data and is, therefore, sometimes simply called a view. If you change any information in the query result, the original table (tDraftCDs) will be updated.

    Query result

    These names will be used later in the table tArtists. Before you can use them, you need to get rid of unwanted duplicates. This means you must modify the query.

  4. Close the query result window and open the query in design view. Right click on the background in the top half of the window to bring up a menu. Select Properties... at the very bottom of the menu. This will show the properties of your query:

    Query properties window

    Change the settings for Unique Values (a few lines down from the top) from No to Yes. This will give you only the unique values in the table. When you are done, close the property window, close the query design window (save changes to the query) and open your query to see the results. It should now have no duplicate names.

    Query result

  5. Select all of the rows from the first/only column of your query results by clicking on the column title (ArtistName). All rows should be selected (dark or black colored). Copy by pressing Ctrl-c or by using Edit>Copy.
  6. The results from the query should now be pasted into the tArtists table. Open tArtists to enter data. The table should be empty. The first column corresponds to the ArtistId and will get values automatically, so there's no need to touch it. In the second column you should add all of the rows from the query!

    Select the second column by clicking on the column title ArtistName. Paste the previously copied rows by pressing Ctrl-v or use Edit>Paste. Answer Yes to the question if you want to paste the rows. After a successful paste, you can close both the table tArtists and the query result window.

  7. You have now copied the unique information for the artists from your old CD table. Do the same thing for the style of music. That is, repeat steps 2 to 6 for the StyleName field and the tStyles table! Save the query as qStyleNames. The result should only be two rows, since there are only two styles of music in the old table (i.e., Jazz and Pop/Rock).
  8. You should now add information about all the CDs to the new CD table. You would like to have all of the rows from the old CD table, but the artist and style names should be replaced by their respective identifiers in tArtists and tStyles. You need a query that takes all of the rows in the tDraftCDs, finds the row in tArtists that has a matching ArtistName, finds the row in tStyles that has a matching StyleName, combines these rows, and gives the fields CDName, ArtistId and StyleId as a result.

    Open a new query in design view. Add the tables tDraftCDs, tArtists and tStyles. Connect the fields that should match. That is, drag ArtistName in the tArtists to ArtistName in tDraftCDs. Do the equivalent for StyleName. Then make sure the query shows the three fields CDName, ArtistId and StyleId in the bottom half of the query window. Make sure the order is correct. Save the query design as qDraftCDs.

    Query design

  9. Close the query design window and run the query. You should see the four CDs and the IDs for the artist and the style of music:

    Query design

    Select all rows and all columns. The easiest way is to click and drag over the column titles. Copy the data with Ctrl-c or Edit>Copy. Open tCDs to enter data and select the three columns CDName, ArtistId and StyleId. Paste in the data with Ctrl-v or Edit>Paste. Accept to insert the four rows. Close tCDs.

  10. You are now done with your database! You might think it was a lot of work for four rows of data, but what you just did would work for thousands of rows without any extra typing!
  11. To verify that you no data has been damaged or lost, design a query based on the tCDs, tArtists and tStyles tables. Note that you do not need to specify any relationships or matching requirements between the tables. They are already set up in the design. Have the query show the fields CDName, ArtistName and StyleName. Save the query as qCDs and open it to see the query result. It should contain the following four rows:

    Query design

  12. Before you close your database or Access, select Tools>Database Utilities>Compact and Repair Database... and accept to open the lab13.mdb. This will remove unneeded space in your database file, which otherwise can become very large!
  13. Close all your database windows, and close Access.
  14. Create a lab13 directory in your fit100 directory on dante. Copy your lab13.mdb into it. 

    Select Tools>Database Utilities>Compact and Repair Database... before you close your database OR after you have worked with the database for a while. This will decrease the size of your database file. Also close your database file and Access before copying your database file to your dante account, otherwise you may end up with an empty file!

  15. Turn in your Spreadsheet from Lab 11

    Due Date: Turn in at the beginning of your lab session on Monday or Tuesday, November 26 or 27.

  16. Turn in your Extra Credit Lab 12

Due date: Due before 10pm on Tuesday, December 11, 2007.

  1. Turn in your Database from Lab 13

Due date: Turn in at the beginning of your lab session on Wednesday or Thursday, November 28 or 29.

  1. Instead of the usual Word Document, just turn in the modified Access database. Because this file isn't a Web page, there's no need to link it. Just upload it in Collect It.

Collect-It Turn-in Area

Click Here to Turn In files for Dave's lab sections AA & AB

Click Here to Turn In files for Keith's lab sections AC & AD

Click Here to Turn In files for Sam's lab sections AE & AF