In this week's section, we'll practice writing SQL queries for several example databases. As we saw in lecture yesterday, we can use phpMyAdmin (which comes with MAMP) to create a database storing different tables (e.g. games database with a single table called Games).
For general query practice with a few practice tables, remember that you can you can also use the CSE 154 Query Tester, which is a tool to practice more queries.
By the end of this section, you should be able to:
.sql
file.
In this section, we'll continue to use setup-games.sql
(from Friday) as well as a new
dataset setup-fooddata.sql
which includes nutritional data from the USDA food database.
If you set these up on your own computer, make sure to follow the phpMyAdmin tutorial (you can find alternative directions for using MySQL on a command line here).
Now we're ready to do the really cool stuff! First, we'll use the smaller FoodData table to fetch some data about food nutrition info found from the USDA nutrition database. Each food listed has data for a 100g serving, which makes it useful to compare nutrition information between different foods.
The following table (called a schema) shows the different types of data (as columns) in the FoodData table for each food item (row), where protein and carbs are given in grams:
id | description | protein | carb |
---|---|---|---|
9427 | Abiyuch, raw | 1.5 | 17.6 |
9002 | Acerola juice, raw | 0.40 | 4.80 |
... | ... | ... | ... |
For our first query, let's first see what foods we have available.
What query would lists all of the food descriptions in the FoodData table?
Solution:
SELECT description
FROM FoodData;
You should see a 2371 rows returned after executing this command in your MySQL console.
Write a query that returns all of the food descriptions, grams of protein, and grams of carb for FoodData items that have more grams protein than carbs per 100g (remember that all data is given for 100g in this dataset). Your result should be in whatever order as defined by the table (no explicit ordering).
+-----------------------+---------+------+
| description | protein | carb |
+-----------------------+---------+------+
| Butter, salted | 0.85 | 0.06 |
| Butter oil, anhydrous | 0.28 | 0.00 |
| Cheese, blue | 21.40 | 2.34 |
| Cheese, brick | 23.24 | 2.79 |
| Cheese, brie | 20.75 | 0.45 |
| ... | ... | ... |
+-----------------------+---------+------+
191 rows in set (0.01 sec)
One solution is provided below:
SELECT description, protein, carb
FROM FoodData
WHERE protein > carb;
Write a query to return the descriptions, grams of protein, and grams of carbohydrates for foods that have more grams of protein than carbohydrates and which have the word 'cooked' in their description. Order the results by alphabetical order.
+----------------------------------------------------------+---------+-------+
| description | protein | carb |
+----------------------------------------------------------+---------+-------+
| Asparagus, frozen, cooked, boiled, drained, with salt | 2.95 | 1.92 |
| Asparagus, frozen, cooked, boiled, drained, without salt | 2.95 | 1.92 |
| Bamboo shoots, cooked, boiled, drained, with salt | 1.53 | 1.52 |
| Broccoli raab, cooked | 3.83 | 3.12 |
| Cowpeas, leafy tips, cooked, boiled, drained, with salt | 4.67 | 2.80 |
| ... | ... | ... |
+----------------------------------------------------------+---------+-------+
18 rows in set (0.01 sec)
One solution is provided below:
SELECT description, protein, carb
FROM FoodData
WHERE protein > carb AND description LIKE '%cooked%'
ORDER BY description;
Next, modify your answer from Query 3 to also include a new column containing the grams of protein rounded to the nearest whole integer. Omit the protein and carb columns in this query. You can use an alias to add a new named column to the result table.
For example, to create an alias for a new column named my_alias for adding protein and carb grams per record, we could do:
SELECT (protein + carb) as my_alias FROM FoodData;
You can use the
ROUND(x, y)
function in MySQL which truncates x to have exacty y decimal places.
In this case, round to 0 decimal places. Refer to the slide below for the expected output results.
+----------------------------------------------------------+-----------------+
| description | percent_protein |
+----------------------------------------------------------+-----------------+
| Asparagus, frozen, cooked, boiled, drained, with salt | 3 |
| Asparagus, frozen, cooked, boiled, drained, without salt | 3 |
| Bamboo shoots, cooked, boiled, drained, with salt | 2 |
| Broccoli raab, cooked | 4 |
| Cowpeas, leafy tips, cooked, boiled, drained, with salt | 5 |
| ... | ... |
+----------------------------------------------------------+-----------------+
12 rows in set (0.01 sec)
One solution is provided below:
SELECT description, ROUND(protein) as percent_protein
FROM FoodData
WHERE protein > carb AND description LIKE '%cooked%'
ORDER BY description;
Write a query to return the descriptions of foods that have the word 'apple' in them. Order the results in descending order by description.
+-------------------------------------------------------------------------+
| description |
+-------------------------------------------------------------------------+
| Sugar-apples, (sweetsop), raw |
| Ruby Red grapefruit juice blend (grapefruit, grape, apple), OCEAN SPRAY |
| Pineapple, raw, traditional varieties |
| Pineapple, raw, extra sweet variety |
| Pineapple, raw, all varieties |
| ... |
+-------------------------------------------------------------------------+
75 rows in set (0.00 sec)
One solution is provided below:
SELECT description
FROM FoodData
WHERE description LIKE '%apple%'
ORDER BY description DESC;
In this part of the section, we will use the video game dataset provided (identical to the one introduced in Friday's lecture). This dataset has a table called Games which we will write queries with. There are approximately 3000 rows of video game data in this table.
id | name | platform | release_year | genre | publisher | developer | rating |
---|---|---|---|---|---|---|---|
1 | Pokemon Red/Blue | GB | 1996 | Role-Playing | Nintendo | Game Freak | E |
2 | Spyro Reignited Trilogy | PS4 | 2018 | Platform | Activision | Toys for Bob | E |
3 | Universal Paperclips | PC | 2017 | World Domination | Frank Lantz | Frank Lantz | E |
... | ... | ... | ... | ... | ... | ... | ... |
Write a SQL query that returns the names of all games that were developed by Nintendo (your result should not have duplicates).
+---------------------------+
| name |
+---------------------------+
| Mario Kart Wii |
| New Super Mario Bros. |
| New Super Mario Bros. Wii |
| Mario Kart DS |
| Wii Fit |
| ... |
+---------------------------+
63 rows in set (0.01 sec)
One solution is provided below:
SELECT DISTINCT name
FROM Games
WHERE developer = 'Nintendo';
Write a SQL query that returns a list of all distinct game platforms in the Games table (e.g., Wii).
+----------+
| platform |
+----------+
| GB |
| PS4 |
| PC |
| NES |
| PS |
| Wii |
| ... |
+----------+
26 rows in set (0.01 sec)
One solution is provided below:
SELECT DISTINCT platform
FROM Games;
Write a SQL query that returns the names and release year of 20 of the video games released earliest as determined by their release_year in the Games table.
+-----------------------------+--------------+
| name | release_year |
+-----------------------------+--------------+
| Combat | 1977 |
| Space Invaders | 1978 |
| Defender | 1980 |
| Asteroids | 1980 |
| Missile Command | 1980 |
| ... | ... |
+-----------------------------+--------------+
20 rows in set (0.01 sec)
One solution is provided below:
SELECT name, release_year
FROM Games
ORDER BY release_year
LIMIT 20;
Write a SQL query that returns the name, platform and release year of all games with the word 'Spyro' in their title and which don't include 'Skylanders' in their title. Hint: Use NOT to negate a boolean expression in MySQL.
+----------------------------+----------+--------------+
| name | platform | release_year |
+----------------------------+----------+--------------+
| Spyro Reignited Trilogy | PS4 | 2018 |
| Spyro the Dragon | PS | 1998 |
| Spyro: Year of the Dragon | PS | 2000 |
| Spyro 2: Ripto's Rage | PS | 1999 |
| Spyro: Season of Ice | GBA | 2001 |
| ... | ... | ... |
+----------------------------+----------+--------------+
8 rows in set (0.01 sec)
One solution is provided below:
SELECT name, platform, release_year
FROM Games
WHERE name LIKE '%Spyro%'
AND NOT (name LIKE '%Skylanders%');
Note: This is a challenge query! We'll review ways to approach the solution in upcoming lectures, but feel free to see how far you can get using what we've covered (and the cheatsheet)!
Write a SQL query that returns the name and release_year of the Puzzle games released in the earliest year for Puzzle games in the Games table. How does the result for Puzzle games compare to Sports games?
+-----------------------------+--------------+
| name | release_year |
+-----------------------------+--------------+
| Ms. Pac-Man | 1981 |
+-----------------------------+--------------+
1 rows in set (0.02 sec)
One solution is provided below (you can nest SELECT to return specific values like MIN):
SELECT g1.name, g1.release_year
FROM Games g1
WHERE g1.genre = 'puzzle'
AND g1.release_year = (SELECT MIN(g2.release_year)
FROM Games g2
WHERE g2.genre = 'puzzle');
Write a SQL query that returns the average release_year of games in the Games table. Use the ROUND function to round the result to the nearest integer and rename the column with an alias avg_release_year.
+------------------+
| avg_release_year |
+------------------+
| 2005 |
+------------------+
1 row in set (0.01 sec)
One solution is provided below:
SELECT ROUND(AVG(release_year)) as avg_release_year
FROM Games;
Next, we will use the CSE 154 Query Tester tool to practice writing queries for a few databases. The query tester is located here.
world
Database (by Sylvia Tashev, Stefanie Hatcher)The first database we will practice writing SQL queries with includes tables containing information about countries, cities, and languages (note that this database isn't up to date, but still is useful for practicing queries). The schema for the three tables are on the next three slides below.
code | name | continent | population | gnp | ... |
---|---|---|---|---|---|
AFG | Afghanistan | Asia | 22720000 | 5976.0 | ... |
NLD | Netherlands | Europe | 15864000 | 371362.0 | ... |
... | ... | ... | ... | ... | ... |
id | name | country_code | district | population |
---|---|---|---|---|
3793 | New York | USA | New York | 8008278 |
1 | Los Angeles | USA | California | 3694820 |
... | ... | ... | ... | ... |
country_code | language | official | percentage |
---|---|---|---|
AFG | Pashto | T | 52.4 |
NLD | Dutch | T | 95.6 |
... | ... | ... | ... |
List all of the official languages spoken around the world in reverse alphabetical order.
+------------+
| language |
+------------+
| Zulu |
| Xhosa |
| Wolof |
| Vietnamese |
| Uzbek |
| ... |
+------------+
102 rows in set (0.02 sec)
One solution is provided below:
SELECT DISTINCT language
FROM languages
WHERE official='T'
ORDER BY language DESC;
What countries have the word Republic
as part of their name?
+---------------------------------------+
| name |
+---------------------------------------+
| Dominican Republic |
| Central African Republic |
| Congo, The Democratic Republic of the |
| Czech Republic |
+---------------------------------------+
4 rows in set (0.00 sec)
One solution is provided below:
SELECT name
FROM countries
WHERE name LIKE '%Republic%';
List the names of all countries that are not in Africa.
+--------------------------+
| name |
+--------------------------+
| Afghanistan |
| Netherlands |
| Netherlands Antilles |
| Albania |
| American Samoa |
| ... |
+--------------------------+
181 rows in set (0.00 sec)
Two solutions are provided below. Note that the first uses !=
and the second
uses <>
for the inequality operators.
SELECT name
FROM countries
WHERE continent != 'Africa';
SELECT name
FROM countries
WHERE continent <> 'Africa';
Which countries have a GNP between $10,000 AND $50,000? Write a query that returns the name and gnp of these countries.
+------------------------+----------+
| name | gnp |
+------------------------+----------+
| Algeria | 49982.00 |
| United Arab Emirates | 37966.00 |
| Bangladesh | 32852.00 |
| Brunei | 11705.00 |
| Bulgaria | 12178.00 |
...
+------------------------+----------+
36 rows in set (0.00 sec)
SELECT name, gnp
FROM countries
WHERE gnp BETWEEN 10000 AND 50000;
You can also use:
SELECT name, gnp
FROM countries
WHERE gnp < 50000
AND gnp > 10000;
Which countries have the densest population (population divided by surface area)? Write a query that returns the name, population, surface_area, and population_density of all countries, ordered by population densities (descending).
+------------+------------+--------------+--------------------+
| name | population | surface_area | population_density |
+------------+------------+--------------+--------------------+
| Macao | 473000 | 18.00 | 26277.777778 |
| Monaco | 34000 | 1.50 | 22666.666667 |
| Hong Kong | 6782000 | 1075.00 | 6308.837209 |
| Singapore | 3567000 | 618.00 | 5771.844660 |
| Gibraltar | 25000 | 6.00 | 4166.666667 |
| ... | ... | ... | ... |
+------------+------------+--------------+--------------------+
239 rows in set (0.00 sec)
SELECT name, population, surface_area,
population / surface_area as population_density
FROM countries
ORDER BY population_density DESC;
List the country_codes for all countries where English is spoken as an official language.
+--------------+
| country_code |
+--------------+
| AIA |
| ASM |
| ATG |
| BLZ |
| BMU |
| ... |
+--------------+
44 rows in set (0.02 sec)
SELECT country_code
FROM languages
WHERE language = 'English' AND official = 'T';