HW1: SQL Basics

There are 4 problems, with some subproblems. The first two are about SQL and SQLite. Then there is one theory problem and one Java programming problem.

Problem 4 depends on Problem 1, so you should solve (most of) Problem 1 before attempting Problem 4. Other than that, you can work on the four problems in any order you like.

Turn the first 3 problems (i.e., all answers to SQL problems plus the theory problem) as a single PDF to the "HW1 Written" assignment on Gradescope. Turn in your hw1.java file from problem 4 to "HW1 Coding" on Gradescope.

  1. (SQL) This problem gives you practice with the basics of SQL.

    1. Write a SQL statement that creates a table edges with two columns source and destination that are both integers.

    2. Write a single SQL statement that adds these four tuples to edges: (11, 6), (2, 26), (2, 4), and (5, 5).

    3. Write a SQL query that returns all tuples in edges. What order do the output rows appear to be printed in?

    4. Write a SQL query that returns only the source column for each tuple in edges.

    5. Write a SQL query that returns only the source column for each tuple in edges. Do not include duplicate elements.

    6. Write a SQL query that returns, for each tuple in edges, the value of source plus the value of destination.

    7. Write a SQL query that returns all tuples of edges where source is less than destination. Sort the output by destination.

    8. The next few subproblems explore SQLite's dynamic typing, which is very atypical compared to other RDBMSes. You may find the documentation on datatypes in SQLite useful.

      Write a SQL statement to try to insert the tuple ('17', 'hello') into edges.

    9. According to the relational model, your statement should cause an error, because neither field of the to-be-inserted tuple matches the correct type. What happens in SQLite instead?

    10. Read the documentation for the built-in typeof function in SQLite. Then, run this query:

      SELECT source, destination, typeof(source), typeof(destination) 
      FROM edges
      

      Paste the output of that query into your answers. Then explain the output (especially the output that corresponds the tuple you inserted in the previous part of this problem).

  2. (SQL) This problem uses booleans and dates, which are not actual datatypes in SQLite (though they are in most other RDBMSes). Instead, booleans are represented in SQLite as integers, and dates as strings (i.e., with type text).

    Here is a quick tour of these features. See the SQLite documentation on date and time functions for more information on dates.

    • 0 (false) and 1 (true) are the values used to interpret booleans.
    • Date strings in SQLite are in the form: 'YYYY-MM-DD'.

      Examples of valid date strings include: '1991-02-01', '0000-12-31', and '2025-03-28'.

      Examples of invalid date strings include: '25-11-01', '1900-1-20', '2025-03-5', and '2025-03-50'.

      Examples of date operations on date strings (feel free to try them):

      • SELECT date('2025-03-28')
      • SELECT date('now')
      • SELECT date('now', '-5 year')
      • SELECT date('now', '-5 year', '+24 hour')

    What does the following query do? (no need to turn in your answer)

    SELECT 
      CASE 
        WHEN date('now') < date('2025-03-15') THEN 'Taking classes' 
        WHEN date('now') < date('2025-03-22') THEN 'Exams' 
        ELSE 'Spring Break'
      END
    

    Now, to the actual subproblems.

    1. Write a SQL statement to create a table called my_restaurants with the following attributes (you can pick your own names for the attributes, just make sure it is clear which one is for which):

      • Name of the restaurant: a string
      • Type of food they make: a string
      • Distance (in minutes) from your house: an int
      • Date of your last visit: a string, interpreted as a date
      • Whether you like it or not: an int, interpreted as a boolean
    2. Write some number of INSERT statements to add at least five restaurants to the table, subject to the following requirements:

      • you like at least one restaurant
      • you dislike at least one restaurant
      • at least one restaurant has NULL in the field indicating whether you like it or not
    3. Write a SQL query that returns all attributes of all restaurants that you like, but have not visited for more than 3 months (not including exactly 3 months). Your query must work correctly today and tomorrow. (In other words, do not hard-code the date of "3 months ago", which will be different today versus tomorrow. Instead, use the date() function.)

  3. (Theory / practice remembering CSE 311). We will write \([\,]\) for the empty list and \(z :: L\) (pronounced "\(z\) consed onto \(L\)") for the list consisting of the element \(z\) followed by all the elements of \(L\).

    For example, if \(x = 17\) and \(L = [1, 2, 3]\), then \(x :: L = [17, 1, 2, 3]\).

    Here is a recursive definition of \(\mathrm{concat}(L_1, L_2)\) that concatenates (i.e., appends) two lists.

    \[ \begin{align} \mathrm{concat}([\,],\, L_2) &= L_2\\ \mathrm{concat}(x :: L_1,\, L_2) &= x :: \mathrm{concat}(L_1, L_2)\\ \end{align} \]

    And here is a definition of a recursive function \(\mathrm{count}(x, L)\), where \(x\) is an integer and \(L\) is a list of integers, which counts the number of occurrences of \(x\) in \(L\).

    \[ \begin{align} \mathrm{count}(x,\, [\,]) &= 0\\ \mathrm{count}(x,\, y::L) &= 1 + \mathrm{count}(x, L) & \text{if } x = y \\ \mathrm{count}(x,\, y::L) &= \mathrm{count}(x, L) & \text{if } x \ne y \end{align} \]

    Now let \(x\) be any integer and let \(L_1\) and \(L_2\) be any lists of integers.

    Prove by structural induction on \(L_1\) that

    \[ \mathrm{count}(x,\, \mathrm{concat}(L_1, L_2)) = \mathrm{count}(x,\, L_1) + \mathrm{count}(x,\, L_2) \]

    We are not nearly as picky as CSE 311 about proofs, and especially not about proof formatting. You may use any proof format you like, as long as it is by induction on \(L_1\).

    We do care that your proof makes sense, though, and especially that it makes sense to you.

    Please ask questions if anything in this problem is unclear, as different people took CSE 311 from different instructors, and some people may have taken it a long time ago.

  4. (Java) In the provided starter code we have provided a class Edge to represent a single row in the edges table from Problem 1. For this problem, we will represent relations in Java as lists of rows. So the edges relation will be represented as a List<Edge>.

    Implement the stub method sourcesNoDups with the same specification as Problem 1.e.

    Implement the stub method edgesWithSmallerSource with the same specification as Problem 1.g.

    We have provided a main method to test your code with the sample data from Problem 1.b. See the bottom of the starter code for the expected output.

    The purpose of this problem is to get you thinking about the connection between SQL queries and their "for-each" semantics. You will not be graded harshly on programming style or runtime performance. Please write as simple code as possible. You may any Java features you like. Your code must work with the version of Java that runs on attu (Java 21).

    For reference, the staff solution adds 15 lines of code total to the starter code. If you find yourself writing significantly more than that (for example, more than 100 lines of new code), you may be on the wrong track (though going a bit over 15 is totally reasonable if it makes sense to you).