Notes
Slide Show
Outline
1
Lecture 4: Updates, Views, Constraints and Other Fun Features
  • Monday, April 19th, 2004
2
Agenda
  • Nulls and outerjoins
  • Creating and updating schemas
  • Views: updating and reusing them
  • Constraints
  • Programming with SQL
  • Relational algebra
3
Null Values and Outerjoins
  • Explicit joins in SQL:
  • Product(name, category)
  •       Purchase(prodName, store)






  • Same as:





  • But Products that never sold will be lost !


4
Null Values and Outerjoins
  • Left outer joins in SQL:
  • Product(name, category)
  •     Purchase(prodName, store)


5
 
6
Outer Joins

  • Left outer join:
    • Include the left tuple even if there’s no match
  • Right outer join:
    • Include the right tuple even if there’s no match
  • Full outer join:
    • Include the both left and right tuples even if there’s no match

7
Modifying the Database
  • Three kinds of modifications
  • Insertions
  • Deletions
  • Updates



  • Sometimes they are all called “updates”
8
Insertions
9
Insertions
10
Insertion: an Example
11
Insertion: an Example
12
Insertion: an Example
13
Deletions
14
Updates
15
Data Definition in SQL
16
Data Types in SQL
  • Characters:
    • CHAR(20) -- fixed length
    • VARCHAR(40) -- variable length
  • Numbers:
    • INT, REAL plus variations
  • Times and dates:
    • DATE, DATETIME (SQL Server only)
  • To reuse domains:
    CREATE DOMAIN  address  AS  VARCHAR(55)


17
Creating Tables
18
Deleting  or Modifying a Table
19
Default Values
20
Indexes
21
Indexes
  • Create an index on name:







  • B+ trees have fan-out of 100s: max 4 levels !
22
Creating Indexes
23
Creating Indexes
24
Creating Indexes
25
Agenda
  • Nulls and outerjoins
  • Creating and updating schemas
  • Views: updating and reusing them
  • Constraints
  • Programming with SQL
  • Relational algebra
26
Defining Views
27
A Different View
28
Using a View
29
What Happens When We Query a View ?
  •  SELECT   name, Seattle-view.store
  •  FROM      Seattle-view, Product
  •  WHERE   Seattle-view.product = Product.name  AND
  •                   Product.category = “shoes”
30
Types of Views
  • Virtual views:
    • Used in databases
    • Computed only on-demand – slower at runtime
    • Always up to date
  • Materialized views
    • Used in data warehouses
    • Precomputed offline – faster at runtime
    • May have stale data
31
Updating Views
32
Non-Updatable Views
33
Answering Queries Using Views
  • What if we want to use a set of views to answer a query.
  • Why?
    • The obvious reason…
    • Answering queries over web data sources.
  • Very cool stuff! (i.e., I did a lot of research on this).
34
Reusing a Materialized View
  • Suppose I have only the result of SeattleView:
  •          SELECT  buyer, seller, product, store
  •          FROM     Person, Purchase
  •          WHERE   Person.city = ‘Seattle’    AND
  •                           Person.per-name = Purchase.buyer
  • and I want to answer the query
  •          SELECT  buyer, seller
  •          FROM     Person, Purchase
  •          WHERE   Person.city = ‘Seattle’    AND
  •                           Person.per-name = Purchase.buyer AND
  •                           Purchase.product=‘gizmo’.
  • Then, I can rewrite the query using the view.
35
Query Rewriting Using Views
  • Rewritten query:
  •          SELECT  buyer, seller
  •          FROM      SeattleView
  •          WHERE   product= ‘gizmo’


  • Original query:
  •          SELECT  buyer, seller
  •          FROM     Person, Purchase
  •          WHERE   Person.city = ‘Seattle’    AND
  •                           Person.per-name = Purchase.buyer AND
  •                           Purchase.product=‘gizmo’.
36
Another Example
  • I still have only the result of SeattleView:
  •          SELECT  buyer, seller, product, store
  •          FROM     Person, Purchase
  •          WHERE   Person.city = ‘Seattle’    AND
  •                           Person.per-name = Purchase.buyer
  • but I want to answer the query
  •          SELECT  buyer, seller
  •          FROM     Person, Purchase
  •          WHERE   Person.city = ‘Seattle’    AND
  •                           Person.per-name = Purchase.buyer AND
  •                           Person.Phone  LIKE ‘206 543 %’.
37
And Now?
  • I still have only the result of SeattleView:
  •          SELECT  buyer, seller, product, store
  •          FROM     Person, Purchase, Product
  •          WHERE   Person.city = ‘Seattle’    AND
  •                           Person.per-name = Purchase.buyer AND
  •                           Purchase.product = Product.name
  • but I want to answer the query
  •          SELECT  buyer, seller
  •          FROM     Person, Purchase
  •          WHERE   Person.city = ‘Seattle’    AND
  •                           Person.per-name = Purchase.buyer.
38
And Now?
  • I still have only the result of:
  •          SELECT  seller, buyer, Sum(Price)
  •          FROM     Purchase
  •          WHERE   Purchase.store = ‘The Bon’
  •          Group By seller, buyer
  • but I want to answer the query
  •          SELECT  seller, Sum(Price)
  •          FROM     Purchase
  •          WHERE   Person.store = ‘The Bon’
  •          Group By seller


  • And what if it’s the other way around?
39
Finally…
  • I still have only the result of:
  •          SELECT  seller, buyer, Count(*)
  •          FROM     Purchase
  •          WHERE   Purchase.store = ‘The Bon’
  •          Group By seller, buyer
  • but I want to answer the query
  •          SELECT  seller, Count(*)
  •          FROM     Purchase
  •          WHERE   Person.store = ‘The Bon’
  •          Group By seller
40
The General Problem
  • Given a set of views V1,…,Vn, and a query Q, can we answer Q using only the answers to V1,…,Vn?
  • Why do we care?
    • We can answer queries more efficiently.
    • We can query data sources on the WWW in a principled manner.
  • Many, many papers on this problem.
  • The best performing algorithm: The MiniCon Algorithm, (Pottinger & (Ha)Levy, 2000).
41
Querying the WWW
  • Assume a virtual schema of the WWW, e.g.,
    • Course(number, university, title, prof, quarter)
  • Every data source on the web contains the answer to a view over the virtual schema:
  • UW database: SELECT  number, title, prof
  •                        FROM     Course
  •                        WHERE   univ=‘UW’ AND quarter=‘2/02’
  • Stanford database: SELECT  number, title, prof, quarter
  •                               FROM     Course
  •                               WHERE   univ=‘Stanford’
  • User query: find all professors who teach “database systems”
42
Agenda
  • Nulls and outer-joins
  • Creating and updating schemas
  • Views: updating and reusing them
  • Constraints
  • Programming with SQL
  • Relational algebra
43
Constraints in SQL
  • A constraint = a property that we’d like our database to hold
  • The system will enforce the constraint by taking some actions:
    • forbid an update
    • or perform compensating updates
44
Constraints in SQL
  • Constraints in SQL:
  • Keys, foreign keys
  • Attribute-level constraints
  • Tuple-level constraints
  • Global constraints: assertions


  • The more complex the constraint, the harder it is to check and to enforce
45
Keys
  • OR:
46
Keys with Multiple Attributes
  • CREATE TABLE Product (
  • name CHAR(30),
  • category VARCHAR(20),
  • price INT,
  •     PRIMARY KEY (name, category))
47
Other Keys
  • CREATE TABLE Product (
  •   productID  CHAR(10),
  • name CHAR(30),
  • category VARCHAR(20),
  • price INT,
        PRIMARY KEY (productID),
  •     UNIQUE (name, category))
48
Foreign Key Constraints
  • CREATE TABLE Purchase (
  • prodName CHAR(30)
  • REFERENCES Product(name),
  •     date DATETIME)
49
 
50
Foreign Key Constraints
  • OR






  • (name, category) must be a PRIMARY KEY
51
What happens during updates ?
  • Types of updates:
  • In Purchase: insert/update
  • In Product: delete/update
52
What happens during updates ?
  • SQL has three policies for maintaining referential integrity:
  • Reject violating modifications (default)
  • Cascade: after a delete/update do a delete/update
  • Set-null set foreign-key field to NULL


  • READING ASSIGNEMNT: 7.1.5, 7.1.6
53
Constraints on Attributes and Tuples
  • Constraints on attributes:
    NOT NULL -- obvious meaning...
    CHECK condition -- any condition !
  • Constraints on tuples
    CHECK condition
54
 
55
General Assertions
56
Final Comments on Constraints
  • Can give them names, and alter later
    • Read in the book.
  • We need to understand exactly when they are checked
  • We need to understand exactly what actions are taken if they fail
57
Triggers
58
Elements of Triggers (in SQL3)
59
Example: Row Level Trigger
60
Statement Level Trigger
61
Bad Things Can Happen
62
Agenda
  • Nulls and outerjoins
  • Creating and updating schemas
  • Views: updating and reusing them
  • Constraints
  • Programming with SQL
  • Relational algebra
63
Embedded SQL
  • direct SQL (= ad-hoc SQL) is rarely used
  • in practice: SQL is embedded in some application code
  • SQL code is identified by special syntax
64
Impedance Mismatch
  • Example: SQL in C:
    • C uses int, char[..], pointers, etc
    • SQL uses tables
  • Impedance mismatch = incompatible types
65
The Impedance Mismatch Problem
  • Why not use only one language?


  • Forgetting SQL: “we can quickly dispense with this idea”   [textbook, pg. 351].


  • SQL cannot do everything that the host language can do.


  • Solution: use cursors
66
Programs with Embedded SQL
67
Interface: SQL / Host Language
68
Example
  • Product (pname,  price, quantity, maker)
  • Purchase (buyer,  seller,  store,  pname)
  • Company (cname, city)
  • Person(name, phone, city)


69
Using Shared Variables
70
Single-Row Select Statements
71
Cursors
  • Declare the cursor
  • Open the cursor
  • Fetch tuples one by one
  • Close the cursor
72
Cursors
  • void product2XML() {
  •  EXEC SQL BEGIN DECLARE SECTION;
  •               char n[20], c[30];
  •               int p, q;
  •               char    SQLSTATE[6];
  •  EXEC SQL END DECLARE SECTION;
  • EXEC SQL DECLARE crs CURSOR FOR
  •   SELECT pname, price, quantity, maker
  • FROM Product;


  •  EXEC SQL OPEN crs;
73
Cursors
    • printf(“<allProducts>\n”);
    • while (1) {
    •      EXEC SQL FETCH FROM crs INTO :n, :p, :q, :c;
    •      if (NO_MORE_TUPLES) break;
    •      printf(“    <product>\n”);
    •      printf(“           <name>      %s    </name>\n”,     n);
    •      printf(“           <price>       %d   </price>\n”,      p);
    •      printf(“           <quantity>  %d   </quantity>\n”, q);
    •      printf(“           <maker>     %s   </maker>\n”,     c);
    •      printf(“    </product>\n”);
    • }
    • EXECT SQL CLOSE crs;
    • printf(“</allProducts>\n”);
  • }
74
"What is NO_MORE_TUPLES ?"
  • What is NO_MORE_TUPLES ?


  • #define NO_MORE_TUPLES !(strcmp(SQLSTATE,”02000”))
75
More on Cursors
76
Agenda
  • Nulls and outerjoins
  • Creating and updating schemas
  • Views: updating and reusing them
  • Constraints
  • Programming with SQL
  • Relational algebra
77
Relational Algebra
  • Formalism for creating new relations from existing ones
  • Its place in the big picture:
78
Relational Algebra
  • Five operators:
    • Union: È
    • Difference: -
    • Selection: s
    • Projection: P
    • Cartesian Product: ´
  • Derived or auxiliary operators:
    • Intersection, complement
    • Joins (natural,equi-join, theta join, semi-join)
    • Renaming: r
79
1. Union and 2. Difference
  • R1 È R2
  • Example:
    • ActiveEmployees È RetiredEmployees


  • R1 – R2
  • Example:
    • AllEmployees -- RetiredEmployees
80
What about Intersection ?
  • It is a derived operator
  • R1 Ç R2 = R1 – (R1 – R2)
  • Also expressed as a join (will see later)
  • Example
    • UnionizedEmployees Ç RetiredEmployees
81
3. Selection
  • Returns all tuples which satisfy a condition
  • Notation:  sc(R)
  • Examples
    •  sSalary > 40000 (Employee)
    •  sname = “Smithh” (Employee)
  • The condition c can be =, <, £, >, ³, <>
82
 
83
4. Projection
  • Eliminates columns, then removes duplicates
  • Notation:   P A1,…,An (R)
  • Example: project social-security number and names:
    •   P SSN, Name (Employee)
    • Output schema:   Answer(SSN, Name)



84
 
85
5. Cartesian Product
  • Each tuple in R1 with each tuple in R2
  • Notation: R1 ´ R2
  • Example:
    • Employee ´ Dependents
  • Very rare in practice; mainly used to express joins
86
 
87
Relational Algebra
  • Five operators:
    • Union: È
    • Difference: -
    • Selection: s
    • Projection: P
    • Cartesian Product: ´
  • Derived or auxiliary operators:
    • Intersection, complement
    • Joins (natural,equi-join, theta join, semi-join)
    • Renaming: r
88
Renaming
  • Changes the schema, not the instance
  • Notation: r B1,…,Bn (R)
  • Example:
    •  rLastName, SocSocNo (Employee)
    • Output schema:
      Answer(LastName, SocSocNo)
89
Renaming Example
90
Natural Join
  • Notation: R1 ⋈ R2


  • Meaning:  R1 ⋈ R2 = PA(sC(R1 ´ R2))


  • Where:
    • The selection sC checks equality of all common attributes
    • The projection eliminates the duplicate common attributes
91
 
92
Natural Join
  • R=                                S=




  • R ⋈ S=


93
Natural Join
  • Given the schemas R(A, B, C, D), S(A, C, E), what is the schema of R ⋈ S ?


  • Given R(A, B, C),  S(D, E), what is R ⋈ S  ?


  • Given R(A, B),  S(A, B),  what is  R ⋈ S  ?
94
Theta Join
  • A join that involves a predicate
  • R1 ⋈ q R2   =  s q (R1 ´ R2)
  • Here q can be any condition
95
Eq-join
  • A theta join where q is an equality
  • R1 ⋈A=B R2   =  s A=B (R1 ´ R2)
  • Example:
    • Employee ⋈SSN=SSN Dependents

  • Most useful join in practice
96
Semijoin
  • R ⋉ S  = P A1,…,An (R ⋈ S)
  • Where A1, …, An are the attributes in R
  • Example:
    • Employee ⋉ Dependents

97
Semijoins in Distributed Databases
  • Semijoins are used in distributed databases
98
Complex RA Expressions

  •      Person         Purchase          Person          Product
99
Operations on Bags
  • A bag = a set with repeated elements
  • All operations need to be defined carefully on bags
  • {a,b,b,c}È{a,b,b,b,e,f,f}={a,a,b,b,b,b,b,c,e,f,f}
  • {a,b,b,b,c,c} – {b,c,c,c,d} = {a,b,b,d}
  •  sC(R): preserve the number of occurrences
  •  PA(R): no duplicate elimination
  • Cartesian product, join: no duplicate elimination
  • Important ! Relational Engines work on bags, not sets !
100
Finally: RA has Limitations !
  • Cannot compute “transitive closure”











  • Find all direct and indirect relatives of Fred
  • Cannot express in RA !!!  Need to write C program