|
1
|
|
|
2
|
- Project: questions?
- How to model data (E/R modeling)
- How to design a good schema (normalization).
|
|
3
|
- Requirements modeling (conceptual, pictures)
- Decide what entities should be part of the application and how they
should be linked.
- Schema design and implementation
- Decide on a set of tables, attributes.
- Define the tables in the database system.
- Populate database (insert tuples).
- Write application programs using the DBMS
- way easier now that the data management is taken care of.
|
|
4
|
- Why do we need it?
- Agree on structure of the
database before deciding on a particular implementation.
- Consider issues such as:
- What entities to model
- How entities are related
- What constraints exist in the domain
- How to achieve good designs
|
|
5
|
- 1. Object Definition Language (ODL):
- Closer in spirit to
object-oriented models
- I don’t teach it anymore.
- 2. Entity/Relationship model (E/R):
- More relational in nature.
- Both can be translated (semi-automatically) to relational schemas
- ODL to OO-schema: direct transformation (C++ or Smalltalk based system).
|
|
6
|
|
|
7
|
- Every entity set must have a key
|
|
8
|
|
|
9
|
- A mathematical definition:
- if A, B are sets, then a relation R is a subset of A x B
- A={1,2,3}, B={a,b,c,d},
- R = {(1,a), (1,c), (3,b)}
- - makes is a subset of Product x Company:
|
|
10
|
- one-one:
- many-one
- many-many
|
|
11
|
|
|
12
|
|
|
13
|
- Q: what does the arrow mean ?
- A: if I know the store, person, invoice, I know the movie too
|
|
14
|
- Q: what do these arrow mean ?
- A: store, person, invoice determines movie and store, invoice, movie determines
person
|
|
15
|
- Q: how do I say: “invoice determines store” ?
- A: no good way; best approximation:
- Q: Why is this incomplete ?
|
|
16
|
|
|
17
|
|
|
18
|
|
|
19
|
- Entity set à
relation
- Relationship à
relation
|
|
20
|
|
|
21
|
|
|
22
|
|
|
23
|
|
|
24
|
|
|
25
|
|
|
26
|
|
|
27
|
|
|
28
|
|
|
29
|
- Think in terms of records:
- Product
- SoftwareProduct
- EducationalProduct
|
|
30
|
|
|
31
|
|
|
32
|
- Say: each piece of furniture is owned either by a person, or by a
company
- Solution 1. Acceptable, imperfect (What’s wrong ?)
- Company
|
|
33
|
- Solution 2: better, more laborious
- Company
|
|
34
|
|
|
35
|
|
|
36
|
|
|
37
|
|
|
38
|
|
|
39
|
|
|
40
|
|
|
41
|
|
|
42
|
|
|
43
|
- Definition: A1, ...,
Am à
B1, ..., Bn holds in R if:
- "t, t’ Î R, (t.A1=t’.A1
Ù ...
Ù t.Am=t’.Am
Þ t.B1=t’.B1
Ù ...
Ù t.Bm=t’.Bm
)
|
|
44
|
- Functional dependencies are part of the schema!
- They constrain the possible legal data instances.
- At any point in time, the actual database may satisfy additional FD’s.
|
|
45
|
- EmpID Name, Phone,
Position
- Position Phone
- but Phone Position
|
|
46
|
- A key is a set of attributes A1, ..., An s.t. for
any other attribute B, A1, ..., An à B
- A minimal key is a set of attributes which is a key and for which no
subset is a key
- Note: book calls them superkey and key
|
|
47
|
- Product(name, price, category, color)
- name, category à
price
- category à
color
- Keys are: {name, category} and
all supersets
- Enrollment(student, address, course, room, time)
- student à
address
- room, time à
course
- student, course à room, time
- Keys are: [in class]
|
|
48
|
|
|
49
|
|
|
50
|
|
|
51
|
|
|
52
|
- More rules in the book – please read !
|
|
53
|
- Main idea:
- Start with some relational schema
- Find out its FD’s
- Important also to look at inferred FD’s.
- Use them to design a better relational schema
|
|
54
|
|
|
55
|
|
|
56
|
|
|
57
|
|
|
58
|
- Enrollment(student, major, course, room, time)
- student à
major
- major, course à
room
- course à time
- What else can we infer ? [in class]
|
|
59
|
|
|
60
|
|
|
61
|
|
|
62
|
- Show the following by induction:
- Initially X = {A1, …, An} --
holds
- Induction step: B1, …, Bm in X
- Implies A1, …, An B1, …, Bm
- We also have B1, …, Bm C
- By transitivity we have A1, …, An
C
- This shows that the algorithm is sound; need to show it is complete
|
|
63
|
- Main idea:
- Start with some relational schema
- Find out its FD’s
- Use them to design a better relational schema
|
|
64
|
|
|
65
|
|
|
66
|
|
|
67
|
|
|
68
|
- Sometimes it is incorrect:
|
|
69
|
|
|
70
|
- First Normal Form = all attributes are atomic
- Second Normal Form (2NF) = old and obsolete
- Third Normal Form (3NF) = this lecture
- Boyce Codd Normal Form (BCNF) = this lecture
- Others...
|
|
71
|
|
|
72
|
|
|
73
|
|
|
74
|
|
|
75
|
|
|
76
|
- R(A,B,C,D) A B, B C
- Key: A, D
- Violations of BCNF: A B,
A C, A
BC
- Pick A BC:
split into R1(A,BC) R2(A,D)
- What happens if we pick A B first ?
|
|
77
|
- A decomposition is lossless if we
can recover:
- R(A,B,C)
- R1(A,B) R2(A,C)
- R’(A,B,C) should be the same as
R(A,B,C)
|
|
78
|
- Given R(A,B,C) s.t. AàB, the decomposition into R1(A,B), R2(A,C) is
lossless
|
|
79
|
|
|
80
|
|
|
81
|
|