Take a look at this Student Class Table
Q: is this a good table design?

- There are multiple duplicates
Redundancies in StudentClass Table
- The same information is included multiple times
- Redundancy leads to potential "anomalies" down the road
- Update anomaly: Information may be updated partially and inconsistently
- Q: What if a student changes the address?
- Insertion anomaly: We may not include some information at all
- What if a student does not take any clas?
- Deletion anomaly: While deleting some information, we may delete others
- What if the only class that a student takes get cancelled?
Coming up with Better Tables
Q: Any way to arrive at the better design more systematically?
Q: Where is the redundancy from?
Can you guess the ? part of the table?

sid → name, addr
dept, cnum → title, unit
Intuition behind Normalization Theory
- Functional Dependency (FD)
- Some attributes are "determined" by other attributes:
- e.g. sid → (name, addr), (dept, cnum) → title, unit
- When there is a functional dependency we may have redundancy
- e.g. (105, Elaine, 84 East) is stored redundantly, so is (CS, 143, database, 04)
"Decomposing" StudentClass Table
- StudentClass(sid, name, addr, dept, cnum, title, unit)
- FD: sid → (name, addr), (dept, cnum) → title, unit
- A(sid, name, addr)
- C(dept, cnum, title, unit)
- D(sid, dept, cnum)
- Basic idea of "normalization"
- Whenever there is FD, the table may be "bad" due to redundancy
- We uses FDs to split (or decompose) table and remove the redundanct
Functional Dependency (FD)
If 2 tuples have the same X values, their Y values are the same