The purpose of this document is to serve as a basic overview of database normalization. This example uses a hazarous waste disposal company as a client.
The example company has customers, who have waste generator locations, who produce waste streams.
Database Normalization:
Normalization of a database is to separate redundant information into separate related tables. so instead of storing:
Waste stream 1: John’s Exxon 123 Main, San Pedro, CA 99999, 310 555-1212, used oil
Waste stream 2: John’s Exxon 123 Main, San Pedro, CA 99999, 310 555-1212, antifreeze
Waste stream 3: John’s Exxon 123 Main, San Pedro, CA 99999, 310 555-1212, old tires
We store it as:
John’s Exxon 123 Main, San Pedro, 310 555-1212
Used oil
Antifreeze
Old tires
That way, we eliminate the redundant storage of the generator address and reduce the possibility of bad data entry. We could further normalize this in many ways, but you get the idea.
Customer Profiles:
Standard admin contact information name address, phone
Technical contact name address phone
Waste Generators:
Attached to customer by id number, so that customer #123 can have multiple locations. Technical contact for each location, but “same as” technical contact is accepted.
Waste Streams:
Attached to the waste generators by id number, so that customer #123’s locations can have x number of waste streams. Waste stream types could be held in a related table so that:
Waste stream type:
Attached to the waste streams by id so that for the billion or so gas stations, mechanics, auto parts stores etc., that recycle used oil, we don’t end up with data that looks like: OIL, Oil, Used Oil, Oil – Used, Oil-Dirty, Oil-dirty, etc.
By controlling the data entry with unique lists, we assure valid and complete search results.
Example of normalization with data:
Customer table:
id | Company | first | last | Address | phone |
1 | Johns Exxon | John | Doe | 123 Main | 310 555-2223 |
2 | Mikes Mobil | Mike | Marker | 456 6th St | 310 555-3453 |
(Two gas stations)
Generator: table
id | Customer.id | techfirst | techlast | Address | phone |
1 | 1 | Tim | Technical | 123 Main | 310 555-2233 |
2 | 1 | Tom | Technical | 321Hill St | 310 555-4455 |
3 | 2 | Ted | Technical | 456 6th St | 310 555-3444 |
(There are two locations for John’s Exxon and they have different technical contacts, one location for Mikes Mobil)
Waste stream type table:
Id | type |
1 | Oil |
2 | Antifreeze |
3 | Old tires |
By storing information this way, we can represent a lot of information in a small space.
Example of two waste stream records:
id | Customer.id | Generator.id | Waste_type.id |
1 | 1 | 2 | 1 |
2 | 2 | 1 | 1 |
What the above two lines really mean is:
Waste Steam record 1 | Waste stream record 2 |
Company: John’s Exxon
First name: John Last name: Doe Address: 123 Main Phone: 310 555-1212 Technical contact first name: Tim Technical contact last name: Technical Generator address: 123 Main Technical contact phone: 310 555-2233 Waste stream type: oil |
Company: Mike’s Mobil
First name: Mike Last name: Marker Address: 456 6th Phone: 310 555-3453 Technical contact first name: Ted Technical contact last name: Technical Generator address: 123 Main Technical contact phone: 310 555-3444 Waste stream type: oil |
Obviously this is wildly over-simplified, but the concept is there.