TCLUG Archive
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [TCLUG:19152] LOAD DATA INFILE



On Sat, 24 Jun 2000, George M Swan III wrote:

> > database or a wide, un-normalized database.
> Yes I suppose I should have, 
>  ...but I didna know no better...... yet. :-)
> 
> Seriously, chewie, what's 'normalized' and 'un-normalized mean'?
> (I'm headed for the 6C to B&N, so I'll pick up on this later.)

A good quick book to look at for this is the O'Reilly MySQL & mSQL
book, chapter 2. Sadly, the rest of the book is worthless. If you are
going to use MySQL and need a good reference book (but nothing on
normalization), by the DuBois book MySQL. It is excellent.

Now, regarding normalization -- this is a fancy database way of saying
that you don't repeat your data. To use a rough metaphore, it's like a
function in one of your programs. Basically, you split off information
which is not unique to a single record into its own table, then refer to
it in the original table by a reference (in MySQL, you must use a number.
Other more advanced databases have what's known as "foriegn keys" to help
you out.) 

For example, let's say you were making a database containing recipies. One
way to do it would be roughly this (this is MySQL code since it's all I
know right now):

create table recipies (
	recipie_id	INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
	name		VARCHAR(255) NOT NULL,
	ingredients	TEXT NOT NULL,
	preparation	TEXT NOT NULL
);

This will work fine. But you're likely to repeat "flour" a lot, aren't
you? And at least some of the time, you're probably going to spell it
wrong, or forget it or something. That's lame. So, if we sperate out the
ingredients...

create table ingredient (
	ingredient_id	INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
	ingredient_name	VARCHAR(255)
);

...then change the main table...

create table recipies (
	recipie_id	INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
	name		VARCHAR(255),
	preparation	TEXT NOT NULL
);

...we can then create a new table which contains both the recipie_id AND
the ingredient_id to list all the ingredients which are in a single
recipie:

create table ingredient_in_recipie )
	id	INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
	ingredient_id	INT NOT NULL,
	recipie_id	INT NOT NULL,

	UNIQUE(ingredient_id, recipie_id)
);

Now, I can get a list of every ingredient in a recipie -- but I can also
get a list of every recipie which has an ingredient in it! That's the
power of normalization: you get more flexiblity with your data. It may
seem like overkill in this example, but it could be done.

Hope this answered some questions instead of horribly confusing you...

Luke