Click to See Complete Forum and Search --> : Database Experts Opinions please.


ketchupaholic
August 26th, 2009, 06:45 PM
Okay, at work I'm having a big disagreement with the project management team who has no database experience at all. Here is the issue.

We have 10 product lines, in each product line there are certain number of attribute names, such as height, weight, color, price and so on. We have a total of 670 attribute names for all 10 product lines. The project management team is wanting to create a master file in excel that contains all 670 attribute names, regardless if the attribute name pertains to the product line or not. So let's say 'Widget A' has only 70 attribute names, they want to still add the other 600 attribute names and enter a dash for the value (You can only import if the value has a character in the field). We have to import each product line one-by-one since that is what the software requires.

To me, this is not the proper way to build a database since we are creating a database that will have hundreds of thousands of null characters. They say this is normalizing the databse, and I told them it is he complete opposite of normalizing a database.

I look forward to your opinions as well.

Alsvha
August 27th, 2009, 03:04 AM
The "text-book" way of doing it would be to have a table (or spread sheet) of the products, have a table for the attributes, and then have a connection table between the two.
So you'll have something like a Product - AttributeOnProduct - Attribute. If you can follow my drift. It is one of the normalization rules for database design. I can't remember which.

The advantage of doing it the above way is that it is much more extensible. Get a new attribute,or remove attributes and you don't need to change the model.
It will also require less space and make filtering easier in data extraction.

However this is not automatically the only way, or the best way, of doing things in all situations.
A lot of the time with design, especially database design, you have to weigh performance into the equation, performance with maintaining the data and extracting the data, and counter it with storage restrictions and so on. It is a matter of being pragmatic at times and "ignore" the text-book examples.

Speaking specifically in your case, I see no reason to not use the Product, Attribute, AttributeOnProduct type approach. The amount of data - and especial the amount of attributes - will make the other model useless.

ketchupaholic
August 27th, 2009, 06:47 AM
Thank you Alsvha, and I agree. But unfortunately the problem lies in how to get the data into the data base. The only option we have is to import using Excel files, this is the requirement by the manufacturer of the software we are using. That is the big drawback, and the project management's suggestion is going to create a lot of unnecessary work and database size.

Alsvha
August 27th, 2009, 06:53 AM
You can mirror the structure in excel sheets as well.
One sheet/file is products.
One sheet/file is attributes
and one sheet/file is the products on attributes.

Should make it easy enough to import into a database from excel files as well.