CodeGuru Home VC++ / MFC / C++ .NET / C# Visual Basic VB Forums Developer.com
Results 1 to 4 of 4
  1. #1
    Join Date
    Dec 2003
    Posts
    244

    Database design confusion

    I am creating a database that's going to have millions of records. I would like to get answers for following questions:

    1. Is there any dis-advantage of normalized database (like performance de-gradation because of foreign key etc)?
    2. Is there any disadvantage of having varchar field of size 256 and in most of the case I will utilize only 10 bytes ?
    3. What should I take care of in order to writing highly optimized queries.
    4. Is there any disadvantage of having extra columns(reserved), will it increase size on disk even if there is no data?
    5. should I avoid join queries ?
    6. any other thing which I should take care of ?

    -Thanks,
    navin

  2. #2
    Join Date
    Apr 2002
    Location
    Egypt
    Posts
    2,210

    Post Re: Database design confusion

    Quote Originally Posted by navinkaus View Post
    1. Is there any dis-advantage of normalized database (like performance de-gradation because of foreign key etc)?
    Yes, sometimes denormalization can make huge benefits. Like avoiding to join with a huge table.
    Quote Originally Posted by navinkaus View Post
    2. Is there any disadvantage of having varchar field of size 256 and in most of the case I will utilize only 10 bytes ?
    SQL server stores the actual occupied size with each varchar field. And it's 2 or 4 bytes. So this can be a waste. Also there is a slight performance hit. But you'll not lose the whole unused 246 bytes.
    Quote Originally Posted by navinkaus View Post
    3. What should I take care of in order to writing highly optimized queries.
    Lots of things:
    • Use proper indexes
    • Avoid unwanted casts
    • Select only what you need
    • Make sure db statistics are updated
    • avoid per-record function calls or sub-queries if possible
    • consider indexed views when appropriate

    Quote Originally Posted by navinkaus View Post
    4. Is there any disadvantage of having extra columns(reserved), will it increase size on disk even if there is no data?
    It will make a small disk utilization. a bit flag for each nullable field is stored to indicate if it's null or not. I don't prefer this design anyway regardless of space issue.
    Quote Originally Posted by navinkaus View Post
    5. should I avoid join queries ?
    Can you avoid them totally?
    Quote Originally Posted by navinkaus View Post
    6. any other thing which I should take care of ?
    Test on real life data sizes and consider archiving old data.

  3. #3
    Join Date
    Dec 2003
    Posts
    244

    Re: Database design confusion

    Thanks for your reply, further questions to your answers:

    1. How to decide what to normalize what not as initially you don't know much table will grow ?
    2. As I understand in case of varchar If I give size 256 or 2032 it will still take same size on disk as it keeps size in 2 or 4 bytes, only difference is during input/output operation performance might hit as server will have to keep 2032 buffer, right ?
    3. what are db statistics ?
    4. Sub query should be avoided....means should I use join oftenly ?

  4. #4
    Join Date
    Apr 2002
    Location
    Egypt
    Posts
    2,210

    Re: Database design confusion

    Quote Originally Posted by navinkaus View Post
    1. How to decide what to normalize what not as initially you don't know much table will grow ?
    First, design the db to the 3rd normal form as usual. then start denormalizing. If you don't know what tables will grow then this is a problem. What you can do is to monitor the performance regularly and decide to refactor later.
    Quote Originally Posted by navinkaus View Post
    2. As I understand in case of varchar If I give size 256 or 2032 it will still take same size on disk as it keeps size in 2 or 4 bytes, only difference is during input/output operation performance might hit as server will have to keep 2032 buffer, right ?
    I don't think that the buffer allocated is always the maximum size of the field. Logically it should decide the buffer size based on the size indicator. But of course this is only my assumption.
    Quote Originally Posted by navinkaus View Post
    3. what are db statistics ?
    SQL server keeps statistics about data to decide the best execution plan. For example, what is the distribution of values in an index? should the search make index seek or scan? check: SQL Server UPDATE STATISTICS
    Quote Originally Posted by navinkaus View Post
    4. Sub query should be avoided....means should I use join oftenly ?
    What I meant actually is correlated subquery. Where the query is executed per each row. This should be avoided if possible.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  





Click Here to Expand Forum to Full Width

Featured