  1. #1
    Join Date
    Oct 2007
    Location
    India
    Posts
    30

    Search a binary file of million records

    My company is about to start a new project whose back end is supposed to be a file system rather than a database server. The file will contain millions of records.

    Since a flat file has to be searched sequentially, retrieving a specific record might take a long time.
    Some of my colleagues told me that indexing would make the search much faster, but so far I don't have much of an idea how to implement it.

    I would appreciate your suggestions and solutions on this.

    Thank you.

  2. #2
    Join Date
    Sep 2004
    Location
    Holland (land of the dope)
    Posts
    4,123

    Re: Search a binary file of million records

    ...supposed to be a File system instead of a database server
    These are two different things. Don't tell me that you want to use the file system as a database system, because that would be a horrible solution.

  3. #3
    Join Date
    Oct 2007
    Location
    India
    Posts
    30

    Re: Search a binary file of million records

    But a lot of software uses the file system as its back end. I heard that PeachTree, a US accounting package, uses the file system as its back end, and it is a big hit in the US.

  4. #4
    Join Date
    Sep 2004
    Location
    Holland (land of the dope)
    Posts
    4,123

    Re: Search a binary file of million records

    That doesn't mean that it is a good solution. The file system is slow compared to a database system with a good chunk of memory.

  5. #5
    Join Date
    Oct 2002
    Location
    Singapore
    Posts
    3,128

    Re: Search a binary file of million records

    Whether or not we use a database, everything ultimately still has to be saved to the file system to keep the data permanently. However, I would advise against reinventing the wheel: databases have already implemented most of what makes querying fast, including indexing.
    quoted from C++ Coding Standards:

    KISS (Keep It Simple Software):
    Correct is better than fast. Simple is better than complex. Clear is better than cute. Safe is better than insecure.

    Avoid magic number:
    Programming isn't magic, so don't incant it.

  6. #6
    Join Date
    Oct 2007
    Location
    India
    Posts
    30

    Re: Search a binary file of million records

    What you said is absolutely correct. There are databases that make querying faster and that include indexing.

    But what matters is that the project I mentioned earlier is meant for commercial use.
    It should also be highly secure. Another reason is that we can't install a database for each and every customer because of the cost.
    That is why the company decided on a file system.

    If you can give me some tips on indexing in a file system, it would be a great help to me.

  7. #7
    Join Date
    Oct 2002
    Location
    Singapore
    Posts
    3,128

    Re: Search a binary file of million records

    In case you are not aware of it, you can look at the open source MySQL. The GPL license may meet your needs.

    If you still want to venture into creating your own database, you can download the source code to understand how a database works, including the indexing system.

  8. #8
    Join Date
    Oct 2007
    Location
    India
    Posts
    30

    Re: Search a binary file of million records

    I still stand by the file system. As you said, I would like to create my own database, and for that I need your help once more.

    Kindly give me the URL from which I can download the source code.

    Thank you. :-)

  9. #9
    Join Date
    Oct 2002
    Location
    Singapore
    Posts
    3,128

    Re: Search a binary file of million records

    You can google search for it.
    http://www.mysql.com/

  10. #10
    Join Date
    Oct 2007
    Location
    India
    Posts
    30

    Re: Search a binary file of million records

    Thanks a lot for the info :-)

  11. #11
    Join Date
    Oct 2007
    Location
    India
    Posts
    30

    Re: Search a binary file of million records

    I could not find any source code download link on that site. All that is available there is the MySQL Server 6.0 database server.

    Do you know specifically where I can find the link to download the source code?

  12. #12
    Join Date
    Sep 2004
    Location
    Holland (land of the dope)
    Posts
    4,123

    Re: Search a binary file of million records

    Source code to what? Install a MySQL driver for ODBC and you can connect to it.

  13. #13
    Join Date
    Oct 2007
    Location
    India
    Posts
    30

    Re: Search a binary file of million records

    Earlier, Kheun told me that to create a database of my own, I should download the MySQL source code and go through it to see how it works.

    That is why I asked for the link from which I could download the MySQL source code. :-)

    Thanks.

  14. #14
    Join Date
    Jun 2002
    Location
    Stockholm, Sweden
    Posts
    1,641

    Re: Search a binary file of million records

    To implement indexing, a common approach is to create a hash table on disk keyed by keyword.
    Code:
    hash_table_position = hash_function(keyword) % hash_table_size
    Each hash table entry contains a pointer (a file offset) to a position in another file containing the actual words.

    You start looking at hash_table_position and move forward one entry at a time, wrapping around at the end of the table, until you encounter a pointer to your keyword or NULL. If you find NULL, the keyword does not exist (and you may insert it at that position if you like).
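    The lookup described above can be sketched roughly as follows. This is only an illustration under assumptions that are not in the post: the index file is a flat array of 8-byte slots, 0 stands for NULL, stored offsets are shifted by one so that offset 0 stays usable, keywords are kept length-prefixed in a second file, and zlib.crc32 stands in for hash_function. The names create_index and find_or_insert are hypothetical.

```python
import os
import struct
import zlib

SLOT = struct.Struct("<Q")   # assumed slot layout: 8-byte offset into the words file, stored +1
EMPTY = 0                    # 0 plays the role of NULL


def create_index(index_path, table_size):
    """Create an index file made of table_size empty slots."""
    with open(index_path, "wb") as f:
        f.write(SLOT.pack(EMPTY) * table_size)


def _read_word(words, offset):
    """Read the length-prefixed keyword stored at offset in the words file."""
    words.seek(offset)
    (length,) = struct.unpack("<H", words.read(2))
    return words.read(length)


def find_or_insert(index_path, words_path, keyword, table_size, insert=False):
    """Return the keyword's offset in the words file, or None if it is absent."""
    key = keyword.encode("utf-8")
    pos = zlib.crc32(key) % table_size        # hash_function(keyword) % hash_table_size
    with open(index_path, "r+b") as index, open(words_path, "a+b") as words:
        for _ in range(table_size):           # at most one full pass over the table
            index.seek(pos * SLOT.size)
            (stored,) = SLOT.unpack(index.read(SLOT.size))
            if stored == EMPTY:               # NULL: the keyword is not in the table
                if not insert:
                    return None
                words.seek(0, os.SEEK_END)
                offset = words.tell()
                words.write(struct.pack("<H", len(key)) + key)   # append the word
                index.seek(pos * SLOT.size)
                index.write(SLOT.pack(offset + 1))               # point the slot at it
                return offset
            if _read_word(words, stored - 1) == key:
                return stored - 1             # found a pointer to our keyword
            pos = (pos + 1) % table_size      # move forward, wrapping at the end
    if insert:
        raise RuntimeError("hash table is full")
    return None
```

    In a real record file, each slot would more likely point at the record itself (or at a key plus record offset) rather than at a bare word. The table should also be sized well above the expected number of keys, since probing gets slow as it fills up.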

    I have implemented systems like this in the past, so if you have any questions feel free to ask.
    Nobody cares how it works as long as it works

  15. #15
    GCDEF
    Join Date
    Nov 2003
    Location
    Florida
    Posts
    12,635

    Re: Search a binary file of million records

    Quote Originally Posted by Dineshgirij
    I still stand with the File system.. as u said, I would like to create my own database.... for that I need your help once more..

    kindly give me the URL from where I could download the source code..


    thank you....

    :-)
    That's just not a good plan when there are free and already implemented solutions out there.

    Is the file going to be updated much?

    If you had fixed length records and maintained a sort order and only needed one sort order, you could get reasonable performance with a binary search, but a real database is a better idea.
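    As a rough sketch of the binary search idea, assuming a hypothetical record layout of a 4-byte unsigned key followed by a 28-byte payload, with the file kept sorted by key (the layout and the name find_record are illustrative only):

```python
import os
import struct

RECORD = struct.Struct("<I28s")   # hypothetical layout: 4-byte key + 28-byte payload


def find_record(path, key):
    """Binary-search a file of fixed-length records sorted by key."""
    with open(path, "rb") as f:
        lo, hi = 0, os.path.getsize(path) // RECORD.size
        while lo < hi:
            mid = (lo + hi) // 2
            f.seek(mid * RECORD.size)         # fixed length makes record mid easy to locate
            rec_key, payload = RECORD.unpack(f.read(RECORD.size))
            if rec_key == key:
                return payload.rstrip(b"\0")  # drop the padding added by struct
            if rec_key < key:
                lo = mid + 1
            else:
                hi = mid
    return None
```

    With a million records this needs about 20 seeks per lookup. The catch is the update question above: every insert has to keep the file sorted, which can mean rewriting a large part of it.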

    Part of being a good programmer is knowing when something is a bad idea and coming up with a better solution.
    Last edited by GCDEF; July 14th, 2008 at 06:53 AM.
