-
July 14th, 2008, 01:44 AM
#1
Search a binary file of million records
My company is on the verge of starting a new project in which the back end is supposed to be a file in the file system instead of a database server. This file is going to contain millions of records.
Since a flat file has to be searched sequentially, it might take a lot of time to retrieve a specific record.
Some of my colleagues told me that using indexing would make the search much faster. As of now I don't have much of an idea about how to implement indexing.
I would like to have some of your suggestions and solutions regarding this matter.
Any kind of help in this regard would be much appreciated.
Thank you.
-
July 14th, 2008, 02:24 AM
#2
Re: Search a binary file of million records
...supposed to be a File system instead of a database server
These are two different things. Don't tell me that you want to use the file system as a database system, because that would be a horrible solution.
-
July 14th, 2008, 03:03 AM
#3
Re: Search a binary file of million records
But there is a lot of software that uses the file system as its back end. I heard that PeachTree, a US accounting package, uses the file system as its back end, and this software is a big hit in the US.
-
July 14th, 2008, 03:25 AM
#4
Re: Search a binary file of million records
That doesn't mean that it is a good solution. The file system is slow compared to a database system with a good chunk of memory.
-
July 14th, 2008, 03:25 AM
#5
Re: Search a binary file of million records
It doesn't matter whether we are using a database or not; ultimately everything still has to be saved to the file system to keep the data permanently. However, I would advise against re-inventing the wheel, as databases have already implemented most of the things that make querying fast, including indexing.
quoted from C++ Coding Standards:
KISS (Keep It Simple Software):
Correct is better than fast. Simple is better than complex. Clear is better than cute. Safe is better than insecure.
Avoid magic number:
Programming isn't magic, so don't incant it.
-
July 14th, 2008, 03:37 AM
#6
Re: Search a binary file of million records
What you said is absolutely correct: there are databases that make querying faster and also include indexing.
But the thing that matters is that the project I mentioned earlier is meant for commercial use.
Also, it should be highly secure. Another reason is that we can't go on installing databases for each and every customer because of the cost.
That is why the company has decided on the file system.
If you can give me some tips on indexing in a file system, it would be of great help to me.
-
July 14th, 2008, 04:25 AM
#7
Re: Search a binary file of million records
In case you are not aware, you can look at the open source MySQL. The GPL license may meet your needs.
If you still want to venture into creating your own database, you can download the source code to understand how databases work, including the indexing system.
-
July 14th, 2008, 04:40 AM
#8
Re: Search a binary file of million records
I still stand with the file system. As you said, I would like to create my own database, and for that I need your help once more.
Kindly give me the URL from where I could download the source code.
Thank you.
:-)
-
July 14th, 2008, 05:23 AM
#9
Re: Search a binary file of million records
You can google search for it.
http://www.mysql.com/
-
July 14th, 2008, 05:43 AM
#10
Re: Search a binary file of million records
Thanks a lot for the info :-)
-
July 14th, 2008, 06:06 AM
#11
Re: Search a binary file of million records
I could not find any source code download link on that site. All that is available there is the MySQL Server 6.0 database server.
Do you know specifically where I can find the link to download the source code?
-
July 14th, 2008, 06:17 AM
#12
Re: Search a binary file of million records
Source code to what? Install a MySQL driver for ODBC and you can connect to it.
-
July 14th, 2008, 06:26 AM
#13
Re: Search a binary file of million records
Earlier Kheun told me that, to create a database of my own, I should download the MySQL source code and go through it to see how it works.
That is why I asked for the link from where I could download the MySQL source code. :-)
Thanks.
-
July 14th, 2008, 06:44 AM
#14
Re: Search a binary file of million records
To implement indexing, you usually create a hash table on disk containing keywords.
Code:
hash_table_position = hash_function(keyword) % hash_table_size
Each hash table entry contains a pointer to a position in another file containing the actual words.
You start looking at hash_table_position and move forward until you encounter a pointer to your keyword or NULL. If you find NULL the keyword does not exist (and you may insert it at that position if you like).
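To make the lookup concrete, here is a rough C++ sketch of that scheme. Everything named here is an assumption for illustration, not something from a real system: the class name DiskHashIndex, the FNV-1a hash, -1 standing in for NULL, keywords stored with a 4-byte length prefix, and wrapping around to slot 0 when the probe runs off the end of the table. It also recreates both files on construction, which a real index would not do.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <string>

constexpr std::int64_t kEmptySlot = -1;  // assumption: stands in for the NULL pointer

// Hypothetical on-disk index. The index file is a table of fixed-size slots,
// each holding an offset into the data file; the data file holds the keywords.
class DiskHashIndex {
public:
    DiskHashIndex(const std::string& index_path, const std::string& data_path,
                  std::uint64_t table_size)
        : table_size_(table_size) {
        assert(table_size_ > 0);
        {   // Start from an empty data file and a table full of empty slots.
            std::ofstream data(data_path, std::ios::binary | std::ios::trunc);
            std::ofstream index(index_path, std::ios::binary | std::ios::trunc);
            for (std::uint64_t i = 0; i < table_size_; ++i)
                index.write(reinterpret_cast<const char*>(&kEmptySlot), sizeof kEmptySlot);
        }
        index_.open(index_path, std::ios::in | std::ios::out | std::ios::binary);
        data_.open(data_path, std::ios::in | std::ios::out | std::ios::binary);
    }

    // Offset of the keyword in the data file, or kEmptySlot if it is absent.
    std::int64_t find(const std::string& keyword) { return probe(keyword, false); }

    // Offset of the existing or newly stored keyword; kEmptySlot if the table is full.
    std::int64_t insert(const std::string& keyword) { return probe(keyword, true); }

private:
    // FNV-1a, chosen only because its result is the same on every platform.
    static std::uint64_t fnv1a(const std::string& s) {
        std::uint64_t h = 14695981039346656037ULL;
        for (unsigned char c : s) { h ^= c; h *= 1099511628211ULL; }
        return h;
    }

    std::int64_t probe(const std::string& keyword, bool insert_if_missing) {
        std::uint64_t pos = fnv1a(keyword) % table_size_;
        for (std::uint64_t step = 0; step < table_size_; ++step) {
            std::int64_t offset = read_slot(pos);
            if (offset == kEmptySlot) {              // hit "NULL": keyword not present
                if (!insert_if_missing) return kEmptySlot;
                offset = append_keyword(keyword);
                write_slot(pos, offset);
                return offset;
            }
            if (read_keyword(offset) == keyword) return offset;
            pos = (pos + 1) % table_size_;           // move forward, wrapping at the end
        }
        return kEmptySlot;                           // every slot is taken by other keywords
    }

    std::int64_t read_slot(std::uint64_t pos) {
        std::int64_t offset = kEmptySlot;
        index_.seekg(static_cast<std::streamoff>(pos * sizeof offset));
        index_.read(reinterpret_cast<char*>(&offset), sizeof offset);
        return offset;
    }

    void write_slot(std::uint64_t pos, std::int64_t offset) {
        index_.seekp(static_cast<std::streamoff>(pos * sizeof offset));
        index_.write(reinterpret_cast<const char*>(&offset), sizeof offset);
    }

    std::int64_t append_keyword(const std::string& keyword) {
        data_.seekp(0, std::ios::end);
        const std::streamoff offset = data_.tellp();
        const std::uint32_t len = static_cast<std::uint32_t>(keyword.size());
        data_.write(reinterpret_cast<const char*>(&len), sizeof len);
        data_.write(keyword.data(), len);
        return offset;
    }

    std::string read_keyword(std::int64_t offset) {
        std::uint32_t len = 0;
        data_.seekg(static_cast<std::streamoff>(offset));
        data_.read(reinterpret_cast<char*>(&len), sizeof len);
        std::string keyword(len, '\0');
        data_.read(&keyword[0], len);
        return keyword;
    }

    std::uint64_t table_size_;
    std::fstream index_;
    std::fstream data_;
};
```

You would call insert("alpha") once per keyword and later find("alpha") to get back the offset where it is stored. In practice you would store the record's position next to the keyword and keep the table well under full, since probe chains get long as it fills up. Deleting entries is left out here because it needs tombstones or rehashing.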
I have implemented systems like this in the past, so if you have any questions feel free to ask.
Nobody cares how it works as long as it works
-
July 14th, 2008, 06:48 AM
#15
Re: Search a binary file of million records
Originally Posted by Dineshgirij
I still stand with the file system. As you said, I would like to create my own database, and for that I need your help once more.
Kindly give me the URL from where I could download the source code.
Thank you.
:-)
That's just not a good plan when there are free and already implemented solutions out there.
Is the file going to be updated much?
If you had fixed length records and maintained a sort order and only needed one sort order, you could get reasonable performance with a binary search, but a real database is a better idea.
Part of being a good programmer is knowing when something is a bad idea and coming up with a better solution.
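For what it's worth, the binary search over fixed-length records could look roughly like the sketch below. The Record layout (a 4-byte key plus 28 bytes of payload) and the function name find_record are made up for the example, and it assumes the file is already sorted ascending by key and is read on the same machine that wrote it.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <istream>

// Hypothetical fixed-length record; the file must be sorted ascending by key.
struct Record {
    std::uint32_t key;
    char payload[28];
};

// Binary search directly on the file: each step seeks to the middle record
// and reads just that one. Returns true and fills `out` when the key is found.
bool find_record(std::istream& file, std::uint32_t key, Record& out) {
    const std::streamoff record_size = static_cast<std::streamoff>(sizeof(Record));
    file.clear();
    file.seekg(0, std::ios::end);
    const std::streamoff size = file.tellg();
    assert(size % record_size == 0);       // file must hold whole records only
    std::streamoff lo = 0;
    std::streamoff hi = size / record_size; // search the half-open range [lo, hi)
    while (lo < hi) {
        const std::streamoff mid = lo + (hi - lo) / 2;
        file.seekg(mid * record_size);
        if (!file.read(reinterpret_cast<char*>(&out), sizeof(Record)))
            return false;                   // I/O error
        if (out.key == key) return true;
        if (out.key < key) lo = mid + 1;
        else hi = mid;
    }
    return false;
}
```

With a million records that is about 20 seeks per lookup. The catch is the one already mentioned: every insert has to keep the file sorted, so this only pays off if the file is rarely updated.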
Last edited by GCDEF; July 14th, 2008 at 06:53 AM.