Loading a file and indexing line starts
Hi All,
I've spent some time performance-testing various methods of loading/accessing a file, either into memory or through a memory-mapped file, but I am having some trouble. So far I have tried:
FILE * pFile;
std::ifstream ifs;
MapViewOfFile
I'm having trouble because Windows appears to cache the file, so I am getting very mixed results. Sometimes a huge file (hundreds of MB) will load quickly, other times almost instantly.
Does anyone know how to stop Windows from doing whatever it's doing, so I can see the real time taken to load/access a file from disk?
Cheers,
AnotherMuggle
Re: Loading a file and indexing line starts
Why not have it load a different file on each run?
Re: Loading a file and indexing line starts
Quote: Originally posted by D_Drmmr
Why not have it load a different file on each run?
Thanks for the suggestion. I've been doing something like this, but it's annoying having to generate new files each time. Initially I was using a small collection of files and working my way through them, but I'm still getting erratic results, even after a reboot. It's almost as if Windows knows those files are going to get loaded, so it's a step ahead of me.
Re: Loading a file and indexing line starts
For Windows CreateFile, use FILE_FLAG_NO_BUFFERING. This opens a file with no system caching.
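A rough sketch of what that could look like is below. With FILE_FLAG_NO_BUFFERING, read sizes, file offsets and buffer addresses have to be multiples of the volume sector size, so the sketch reads in sector-multiple chunks into a VirtualAlloc buffer (which is page-aligned). The names round_up_to_sector and read_uncached, the 4096-byte sector size and the 1 MB chunk size are my own assumptions for illustration; query the real sector size for your volume before relying on it.

```cpp
#include <cassert>
#include <cstddef>

#ifdef _WIN32
#include <windows.h>
#endif

// Round n up to the next multiple of sector_size.
// Unbuffered I/O needs read sizes that are sector multiples.
std::size_t round_up_to_sector(std::size_t n, std::size_t sector_size) {
    assert(sector_size != 0);
    return (n + sector_size - 1) / sector_size * sector_size;
}

#ifdef _WIN32
// Hypothetical helper: reads the whole file with system caching disabled.
// Returns the number of bytes read, or -1 if the file or buffer could not
// be set up. A ReadFile error mid-way simply ends the loop, like end of file.
long long read_uncached(const wchar_t* path) {
    const std::size_t kSector = 4096;  // assumption, not queried from the volume
    const DWORD chunk =
        static_cast<DWORD>(round_up_to_sector(1u << 20, kSector));

    HANDLE h = CreateFileW(path, GENERIC_READ, FILE_SHARE_READ, nullptr,
                           OPEN_EXISTING, FILE_FLAG_NO_BUFFERING, nullptr);
    if (h == INVALID_HANDLE_VALUE) return -1;

    // VirtualAlloc returns page-aligned memory, which satisfies the
    // buffer alignment requirement for sector sizes up to the page size.
    void* buf = VirtualAlloc(nullptr, chunk, MEM_COMMIT | MEM_RESERVE,
                             PAGE_READWRITE);
    if (buf == nullptr) { CloseHandle(h); return -1; }

    long long total = 0;
    DWORD got = 0;
    while (ReadFile(h, buf, chunk, &got, nullptr) && got != 0) {
        total += got;  // the last read may return less than a full chunk
    }

    VirtualFree(buf, 0, MEM_RELEASE);
    CloseHandle(h);
    return total;
}
#endif
```

This only bypasses the Windows file cache. The drive's own hardware cache and read-ahead are still in play.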
Re: Loading a file and indexing line starts
Quote: Originally posted by 2kaud
For Windows CreateFile, use FILE_FLAG_NO_BUFFERING. This opens a file with no system caching.
This looks promising.
Thanks!
Re: Loading a file and indexing line starts
Quote: Originally posted by AnotherMuggle
Thanks for the suggestion. I've been doing something like this, but it's annoying having to generate new files each time. Initially I was using a small collection of files and working my way through them, but I'm still getting erratic results, even after a reboot. It's almost as if Windows knows those files are going to get loaded, so it's a step ahead of me.
You won't get the same results each time when you are profiling file access (at least, not on Windows). That's the nature of the game.
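Since single timings will vary, one common way to cope is to repeat the measurement several times and look at the spread rather than at any one number. A minimal sketch follows; TimingSummary and summarize are names I made up for illustration, and the samples are assumed to be per-run times in milliseconds.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Spread of a set of timing samples (e.g. milliseconds per run).
struct TimingSummary {
    double fastest;
    double median;
    double slowest;
};

// Takes the samples by value so the caller's vector is left unsorted.
TimingSummary summarize(std::vector<double> samples) {
    assert(!samples.empty());
    std::sort(samples.begin(), samples.end());
    const std::size_t n = samples.size();
    // For an even count, use the mean of the two middle samples.
    const double median = (n % 2 == 1)
        ? samples[n / 2]
        : (samples[n / 2 - 1] + samples[n / 2]) / 2.0;
    return TimingSummary{samples.front(), median, samples.back()};
}
```

A large gap between the fastest and the slowest run is a hint that caching, not your code, is dominating the measurement.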
Re: Loading a file and indexing line starts
Quote: Originally posted by AnotherMuggle
Does anyone know how to stop Windows from doing whatever it's doing, so I can see the real time taken to load/access a file from disk?
Do you know how to prevent the hard drive from reading ahead or using its hardware cache? How to get exclusive access to the hard drive controller in a multi-tasking OS? I don't. Sorry, but it seems you're tilting at windmills.
Re: Loading a file and indexing line starts
Quote: Originally posted by AnotherMuggle
I'm having trouble because Windows appears to cache the file, so I am getting very mixed results. Sometimes a huge file (hundreds of MB) will load quickly, other times almost instantly.
Does anyone know how to stop Windows from doing whatever it's doing, so I can see the real time taken to load/access a file from disk?
And when you get this "working" on your computer, and that irate customer who now has your code tells you "your program is now slow as a turtle", you will see why trying to outsmart Windows in terms of disk access is futile, just as others have pointed out.
The best you can do with file I/O speed is strike a balance and not try to "over-optimize". What looks promising on your machine can turn into a disaster on another. Or, just as bad, you get a slow timing on your machine, you spend days or maybe weeks attempting to produce "optimized" I/O code, and in the end you find out you have optimized nothing, or gained very little.
Regards,
Paul McKenzie
Re: Loading a file and indexing line starts
If you disable the caching in an attempt to do measurements, then you will find that your own code will probably take at most 1% of the time, with 99% of the time being the loading of data.
No matter how well you optimize your own code, it'll have at most a 1% effect on the total.
Proper benchmarking is done by making sure the data IS loaded into the cache, so that you measure the performance of your own code rather than that of the hard disk.
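As a rough sketch of that approach: load the file once up front (which also leaves it in the OS cache), then time only the code you actually care about on the in-memory data. Going by the thread title, I am assuming the code under test is the line-start indexing; load_file, index_line_starts and time_ms are hypothetical names, not anything from the original post.

```cpp
#include <chrono>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Read the whole file into memory. Returns an empty string if the file
// cannot be opened. Call this before timing anything.
std::string load_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Offsets at which each line begins. A trailing '\n' does not start
// a new (empty) line.
std::vector<std::size_t> index_line_starts(const std::string& data) {
    std::vector<std::size_t> starts;
    if (data.empty()) return starts;
    starts.push_back(0);
    for (std::size_t i = 0; i + 1 < data.size(); ++i) {
        if (data[i] == '\n') starts.push_back(i + 1);
    }
    return starts;
}

// Wall-clock time of one call to work(), in milliseconds.
template <typename F>
double time_ms(F&& work) {
    const auto t0 = std::chrono::steady_clock::now();
    work();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

// Possible usage (the file name is a placeholder):
//   std::string data = load_file("big.txt");   // not timed
//   std::vector<std::size_t> starts;
//   double ms = time_ms([&] { starts = index_line_starts(data); });
```

Keeping the result in a variable outside the lambda, as in the usage comment, makes it less likely that the compiler optimizes the measured work away.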