-
June 14th, 2013, 10:41 AM
#1
Loading a file and indexing line starts
Hi All,
I've spent some time trying to performance test various methods of loading / accessing a file, either into memory or through a memory mapped file, but I am having some trouble.
FILE * pFile;
std::ifstream ifs;
MapViewOfFile
I'm having trouble because Windows appears to cache the file, so I am getting very mixed results. Sometimes a huge file (hundreds of MB) will load quickly, other times almost instantly.
Does anyone know how to stop Windows from doing whatever it's doing, so I can see the real time taken to load/access a file from disk?
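For context, here is a minimal sketch of the kind of routine being timed: load a file into memory, then record the offset at which each line starts. The names load_file, index_line_starts and line_at are hypothetical, and the sketch assumes '\n' line endings (a '\r' before the '\n' is left in the line).

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Reads the whole file into memory in binary mode (no newline translation).
std::string load_file(const std::string& path) {
    std::ifstream ifs(path, std::ios::binary);
    if (!ifs) throw std::runtime_error("could not open " + path);
    return std::string(std::istreambuf_iterator<char>(ifs),
                       std::istreambuf_iterator<char>());
}

// Returns the byte offset of the first character of every line.
// A trailing '\n' at the very end does not start a new (empty) line.
std::vector<std::size_t> index_line_starts(const std::string& data) {
    std::vector<std::size_t> starts;
    if (data.empty()) return starts;
    starts.push_back(0);
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (data[i] == '\n' && i + 1 < data.size()) starts.push_back(i + 1);
    }
    return starts;
}

// Returns line i without its terminating '\n'.
std::string line_at(const std::string& data,
                    const std::vector<std::size_t>& starts, std::size_t i) {
    assert(i < starts.size());
    const std::size_t begin = starts[i];
    std::size_t end = data.size();
    if (i + 1 < starts.size()) {
        end = starts[i + 1] - 1;  // position of the '\n' that ends line i
    } else if (end > begin && data[end - 1] == '\n') {
        --end;                    // last line may carry a final '\n'
    }
    return data.substr(begin, end - begin);
}
```

With the index built once, line_at(data, starts, n) fetches any line without rescanning the buffer.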
Cheers,
AnotherMuggle
-
June 15th, 2013, 11:21 AM
#2
Re: Loading a file and indexing line starts
Why not have it load a different file on each run?
Cheers, D Drmmr
Please put [code][/code] tags around your code to preserve indentation and make it more readable.
As long as man ascribes to himself what is merely a possibility, he will not work for the attainment of it. - P. D. Ouspensky
-
June 15th, 2013, 11:28 AM
#3
Re: Loading a file and indexing line starts
Originally Posted by D_Drmmr
Why not have it load a different file on each run?
Thanks for the suggestion. I've been doing something like this, but it's annoying having to generate new files each time. Initially I was using a small collection of files and working my way through them, but I'm still getting erratic results, even after a reboot. It's almost as if Windows knows those files are going to be loaded, so it's a step ahead of me.
-
June 15th, 2013, 11:37 AM
#4
Re: Loading a file and indexing line starts
For Windows CreateFile, use FILE_FLAG_NO_BUFFERING. This opens a file with no system caching.
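A rough sketch of what that might look like, not tested production code. The function name read_file_uncached and the 1 MiB chunk size are my own choices. With FILE_FLAG_NO_BUFFERING the read size, file offset and buffer address all have to be sector-aligned, so the sketch assumes the sector size divides 1 MiB and uses VirtualAlloc to get a page-aligned buffer. The non-Windows branch is only there so the snippet compiles elsewhere; it does NOT bypass any cache.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Reads a whole file. On Windows the read bypasses the system file cache.
std::string read_file_uncached(const std::string& path) {
#ifdef _WIN32
    HANDLE h = CreateFileA(path.c_str(), GENERIC_READ, FILE_SHARE_READ, nullptr,
                           OPEN_EXISTING, FILE_FLAG_NO_BUFFERING, nullptr);
    if (h == INVALID_HANDLE_VALUE) throw std::runtime_error("CreateFileA failed");

    // Assumption: 1 MiB is a multiple of the volume's sector size.
    const DWORD chunk = 1u << 20;
    // VirtualAlloc returns page-aligned memory, which satisfies the
    // buffer alignment requirement for unbuffered reads.
    void* buf = VirtualAlloc(nullptr, chunk, MEM_COMMIT | MEM_RESERVE,
                             PAGE_READWRITE);
    if (buf == nullptr) {
        CloseHandle(h);
        throw std::runtime_error("VirtualAlloc failed");
    }

    std::string out;
    bool ok = true;
    for (;;) {
        DWORD got = 0;
        if (!ReadFile(h, buf, chunk, &got, nullptr)) { ok = false; break; }
        assert(got <= chunk);
        out.append(static_cast<const char*>(buf), got);
        if (got < chunk) break;  // short read: end of file reached
    }
    VirtualFree(buf, 0, MEM_RELEASE);
    CloseHandle(h);
    if (!ok) throw std::runtime_error("ReadFile failed");
    return out;
#else
    // Fallback for other platforms: an ordinary, cached read.
    std::ifstream ifs(path, std::ios::binary);
    if (!ifs) throw std::runtime_error("could not open " + path);
    return std::string(std::istreambuf_iterator<char>(ifs),
                       std::istreambuf_iterator<char>());
#endif
}
```

Note that this only takes the Windows file cache out of the picture; any caching done by the drive itself is unaffected.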
-
June 16th, 2013, 02:39 AM
#5
Re: Loading a file and indexing line starts
Originally Posted by 2kaud
For Windows CreateFile, use FILE_FLAG_NO_BUFFERING. This opens a file with no system caching.
This looks promising.
Thanks!
-
June 17th, 2013, 01:39 AM
#6
Re: Loading a file and indexing line starts
Originally Posted by AnotherMuggle
Thanks for the suggestion. I've been doing something like this, but it's annoying having to generate new files each time. Initially I was using a small collection of files and working my way through them, but I'm still getting erratic results, even after a reboot. It's almost as if Windows knows those files are going to be loaded, so it's a step ahead of me.
You won't get the same results each time when you are profiling file access (at least, not on Windows). That's in the nature of the game.
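Given that, one common way to cope is to repeat the measurement and look at the spread rather than a single number. A minimal sketch, assuming a hypothetical helper time_runs that takes any callable:

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

struct TimingSummary {
    double min_ms;
    double median_ms;  // upper middle element when the run count is even
    double max_ms;
};

// Runs `work` `runs` times and summarises the wall-clock time of each run.
template <typename F>
TimingSummary time_runs(F&& work, std::size_t runs) {
    assert(runs > 0);
    std::vector<double> ms;
    ms.reserve(runs);
    for (std::size_t i = 0; i < runs; ++i) {
        const auto t0 = std::chrono::steady_clock::now();
        work();
        const auto t1 = std::chrono::steady_clock::now();
        ms.push_back(
            std::chrono::duration<double, std::milli>(t1 - t0).count());
    }
    std::sort(ms.begin(), ms.end());
    return {ms.front(), ms[ms.size() / 2], ms.back()};
}
```

A large gap between min_ms and max_ms is a hint that caching or other system activity, rather than the code itself, dominates the numbers.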
Cheers, D Drmmr
-
June 18th, 2013, 02:23 PM
#7
Re: Loading a file and indexing line starts
Originally Posted by AnotherMuggle
Does anyone know how to stop Windows from doing whatever it's doing, so I can see the real time taken to load/access a file from disk?
Do you know how to prevent the hard drive from reading ahead, or from using its hardware cache? Or how to get exclusive access to the hard drive controller in a multitasking OS? I don't. Sorry, but it seems you're fighting windmills.
Best regards,
Igor
-
June 18th, 2013, 03:13 PM
#8
Re: Loading a file and indexing line starts
Originally Posted by AnotherMuggle
I'm having trouble because Windows appears to cache the file, so I am getting very mixed results. Sometimes a huge file (hundreds of MB) will load quickly, other times almost instantly.
Does anyone know how to stop Windows from doing whatever it's doing, so I can see the real time taken to load/access a file from disk?
And when you get this "working" on your computer, and that irate customer who now has your code tells you "your program is now slow as a turtle", you will see why trying to outsmart Windows in terms of disk access is a futile attempt, just as others have pointed out.
The best you can do with file I/O speed is get a balance, and not try to "over-optimize". What looks promising on your machine can turn into a disaster on another. Or just as bad, you get a slow timing on your machine, you spend days or maybe weeks attempting to produce "optimized" I/O code, and in the end you find out you haven't optimized anything, or very little was gained.
Regards,
Paul McKenzie
Last edited by Paul McKenzie; June 18th, 2013 at 05:12 PM.
-
June 19th, 2013, 06:43 AM
#9
Re: Loading a file and indexing line starts
If you disable the caching in an attempt to do measurements, then you will find that your own code probably takes at most 1% of the time, with the other 99% spent loading the data.
No matter how well you optimize your own code, it will have at most a 1% effect on the total.
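To put numbers on that, taking the 1% / 99% split above as an illustrative assumption rather than a measurement: let T be the total time, and suppose your own code is made s times faster.

```latex
T = T_{\text{io}} + T_{\text{code}}, \qquad T_{\text{code}} = 0.01\,T

T' = T_{\text{io}} + \frac{T_{\text{code}}}{s} = 0.99\,T + \frac{0.01\,T}{s}

\frac{T}{T'} = \frac{1}{0.99 + 0.01/s} < \frac{1}{0.99} \approx 1.0101
```

Even with s going to infinity, the total run time drops by no more than 1%.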
Proper benchmarking is done by making sure the data IS loaded into the cache, so that you don't end up measuring the performance of the hard disk rather than your own code.
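A minimal sketch of that idea, with hypothetical names (slurp, count_lines, benchmark_warm): read the file once without timing it, so the OS will usually have it cached, then time the load and the processing separately. Counting newlines stands in for whatever processing you actually want to measure.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Reads the whole file into memory in binary mode.
std::string slurp(const std::string& path) {
    std::ifstream ifs(path, std::ios::binary);
    if (!ifs) throw std::runtime_error("could not open " + path);
    return std::string(std::istreambuf_iterator<char>(ifs),
                       std::istreambuf_iterator<char>());
}

// Stand-in for "your own code": counts '\n' characters.
std::size_t count_lines(const std::string& data) {
    return static_cast<std::size_t>(
        std::count(data.begin(), data.end(), '\n'));
}

struct WarmResult {
    std::size_t lines;
    double load_ms;     // time to read the (hopefully cached) file
    double process_ms;  // time spent in the code under test
};

WarmResult benchmark_warm(const std::string& path) {
    assert(!path.empty());
    (void)slurp(path);  // untimed warm-up pass to populate the file cache

    using clock = std::chrono::steady_clock;
    using ms = std::chrono::duration<double, std::milli>;
    const auto t0 = clock::now();
    const std::string data = slurp(path);
    const auto t1 = clock::now();
    const std::size_t lines = count_lines(data);
    const auto t2 = clock::now();
    return {lines, ms(t1 - t0).count(), ms(t2 - t1).count()};
}
```

Comparing process_ms across code changes then tells you about the code, while load_ms shows how much is left to the I/O layer even when the file is cached.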