  1. #1
    Join Date
    May 2006
    Location
    England
    Posts
    72

    Loading a file and indexing line starts

    Hi All,

    I've spent some time performance-testing various methods of loading and accessing a file, either by reading it into memory or through a memory-mapped file, but I am having some trouble.

    FILE * pFile;

    std::ifstream ifs;

    MapViewOfFile
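
    For readers following along, here is a minimal sketch of what the first two approaches might look like when loading a whole file into memory. The function names `load_with_stdio` and `load_with_ifstream` and the 64 KiB chunk size are assumptions made for illustration, not code from the original post. `MapViewOfFile` is left out because it is Windows-only.

```cpp
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper: load a whole file through the C stdio API (FILE*).
// Returns an empty string if the file cannot be opened.
std::string load_with_stdio(const char* path) {
    std::string data;
    std::FILE* pFile = std::fopen(path, "rb");
    if (!pFile) return data;
    char buf[1 << 16];  // 64 KiB chunk size, chosen arbitrarily
    std::size_t n;
    while ((n = std::fread(buf, 1, sizeof buf, pFile)) > 0) data.append(buf, n);
    std::fclose(pFile);
    return data;
}

// Hypothetical helper: load a whole file through std::ifstream.
// Returns an empty string if the file cannot be opened.
std::string load_with_ifstream(const char* path) {
    std::string data;
    std::ifstream ifs(path, std::ios::binary);
    if (!ifs) return data;
    char buf[1 << 16];
    // read() fails on the final short chunk, so gcount() is checked as well.
    while (ifs.read(buf, sizeof buf) || ifs.gcount() > 0)
        data.append(buf, static_cast<std::size_t>(ifs.gcount()));
    return data;
}
```

    Both functions open the file in binary mode so that the bytes in memory match the bytes on disk, which matters if line-start offsets are indexed afterwards.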

    I'm having trouble because Windows appears to cache the file, so I am getting very mixed results. Sometimes a huge file (hundreds of MB) will load quickly, other times almost instantly.

    Does anyone know how to stop Windows from doing whatever it's doing, so I can see the real time taken to load/access a file from disk?

    Cheers,
    AnotherMuggle

  2. #2
    Join Date
    Jul 2005
    Location
    Netherlands
    Posts
    2,042

    Re: Loading a file and indexing line starts

    Why not have it load a different file on each run?
    Cheers, D Drmmr

    Please put [code][/code] tags around your code to preserve indentation and make it more readable.

    As long as man ascribes to himself what is merely a possibility, he will not work for the attainment of it. - P. D. Ouspensky

  3. #3
    Join Date
    May 2006
    Location
    England
    Posts
    72

    Re: Loading a file and indexing line starts

    Quote Originally Posted by D_Drmmr
    Why not have it load a different file on each run?
    Thanks for the suggestion. I've been doing something like this, but it's annoying having to generate new files each time. Initially I was using a small collection of files and working my way through them, but I'm still getting erratic results, even after a reboot. It's almost like Windows knows those files are going to get loaded, so it's a step ahead of me.

  4. #4
    2kaud is offline Super Moderator Power Poster
    Join Date
    Dec 2012
    Location
    England
    Posts
    7,822

    Re: Loading a file and indexing line starts

    For Windows CreateFile, use FILE_FLAG_NO_BUFFERING. This opens a file with no system caching.
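
    A minimal sketch of how this might look, offered as an illustration rather than tested production code. With `FILE_FLAG_NO_BUFFERING`, read sizes and file offsets must be multiples of the volume sector size and the buffer must be suitably aligned. The sketch assumes 4096 bytes is a sufficient alignment (a real program would query the volume), and the names `align_up` and `read_uncached` are hypothetical. Note that the flag bypasses the system cache only, not the drive's own hardware cache.

```cpp
#include <cassert>
#include <cstddef>
#ifdef _WIN32
#include <windows.h>
#endif

// Round n up to the next multiple of alignment (alignment must be a power of two).
inline std::size_t align_up(std::size_t n, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (n + alignment - 1) & ~(alignment - 1);
}

#ifdef _WIN32
// Hypothetical helper: read a whole file while bypassing the system cache.
// Returns the number of bytes read, or -1 on failure.
// ASSUMPTION: the 1 MiB chunk size is a multiple of the volume sector size.
long long read_uncached(const wchar_t* path) {
    const DWORD kChunk = static_cast<DWORD>(align_up(1u << 20, 4096));
    HANDLE h = CreateFileW(path, GENERIC_READ, FILE_SHARE_READ, nullptr,
                           OPEN_EXISTING, FILE_FLAG_NO_BUFFERING, nullptr);
    if (h == INVALID_HANDLE_VALUE) return -1;

    // VirtualAlloc returns page-aligned memory, which satisfies the
    // buffer alignment requirement of unbuffered I/O.
    void* buf = VirtualAlloc(nullptr, kChunk, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    if (!buf) { CloseHandle(h); return -1; }

    long long total = 0;
    DWORD got = 0;
    // ReadFile reports success with zero bytes read at end of file.
    while (ReadFile(h, buf, kChunk, &got, nullptr) && got > 0) total += got;

    VirtualFree(buf, 0, MEM_RELEASE);
    CloseHandle(h);
    return total;
}
#endif
```

    The Windows-specific part is guarded by `_WIN32` so the alignment helper can still be compiled elsewhere. Timing a call to `read_uncached` should give figures much closer to raw disk throughput than a cached read would.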
    All advice is offered in good faith only. All my code is tested (unless stated explicitly otherwise) with the latest version of Microsoft Visual Studio (using the supported features of the latest standard) and is offered as examples only - not as production quality. I cannot offer advice regarding any other c/c++ compiler/IDE or incompatibilities with VS. You are ultimately responsible for the effects of your programs and the integrity of the machines they run on. Anything I post, code snippets, advice, etc is licensed as Public Domain https://creativecommons.org/publicdomain/zero/1.0/ and can be used without reference or acknowledgement. Also note that I only provide advice and guidance via the forums - and not via private messages!

    C++23 Compiler: Microsoft VS2022 (17.6.5)

  5. #5
    Join Date
    May 2006
    Location
    England
    Posts
    72

    Re: Loading a file and indexing line starts

    Quote Originally Posted by 2kaud
    For Windows CreateFile, use FILE_FLAG_NO_BUFFERING. This opens a file with no system caching.
    This looks promising.

    Thanks!

  6. #6
    Join Date
    Jul 2005
    Location
    Netherlands
    Posts
    2,042

    Re: Loading a file and indexing line starts

    Quote Originally Posted by AnotherMuggle
    Thanks for the suggestion. I've been doing something like this, but it's annoying having to generate new files each time. Initially I was using a small collection of files and working my way through them, but I'm still getting erratic results, even after a reboot. It's almost like Windows knows those files are going to get loaded, so it's a step ahead of me.
    You won't get the same results each time when you are profiling file access (at least, not on Windows). That's in the nature of the game.
    Cheers, D Drmmr


  7. #7
    Join Date
    Nov 2000
    Location
    Voronezh, Russia
    Posts
    6,620

    Re: Loading a file and indexing line starts

    Quote Originally Posted by AnotherMuggle
    Does anyone know how to stop Windows from doing whatever it's doing, so I can see the real time taken to load/access a file from disk?
    Do you know how to prevent the hard drive from reading ahead or using its hardware cache? How to get exclusive access to the hard drive controller in a multi-tasking OS? I don't. Sorry, but it seems you're fighting windmills.
    Best regards,
    Igor

  8. #8
    Join Date
    Apr 1999
    Posts
    27,449

    Re: Loading a file and indexing line starts

    Quote Originally Posted by AnotherMuggle
    I'm having trouble because Windows appears to cache the file, so I am getting very mixed results. Sometimes a huge file (hundreds of MB) will load quickly, other times almost instantly.

    Does anyone know how to stop Windows from doing whatever it's doing, so I can see the real time taken to load/access a file from disk?
    And when you get this "working" on your computer, and that irate customer who now has your code tells you "your program is now slow as a turtle", you will see why trying to outsmart Windows in terms of disk access is a futile attempt, just as others have pointed out.

    The best you can do with file I/O speed is get a balance, and not try to "over-optimize". What looks promising on your machine can turn into a disaster on another. Or just as bad, you get a slow timing on your machine, you spend days or maybe weeks attempting to produce "optimized" I/O code, and in the end you find out you haven't optimized anything, or very little was gained.

    Regards,

    Paul McKenzie
    Last edited by Paul McKenzie; June 18th, 2013 at 05:12 PM.

  9. #9
    Join Date
    Apr 2000
    Location
    Belgium (Europe)
    Posts
    4,626

    Re: Loading a file and indexing line starts

    If you disable the caching in an attempt to do measurements...
    then you will find that your own code probably takes at most 1% of the time, with the other 99% spent loading the data.
    No matter how well you optimize your own code, it will have at most a 1% effect on the total.

    Proper benchmarking is done by making sure the data IS loaded into the cache, so that you measure the performance of your own code rather than that of the hard disk.
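
    A rough sketch of that idea, applied to the line-start indexing in the thread title: run the work once untimed so the file is in the cache, then take the fastest of several timed runs. The names `load_file`, `index_line_starts` and `best_of_ms` are hypothetical, and taking the minimum is only one of several reasonable ways to summarise noisy timings.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <fstream>
#include <functional>
#include <iterator>
#include <string>
#include <vector>

// Read a whole file into memory (binary mode, so offsets match bytes on disk).
std::string load_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Return the byte offset at which each line begins.
// A trailing '\n' does not start a new (empty) line.
std::vector<std::size_t> index_line_starts(const std::string& data) {
    std::vector<std::size_t> starts;
    if (data.empty()) return starts;
    starts.push_back(0);
    for (std::size_t i = 0; i + 1 < data.size(); ++i)
        if (data[i] == '\n') starts.push_back(i + 1);
    return starts;
}

// Run `work` once untimed as a warm-up, then return the fastest of
// `runs` timed runs, in milliseconds.
double best_of_ms(const std::function<void()>& work, int runs) {
    assert(runs > 0);
    work();  // warm-up: brings the file into the cache, not timed
    double best = 0.0;
    for (int r = 0; r < runs; ++r) {
        const auto t0 = std::chrono::steady_clock::now();
        work();
        const auto t1 = std::chrono::steady_clock::now();
        const double ms = std::chrono::duration<double, std::milli>(t1 - t0).count();
        if (r == 0 || ms < best) best = ms;
    }
    return best;
}
```

    A call such as `best_of_ms([&] { index_line_starts(load_file(path)); }, 10)`, with `path` naming a test file of your choice, then times loading plus indexing with a warm cache, so differences between runs reflect the code rather than the disk.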
