-
December 29th, 2008, 02:58 AM
#1
Need Fast File Reads (binary data)
Hi,
I have been searching this forum for fast file reads, but didn't find anything that suits my case.
OK, here is the scenario.
We have around 500 files, each around 6.7MB (all files are raw image files of the same size; the number of files can be even greater).
Based on some calculation, I need to read a block of data from all files (the same block is read from every file) into a single memory buffer.
The total size of the block to be read may vary (1MB to 50MB).
My current code looks like this:
Code:
for( int nProj = 0; nProj < g_params->nTotal_Projections; ++nProj )
{
    // code to eliminate duplicate data fetch
    fseek( fileArray[nProj], lBeginOffset * sizeof(float), SEEK_SET );
    fread( ipp->fIPProjections_o + offset, sizeof(float), lElementstoRead, fileArray[nProj] );
}
The problem is that the throughput is very low when I time this loop. I have seen that it sometimes takes more than 10s to read ~5MB of data. (The data is read, processed and written back to some other files on the same drive. The reading, processing and writing are done from different threads.)
I tried it after disabling the virus scanners, but to no avail.
My HDD is a 250GB SATA II, and my HDD diagnostic utility reports an average speed of 60MBps.
I modified this code to use the Win API, but the results were no better.
Then I merged all 500 files into a single file (~3GB), defragmented the drive, modified the code to read from that file, and timed again. The results were still not as required.
This part has now become the project's bottleneck.
I hope someone can shed some light on this problem.
Thanks in advance
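The loop in this post can be sketched as a standalone function that also checks the return values of fseek and fread. This is only an illustrative sketch, not the poster's actual code: the name read_block_from_files, its parameters and the use of std::vector are assumptions, and it presumes every file is already open in binary mode.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper: reads the same block of floats from every file into one
// contiguous buffer. Returns the number of files that were read completely.
std::size_t read_block_from_files(const std::vector<std::FILE*>& files,
                                  long begin_elem,      // index of the first float to read
                                  std::size_t n_elems,  // floats to read per file
                                  float* dest)          // room for files.size() * n_elems floats
{
    assert(dest != nullptr);
    std::size_t ok = 0;
    for (std::size_t i = 0; i < files.size(); ++i) {
        const long byte_offset = begin_elem * static_cast<long>(sizeof(float));
        if (std::fseek(files[i], byte_offset, SEEK_SET) != 0)
            break;  // seek failed
        const std::size_t got =
            std::fread(dest + i * n_elems, sizeof(float), n_elems, files[i]);
        if (got != n_elems)
            break;  // short read (end of file or error)
        ++ok;
    }
    return ok;
}
```

A return value smaller than files.size() tells the caller which file failed first. Note that fseek takes a long offset, which is 32 bits on Windows, so the single ~3GB file variant would probably need a 64-bit seek such as _fseeki64 instead.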
-
December 29th, 2008, 04:31 AM
#2
Re: Need Fast File Reads (binary data)
On Windows, CreateFile, ReadFile and WriteFile perform better than their fopen, fread and fwrite counterparts. fread internally calls ReadFile after performing a few checks.
Second, you should not issue fseek again, since the file pointer for "read" would already be at the proper location.
You may also think of IOCP...
-
December 29th, 2008, 04:58 AM
#3
Re: Need Fast File Reads (binary data)
Originally Posted by Ajay Vijay
On Windows, CreateFile, ReadFile and WriteFile perform better than their fopen, fread and fwrite counterparts. fread internally calls ReadFile after performing a few checks.
Tried that, but was out of luck. The delay is in the "read" part; the CRT-to-WinAPI overhead is far lower.
Second, you should not issue fseek again, since the file pointer for "read" would already be at the proper location.
You may also think of IOCP...
The fseek is for the next file.
IOCP? Do you believe it will be faster?
Thanks for the reply
-
December 29th, 2008, 05:21 AM
#4
Re: Need Fast File Reads (binary data)
Well, IOCP isn't a solution to this slow-read problem.
I did not look at the code carefully, my mistake.
Presuming you are running the Release build: can you try reading blocks of smaller sizes, a few KB each, in a loop? Is there a difference? What about increasing the size of the blocks?
I still believe ReadFile would be better than fread.
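The block-size experiment suggested here could be set up roughly as follows. This is a portable sketch using fread and std::chrono rather than the Win API; read_in_chunks and its parameters are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: reads total_bytes from the current file position in
// pieces of at most chunk_bytes. Returns the number of bytes actually read and,
// if elapsed_ms is not null, stores the elapsed wall-clock time there.
std::size_t read_in_chunks(std::FILE* fp, char* dest, std::size_t total_bytes,
                           std::size_t chunk_bytes, double* elapsed_ms)
{
    assert(chunk_bytes > 0);
    const auto start = std::chrono::steady_clock::now();
    std::size_t done = 0;
    while (done < total_bytes) {
        const std::size_t left = total_bytes - done;
        const std::size_t want = (left < chunk_bytes) ? left : chunk_bytes;
        const std::size_t got = std::fread(dest + done, 1, want, fp);
        done += got;
        if (got < want)
            break;  // end of file or read error
    }
    const std::chrono::duration<double, std::milli> dt =
        std::chrono::steady_clock::now() - start;
    if (elapsed_ms != nullptr)
        *elapsed_ms = dt.count();
    return done;
}
```

Calling it on the same file with, say, 4KB and then 1MB chunks and comparing the two elapsed times gives a first impression. Repeated runs on the same file are likely to be served from the operating system's file cache, so the first run is the one that reflects the disk.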
-
December 29th, 2008, 08:54 AM
#5
Re: Need Fast File Reads (binary data)
Well, IMO you have hit the wall, you need high end hardware.
Regards,
Ramkrishna Pawar
-
December 29th, 2008, 09:00 AM
#6
Re: Need Fast File Reads (binary data)
Now my HDD is 250GB SATA II, and my HDD diagnostic utility reports me an average speed of 60MBps.
Not very high end, but it can read 5MB in under 1 second! Thus, IMO, the hardware is not the bottleneck. Windows Media Player can open 5-10MB of audio (.MP3) in less than 1 second.
-
December 29th, 2008, 07:20 PM
#7
Re: Need Fast File Reads (binary data)
Originally Posted by akgalp
The problem is that the throughput is very low when I time this loop. I have seen that it sometimes takes more than 10s to read ~5MB of data. (The data is read, processed and written back to some other files on the same drive. The reading, processing and writing are done from different threads.)
How sure are you that your problem is reading of these files?
How exactly did you "time" that?
Could you modify your program to only read the data needed (remove processing and writing code)? Do you still see those 10 second delays?
How do you synchronize your reading and processing threads?
How do you protect shared data while one thread writes it and another one – reads?
Vlad - MS MVP [2007 - 2012] - www.FeinSoftware.com
-
December 29th, 2008, 10:29 PM
#8
Re: Need Fast File Reads (binary data)
Originally Posted by Ajay Vijay
Presuming you are running the Release build: can you try reading blocks of smaller sizes, a few KB each, in a loop? Is there a difference? What about increasing the size of the blocks?
I have seen that the larger the read block, the better the throughput. But in this case I cannot increase the block size, because I already have around 600MB of pinned memory; any increase would make this buffer much larger.
I still believe ReadFile would be better than fread.
OK, thanks. I'll try it again and post the findings.
-
December 29th, 2008, 10:41 PM
#9
Re: Need Fast File Reads (binary data)
Originally Posted by VladimirF
How sure are you that your problem is reading of these files?
How exactly did you "time" that?
like this
Code:
nTimeStart = GetTickCount();
// load the input buffers
for( int nProj = 0; nProj < g_params->nTotal_Projections; ++nProj )
{
    //... data duplication removal code...//
    ....
    fseek( fileArray[nProj], lBeginOffset * sizeof(float), SEEK_SET );
    fread( ipp->fIPProjections_o + offset, sizeof(float), lElementstoRead, fileArray[nProj] );
}
// take the elapsed time once so both lines report the same interval
DWORD nElapsed = GetTickCount() - nTimeStart;
printf("Time taken to read 500 files %lu ms \n", nElapsed);
unsigned x = lElementstoRead * 4 * 500; // total bytes read
// bytes per ms, which is roughly KB per second
float speed = (float)x / (nElapsed == 0 ? 1 : nElapsed);
printf(">Buffer %u KB, Disk read @ %8.0f KBps \n\n", x >> 10, speed);
Could you modify your program to only read the data needed (remove processing and writing code)? Do you still see those 10 second delays?
That was already done. Now the data read is between 8 and 11MB.
These are the sample timings:
Code:
Time taken to read 500 files 4312 ms
>Buffer 8378 KB, Disk read @ 1989 KBps
*********************************************** - 15578 ms
Time taken to read 500 files 4250 ms
>Buffer 11171 KB, Disk read @ 2691 KBps
*********************************************** - 19984 ms
Time taken to read 500 files 4875 ms
>Buffer 8378 KB, Disk read @ 1760 KBps
*********************************************** - 24953 ms
Time taken to read 500 files 3890 ms
>Buffer 11171 KB, Disk read @ 2940 KBps
*********************************************** - 29000 ms
Time taken to read 500 files 3985 ms
>Buffer 8378 KB, Disk read @ 2153 KBps
*********************************************** - 33156 ms
Time taken to read 500 files 4359 ms
>Buffer 11171 KB, Disk read @ 2624 KBps
*********************************************** - 37687 ms
Time taken to read 500 files 4016 ms
>Buffer 8378 KB, Disk read @ 2136 KBps
*********************************************** - 41844 ms
Time taken to read 500 files 4110 ms
>Buffer 11171 KB, Disk read @ 2783 KBps
*********************************************** - 46078 ms
Time taken to read 500 files 4375 ms
>Buffer 8378 KB, Disk read @ 1961 KBps
*********************************************** - 50625 ms
Time taken to read 500 files 3985 ms
>Buffer 11171 KB, Disk read @ 2870 KBps
*********************************************** - 54812 ms
Time taken to read 500 files 4078 ms
>Buffer 8378 KB, Disk read @ 2103 KBps
*********************************************** - 59047 ms
How do you synchronize your reading and processing threads?
How do you protect shared data while one thread writes it and another one – reads?
Sync is done using events; the processing thread waits until the read is complete. There is no shared buffer between the writing and reading threads.
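The arithmetic behind the sample timings in this post can be written out as a small sketch. The helper names kib_per_second and ms_per_file are made up for illustration; the numbers in the note below are taken from the first sample (8378 KB, 4312 ms, 500 files).

```cpp
#include <cassert>

// Throughput in KiB per second from a byte count and an elapsed time in ms.
double kib_per_second(double bytes, double elapsed_ms)
{
    assert(elapsed_ms > 0.0);
    return (bytes / 1024.0) / (elapsed_ms / 1000.0);
}

// Average time spent on each file when n_files are read in elapsed_ms.
double ms_per_file(double elapsed_ms, int n_files)
{
    assert(n_files > 0);
    return elapsed_ms / n_files;
}
```

For the first sample, kib_per_second(8378.0 * 1024.0, 4312.0) is about 1943 KiB/s. The posted figure of 1989 is a little higher because the posted code divides bytes by milliseconds, which gives units of 1000 bytes per second. ms_per_file(4312.0, 500) is about 8.6 ms, for roughly 17 KB read from each file. A per-file cost of several milliseconds is in the range usually quoted for a seek on a desktop hard drive, so head positioning rather than transfer rate may be what dominates here; that is an interpretation of the numbers, not something confirmed in the thread.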
-
December 29th, 2008, 11:05 PM
#10
Re: Need Fast File Reads (binary data)
Use GetThreadTimes to determine precisely the CPU time used by a specific thread (process).
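A rough sketch of how GetThreadTimes might be used is shown below. The helper names filetime_ticks_to_ms and current_thread_cpu_ms are assumptions for illustration, and the Windows-specific part is guarded so the conversion helper can be compiled anywhere. Keep in mind that GetThreadTimes reports CPU time (kernel plus user), so time a thread spends blocked waiting for the disk is not included.

```cpp
#include <cassert>
#include <cstdint>

// FILETIME values count 100-nanosecond ticks, split into two 32-bit halves.
double filetime_ticks_to_ms(std::uint32_t low, std::uint32_t high)
{
    const std::uint64_t ticks = (static_cast<std::uint64_t>(high) << 32) | low;
    return static_cast<double>(ticks) / 10000.0;  // 10,000 ticks per millisecond
}

#ifdef _WIN32
#include <windows.h>

// CPU time (kernel + user) consumed so far by the calling thread, in ms.
// Returns a negative value if GetThreadTimes fails.
double current_thread_cpu_ms()
{
    FILETIME creation, exited, kernel, user;
    if (!GetThreadTimes(GetCurrentThread(), &creation, &exited, &kernel, &user))
        return -1.0;
    return filetime_ticks_to_ms(kernel.dwLowDateTime, kernel.dwHighDateTime) +
           filetime_ticks_to_ms(user.dwLowDateTime, user.dwHighDateTime);
}
#endif
```

Calling current_thread_cpu_ms() before and after the read loop and comparing the difference with the GetTickCount interval would show how much of the elapsed time the thread spent computing and how much it spent waiting.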
-
December 30th, 2008, 01:31 AM
#11
Re: Need Fast File Reads (binary data)
I modified the code to use the Win API:
Code:
for( int nProj = 0; nProj < g_params->nTotal_Projections; ++nProj )
{
    //fseek(fileArray[nProj],lBeginOffset* sizeof(float), SEEK_SET );
    SetFilePointer( fileArray[nProj], lBeginOffset * sizeof(float), 0, FILE_BEGIN );
    //fread(ipp->fIPProjections_o + frameoffset + memoryoffset, sizeof(float),lElementstoRead, fileArray[nProj]);
    ReadFile( fileArray[nProj], ipp->fIPProjections_o + frameoffset + memoryoffset, sizeof(float) * lElementstoRead, &read, 0 );
}
and the results:
Code:
Time taken to read 500 files 173640 ms
>Buffer 399394 KB, Disk read @ 2355 KBps
// after the first read only differential is read
Time taken to read 500 files 3766 ms
>Buffer 8378 KB, Disk read @ 2278 KBps
Time taken to read 500 files 4047 ms
>Buffer 11171 KB, Disk read @ 2826 KBps
Time taken to read 500 files 3828 ms
>Buffer 8378 KB, Disk read @ 2241 KBps
Time taken to read 500 files 3875 ms
>Buffer 11171 KB, Disk read @ 2952 KBps
Time taken to read 500 files 3906 ms
>Buffer 8378 KB, Disk read @ 2196 KBps
Time taken to read 500 files 3890 ms
>Buffer 11171 KB, Disk read @ 2940 KBps
Time taken to read 500 files 3828 ms
>Buffer 8378 KB, Disk read @ 2241 KBps
-
December 31st, 2008, 12:56 AM
#12
Re: Need Fast File Reads (binary data)