-
StretchBlt slow, can I use directx somehow?
Hi,
I have a constantly changing 300x216 byte array which essentially contains a bitmap. It is actually a CDG (karaoke) file streaming in, so the bitmap data is different each time I get it. I don't want to display it at 300x216, I want to display it at 1024x768, so I call the GDI StretchBlt function (this is called constantly as I receive the bitmap data), but it is very slow. CPU usage jumps very high when stretching this 300x216 image to 1024x768 using this method. When I use BitBlt at 300x216 everything works nice and fast.
Is there some way I can make this faster using StretchBlt? Is there maybe a way I can use the DirectX StretchBlt call to send the image data to my CStatic device context? Is there maybe some way I can manipulate the byte array to change the size of the image and then use the regular BitBlt call?
Any ideas on how to make this faster?
Thanks,
Greg
-
Re: StretchBlt slow, can I use directx somehow?
Just some things you can give thought to.
I think this has to do with the rate at which you are drawing the view (invalidating the view). You may be invalidating the view again while the window is still being drawn for the previous invalidate.
Perhaps you can use a timer to control when you are invalidating the window.
Or you can handle the incoming stream of data in a separate thread and post a message to the window to invalidate the view after a certain amount of data arrives. This way your data can keep coming while the view is drawing itself.
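A minimal sketch of that idea, assuming the worker only needs to know whether a redraw is already queued. RedrawGate is a hypothetical helper, not part of MFC or Win32; the worker would post its message (for example with PostMessage) only when requestRedraw() returns true, so redraw requests never pile up faster than the window can paint.

```cpp
#include <atomic>

// Coalesces redraw requests between a worker thread and the UI thread.
// RedrawGate is a hypothetical name used only for this illustration.
class RedrawGate {
public:
    // Worker thread: call after new data has arrived.
    // Returns true only if no redraw is pending, i.e. a message should be posted now.
    bool requestRedraw() { return !pending_.exchange(true); }

    // UI thread: call at the start of painting, so that data arriving
    // during the paint triggers a fresh request afterwards.
    void beginPaint() { pending_.store(false); }

private:
    std::atomic<bool> pending_{false};
};
```

The flag is cleared at the start of the paint rather than at the end, so a frame that arrives while painting is in progress is not lost.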
You can also try using a memory DC. First blit into the memory DC then from the memory DC to the windows DC.
-
Re: StretchBlt slow, can I use directx somehow?
Hi,
Thanks for the thoughts. I am actually using a multimedia timer to update the screen. My settings are:
timeSetEvent(100, 250, (LPTIMECALLBACK)&TimerProc, (DWORD_PTR)this, TIME_PERIODIC)
I am handling the streaming data in a worker thread as well.
I am also using the memdc class.
However, I seem to get the same slow results.
I wish there was a faster way....
Thanks
-
Re: StretchBlt slow, can I use directx somehow?
Quote:
Originally Posted by miteshpandey
You can also try using a memory DC. First blit into the memory DC then from the memory DC to the windows DC.
Double buffering can be used to prevent flickering, but is slower in every case I used it. Maybe you can point out why/how this should speed up the display process?
-
Re: StretchBlt slow, can I use directx somehow?
Quote:
Originally Posted by Oliver M.
Double buffering can be used to prevent flickering, but is slower in every case I used it. Maybe you can point out why/how this should speed up the display process?
Yeah, you are right that double buffering can be used to avoid flickering but not to improve performance.
Just thought that the OP might be experiencing flickering and to remedy this he is looking for faster methods.
-
Re: StretchBlt slow, can I use directx somehow?
If you want speed, don't use GDI at all, use OpenGL. In my tests I found that OpenGL is about 15 times faster than GDI.
-
Re: StretchBlt slow, can I use directx somehow?
This can be done in DirectShow/DirectX, but it is probably more trouble than it is worth. You might want to try resizing with a different algorithm rather than StretchBlt. For instance, the Intel IPP has very fast resizing algorithms. If you prefer free (and who doesn't), you can take a look at the ffdshow code on SourceForge. That project has assembly versions of resizing algorithms that are highly efficient. Good luck.
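To give an idea of what resizing the byte array yourself could look like, here is a minimal nearest-neighbour sketch in plain C++. It is not the IPP or ffdshow code. It assumes one byte per pixel and tightly packed rows, which may not match the real CDG buffer, and scaleNearest is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Nearest-neighbour resize of a one-byte-per-pixel image (assumed layout).
// scaleNearest is a hypothetical helper name used for illustration only.
std::vector<std::uint8_t> scaleNearest(const std::vector<std::uint8_t>& src,
                                       std::size_t srcW, std::size_t srcH,
                                       std::size_t dstW, std::size_t dstH)
{
    if (srcW == 0 || srcH == 0 || dstW == 0 || dstH == 0 ||
        src.size() < srcW * srcH)
        return {};

    std::vector<std::uint8_t> dst(dstW * dstH);

    // Source column for every destination column, computed once per call.
    std::vector<std::size_t> colMap(dstW);
    for (std::size_t x = 0; x < dstW; ++x)
        colMap[x] = x * srcW / dstW;

    std::size_t prevSrcY = srcH;  // sentinel: no source row expanded yet
    for (std::size_t y = 0; y < dstH; ++y) {
        const std::size_t srcY = y * srcH / dstH;
        std::uint8_t* out = &dst[y * dstW];
        if (srcY == prevSrcY) {
            // Same source row as the previous output row: copy that row.
            std::memcpy(out, out - dstW, dstW);
        } else {
            const std::uint8_t* in = &src[srcY * srcW];
            for (std::size_t x = 0; x < dstW; ++x)
                out[x] = in[colMap[x]];
            prevSrcY = srcY;
        }
    }
    return dst;
}
```

For the sizes in this thread the call would be scaleNearest(frame, 300, 216, 1024, 768), after which the result can be drawn with an unscaled blit. When scaling up, most output rows repeat the previous one, so they are produced with a plain memcpy.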
~Steve
-
Re: StretchBlt slow, can I use directx somehow?
How could they be more efficient if they are not hardware accelerated?
-
Re: StretchBlt slow, can I use directx somehow?
Hi aewarnick,
Well, the efficiency will depend on your video card/processor combo. The IPP and ffdshow algorithms, besides being efficient, also take advantage of the MMX/SSE2 instructions on the P3/P4. This helps if you are, say, on a laptop where the graphics card is not powerful. Also, if you are rendering in video memory, then that buffer is essentially lost to you unless you do the very expensive step of reading back from video memory. However, if you are running the latest generation of graphics card, then I would agree with you that OpenGL/DirectX would be faster.
~Steve
P.S. To the OP, have you tried picking the lowest quality level for resizing in stretchblt?
-
Re: StretchBlt slow, can I use directx somehow?
I take it that the MMX advantage is only for Intel processors?
-
Re: StretchBlt slow, can I use directx somehow?
MMX is available on newer AMD processors as well.
As for efficiency, StretchBlt wasn't made to be fast, as it wasn't intended to be called constantly. Other functions can be more efficient regardless of CPU-specific additions/instructions through code and design improvements, and they can also differ in filtering (stretch something in MS Paint compared to a filtered/aliased image editor - big difference in end result).
As for using hardware, whether it be through OpenGL or DirectX, that's not an insignificant amount of work/learning for something this trivial, but you can find tutorials and public source for this all over, including www.gamedev.net.
-Alamar
-
Re: StretchBlt slow, can I use directx somehow?
Quote:
Originally Posted by diehardii
P.S. To the OP, have you tried picking the lowest quality level for resizing in stretchblt?
How does one pick the quality of the stretch blitting? I am having a similar issue where I must StretchBlt a bitmap very often. On computers with decent graphics cards it's not a problem, but on some PCs it's so slow that it puts the computer out of use while the stretch blitting is running. I considered IPP, but can the IPP technique be faster than the actual hardware/graphics card implementation?
Aristotel
-
Re: StretchBlt slow, can I use directx somehow?
Hi, SetStretchBltMode will give you more options in GDI. In GDI+ there are even more options. I'm not sure, but GDI+ may use more of the hardware capabilities. My impression is that GDI+ is not just a thin wrapper for GDI. Here is a link
http://msdn.microsoft.com/library/de...highdpiapp.asp
~Steve
-
Re: StretchBlt slow, can I use directx somehow?
Just remember Gdi+ does not work at all on win95.
-
Re: StretchBlt slow, can I use directx somehow?
While that may be true, many programmers are no longer constrained to develop on those platforms :p . If you don't have those constraints, gdi+ is a bit easier.
~Steve
-
Re: StretchBlt slow, can I use directx somehow?
I've done some tests with StretchBlt and can't understand why you're having problems.
I can stretch a 1920x1280 bitmap down to 800x450 in about 4 milliseconds using StretchBlt.
I've rewritten the algorithm in ML using Bresenham's algorithm and I can't do any better than this in software.
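For readers who have not seen the technique, this is roughly what a Bresenham-style stretch of a single row looks like in plain C++. It is only a sketch of the general idea, not the routine described above; stretchRow is a hypothetical name and one byte per pixel is assumed.

```cpp
#include <cstddef>
#include <cstdint>

// Stretch one row of srcLen pixels to dstLen pixels using a Bresenham-style
// integer error term instead of a multiply and divide per pixel.
// stretchRow is a hypothetical name; one byte per pixel is assumed.
void stretchRow(const std::uint8_t* src, std::size_t srcLen,
                std::uint8_t* dst, std::size_t dstLen)
{
    if (srcLen == 0 || dstLen == 0)
        return;

    std::size_t srcX = 0;
    std::size_t err = 0;            // remainder, always below dstLen between pixels
    for (std::size_t x = 0; x < dstLen; ++x) {
        dst[x] = src[srcX];
        err += srcLen;              // advance by srcLen/dstLen of a source pixel
        while (err >= dstLen) {     // carry whole source pixels out of the remainder
            err -= dstLen;
            ++srcX;
        }
    }
}
```

The same loop handles scaling up and scaling down; the inner while only runs more than once per output pixel when shrinking.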
I think you might be using one of the slower stretching modes. Try
Code:
CClientDC dc(NULL);
// COLORONCOLOR simply drops or repeats pixels without blending, which makes it the cheapest mode.
::SetStretchBltMode(dc.m_hDC, COLORONCOLOR);
dc.StretchBlt(.....
Of course this isn't as high quality as the stretching using Direct3D surfaces (and these have the added advantage of applying a gamma setting to your bitmaps) but it does work with all graphics cards.
Direct3D in DirectX9 requires that stretching be a feature of the graphics card: it's no longer available in software. So you may find older machines don't support it.
For this very reason I'm keeping my stretching in software for the time being.
If you're using a 32-bit DIB section and stretching it onto a 16-bit display this can cause a performance hit though... bear that in mind.
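To show where that cost comes from, here is a sketch of the per-pixel work involved in going from 32-bit to 16-bit. It assumes the 16-bit display uses the 5-6-5 layout (some use 5-5-5 instead) and that each source pixel is read as a 0x00RRGGBB value; toRgb565 and convertRow are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>

// Pack one 32-bit pixel, read as 0x00RRGGBB, into 16-bit 5-6-5.
// The 5-6-5 target layout is an assumption about the display format.
inline std::uint16_t toRgb565(std::uint32_t xrgb)
{
    const std::uint32_t r = (xrgb >> 16) & 0xFF;
    const std::uint32_t g = (xrgb >> 8) & 0xFF;
    const std::uint32_t b = xrgb & 0xFF;
    // Keep the top 5 bits of red, 6 of green and 5 of blue.
    return static_cast<std::uint16_t>(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

// Convert a row of pixels; this work is repeated for every pixel of every
// frame when the source and display depths differ.
inline void convertRow(const std::uint32_t* src, std::uint16_t* dst, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i)
        dst[i] = toRgb565(src[i]);
}
```

At 1024x768 that is roughly 786,000 conversions per frame on top of the stretch itself.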
Darwen.
-
Re: StretchBlt slow, can I use directx somehow?
Hi Darwen,
How fast is your test when scaling up?
~Steve
-
Re: StretchBlt slow, can I use directx somehow?
Try scaling a 300x216 bitmap up to 1024x768 using StretchBlt. That is what I am doing and it is very slow if you want to get a decent framerate.
I have been trying a few things, but it looks like I will have to go with DirectX or OpenGL. It sucks though, because I know it is possible to do what I want with the GDI: the Winamp CD+G/MP3+G plugin is scaling the CDG data to fullscreen using the GDI with very minimal CPU usage. I just don't know how they are doing it.
Cheers
-
Re: StretchBlt slow, can I use directx somehow?
How about some test code? What are your computer specs?
~Steve
-
Re: StretchBlt slow, can I use directx somehow?
Scaling up is about the same speed as scaling down really. Around about 4 ms from my tests. That's on a 2.8GHz P4, but it's in a laptop so it's not quite that fast.
Personally I've never had a problem with speed with StretchBlt - I've just written my own algorithm in Assembler to do the stretch because I'm dealing primarily with 32-bit DIBs and the additional overhead of a conversion to a 16-bit display is removed by manipulating the bitmaps myself right until the last moment.
Presumably if I used MMX I could get even more speed out of it but I've heard that the difference in speed between MMX and raw unwound Assembler is minimal in Pentium 3 and 4s because of their cache size. I read somewhere that MMX was primarily invented to overcome the cache size in Pentium 2's, which isn't a problem with modern processors because of better caching and the multiple instruction pipelines.
Darwen.
-
Re: StretchBlt slow, can I use directx somehow?
Hi Darwen,
Thanks for weighing in on this, I enjoyed your assembly tutorials. I would have difficulty believing that MMX in assembly is not faster than hand-rolled assembly for many tasks (note the caveat). For instance, take a look at:
http://www.avisynth.org/IntermediateMmxOptimization
I would think that despite any latency, you will still be faster, as you are processing 4 pixels for every one pixel without MMX. In other, simpler, cases you can process 8 pixels at once. It would be interesting to see how close one could get to MMX with vanilla assembly in this case. Back OT, are you using GDI or GDI+ for your tests?
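To illustrate what "8 pixels at once" means, here is a sketch in plain C++ that averages eight 8-bit pixels packed into one 64-bit word. It is not MMX code and makes no claim about relative speed; it only shows the idea of packed arithmetic. average8 is a hypothetical name.

```cpp
#include <cstdint>

// Per-byte floor average of two packed words: eight 8-bit pixels per call.
// average8 is a hypothetical name; this is portable C++, not MMX.
inline std::uint64_t average8(std::uint64_t a, std::uint64_t b)
{
    // (a & b) keeps the bits both inputs share; (a ^ b) >> 1 adds half of the
    // bits that differ. The mask clears the low bit of every byte before the
    // shift so it cannot leak into the neighbouring byte.
    return (a & b) + (((a ^ b) & 0xFEFEFEFEFEFEFEFEULL) >> 1);
}
```

Blending two rows of a one-byte-per-pixel image this way takes one call per eight pixels instead of eight separate additions and shifts.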
~Steve
EDIT: I have read that compiler intrinsics are slower than assembly, so perhaps in that case it is true that MMX is slower than assembly.
-
Re: StretchBlt slow, can I use directx somehow?
MMX makes use of the floating point processor (the FPU) to do things in parallel: the MMX registers are mapped onto the FPU's registers.
The 'emms' instruction empties the MMX state, marking the FPU's registers as free again so that floating point code can use them after MMX instructions.
However, the overhead of using the floating point processor is outweighed by the multiple instruction pipelines in modern-day processors, provided the source assembly is designed to take advantage of this. Apparently - according to my research at any rate.
If you're using DirectX then you're using bitmap manipulation capabilities on the graphics card which will of course be very much faster.
However, a word of warning. As I believe I've already said, DirectX9 assumes that the bitmap stretching and other functions are provided by the hardware and doesn't have a software fallback.
If you use DirectX9 you must be aware that anyone using a graphics card which doesn't support hardware stretching (say earlier than an NVidia 64Mb card) will not be able to run your application.
Darwen.