-
April 26th, 2005, 11:15 PM
#16
Re: StretchBlt slow, can I use directx somehow?
I've done some tests with StretchBlt and can't understand why you're having problems.
I can stretch a 1920x1280 bitmap down to 800x450 in about 4 milliseconds using StretchBlt.
I've rewritten the algorithm in ML using Bresenham's algorithm and I can't do any better than this in software.
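For anyone wondering what a Bresenham-style stretch looks like, here is a minimal sketch in portable C++ rather than ML. It is not the code from the post: the names StretchRow and StretchImage are made up for illustration, it assumes 32-bit pixels stored row by row, and it does nearest-neighbour sampling only (roughly the quality you would expect from COLORONCOLOR). The idea is to step through the source with an integer error accumulator so the inner loop needs no multiply or divide.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stretch one row of 32-bit pixels from srcW to dstW pixels.
// An integer error term (Bresenham-style) decides when to advance
// one extra pixel in the source.
void StretchRow(const std::uint32_t* src, int srcW,
                std::uint32_t* dst, int dstW)
{
    assert(srcW > 0 && dstW > 0);
    const int whole = srcW / dstW;  // whole source pixels per destination pixel
    const int frac  = srcW % dstW;  // remainder accumulated in the error term
    int srcX = 0;
    int err  = 0;
    for (int x = 0; x < dstW; ++x) {
        dst[x] = src[srcX];         // nearest-neighbour sample
        srcX += whole;
        err  += frac;
        if (err >= dstW) {          // error overflowed: step one more pixel
            err -= dstW;
            ++srcX;
        }
    }
}

// Stretch a whole image by applying the same stepping scheme to the rows.
std::vector<std::uint32_t> StretchImage(const std::vector<std::uint32_t>& src,
                                        int srcW, int srcH,
                                        int dstW, int dstH)
{
    assert(srcW > 0 && srcH > 0 && dstW > 0 && dstH > 0);
    assert(src.size() == static_cast<std::size_t>(srcW) * srcH);
    std::vector<std::uint32_t> dst(static_cast<std::size_t>(dstW) * dstH);
    const int whole = srcH / dstH;
    const int frac  = srcH % dstH;
    int srcY = 0;
    int err  = 0;
    for (int y = 0; y < dstH; ++y) {
        StretchRow(&src[static_cast<std::size_t>(srcY) * srcW], srcW,
                   &dst[static_cast<std::size_t>(y) * dstW], dstW);
        srcY += whole;
        err  += frac;
        if (err >= dstH) {
            err -= dstH;
            ++srcY;
        }
    }
    return dst;
}
```

The same routine handles scaling up and scaling down, since only the whole/remainder split changes.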
I think you might be using one of the slower stretching modes, such as HALFTONE. Try
Code:
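// NULL gives a DC for the whole screen
CClientDC dc(NULL);
// COLORONCOLOR drops eliminated pixels instead of averaging them, so it is fast
::SetStretchBltMode(dc.m_hDC, COLORONCOLOR);
dc.StretchBlt(.....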
Of course this isn't as high quality as the stretching using Direct3D surfaces (and these have the added advantage of applying a gamma setting to your bitmaps) but it does work with all graphics cards.
Direct3D in DirectX9 requires that stretching be a feature of the graphics card: it's no longer available in software. So you may find older machines don't support it.
For this very reason I'm keeping my stretching in software for the time being.
If you're using a 32-bit DIB section and stretching it onto a 16-bit display this can cause a performance hit though... bear that in mind.
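To give an idea of the extra per-pixel work involved, here is a rough sketch of converting 32-bit pixels to 16-bit. It assumes the 16-bit display uses the 5-6-5 layout (some use 5-5-5 instead) and that the source pixels are 0x00RRGGBB. PackRgb565 and ConvertRow are hypothetical names, and this is not a description of what GDI does internally.

```cpp
#include <cassert>
#include <cstdint>

// Pack one 0x00RRGGBB pixel into a 16-bit 5-6-5 value.
// ASSUMPTION: 5 bits red, 6 bits green, 5 bits blue, red in the top bits.
std::uint16_t PackRgb565(std::uint32_t xrgb)
{
    const std::uint32_t r = (xrgb >> 16) & 0xFF;
    const std::uint32_t g = (xrgb >> 8) & 0xFF;
    const std::uint32_t b = xrgb & 0xFF;
    // Keep only the most significant bits of each channel.
    return static_cast<std::uint16_t>(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

// Convert a row of 32-bit pixels to 16-bit pixels.
void ConvertRow(const std::uint32_t* src, std::uint16_t* dst, int count)
{
    assert(count >= 0);
    for (int i = 0; i < count; ++i) {
        dst[i] = PackRgb565(src[i]);
    }
}
```

This work is done for every destination pixel when the formats differ, so it adds up at fullscreen sizes.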
Darwen.
-
April 27th, 2005, 07:58 AM
#17
Re: StretchBlt slow, can I use directx somehow?
Hi Darwen,
How fast is your test when scaling up?
~Steve
-
April 27th, 2005, 10:17 AM
#18
Re: StretchBlt slow, can I use directx somehow?
Try scaling a 300x216 bitmap up to 1024x768 using StretchBlt. That is what I am doing, and it is too slow to get a decent frame rate.
I have been trying a few things, but it looks like I will have to go with DirectX or OpenGL. It sucks, though, because I know the GDI can do what I want: the Winamp CD+G/MP3+G plugin scales the CDG data to fullscreen using the GDI with very minimal CPU usage. I just don't know how they are doing it.
Cheers
-
April 27th, 2005, 11:25 AM
#19
Re: StretchBlt slow, can I use directx somehow?
How about some test code? What are your computer specs?
~Steve
-
April 27th, 2005, 03:28 PM
#20
Re: StretchBlt slow, can I use directx somehow?
Scaling up is about the same speed as scaling down really. Around about 4 ms from my tests. That's on a 2.8GHz P4, but it's in a laptop so it's not quite that fast.
Personally I've never had a problem with speed with StretchBlt - I've just written my own algorithm in Assembler to do the stretch because I'm dealing primarily with 32-bit DIBs and the additional overhead of a conversion to a 16-bit display is removed by manipulating the bitmaps myself right until the last moment.
Presumably if I used MMX I could get even more speed out of it, but I've heard that the difference in speed between MMX and raw unwound Assembler is minimal on Pentium 3s and 4s because of their cache size. I read somewhere that MMX was primarily invented to overcome the cache size in Pentium 2s, which isn't a problem with modern processors because of better caching and the multiple instruction pipelines.
Darwen.
Last edited by darwen; April 27th, 2005 at 03:31 PM.
-
April 27th, 2005, 03:48 PM
#21
Re: StretchBlt slow, can I use directx somehow?
Hi Darwen,
Thanks for weighing in on this, I enjoyed your assembly tutorials. I would have difficulty believing that MMX in assembly is not faster than hand-rolled assembly for many tasks (note the caveat). For instance, take a look at:
http://www.avisynth.org/IntermediateMmxOptimization
I would think that despite any latency, you will still be faster, as you are processing 4 pixels for every one pixel without MMX. In other, simpler, cases you can process 8 pixels at once. It would be interesting to see how close one could get to MMX with vanilla assembly in this case. Back OT, are you using GDI or GDI+ for your tests?
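To illustrate the "several pixels per instruction" idea without tying it to a particular compiler, here is a portable sketch that is NOT MMX: it packs eight 8-bit values into an ordinary 64-bit integer and averages all eight at once. The function names are made up for this example. Real MMX code would do the equivalent with 64-bit MMX registers and dedicated packed instructions.

```cpp
#include <cassert>
#include <cstdint>

// Average eight packed 8-bit values at once (rounding down) using plain
// 64-bit integer operations. The mask clears the low bit of every byte so
// the shift cannot leak a bit from one byte into its neighbour.
std::uint64_t AverageBytes8(std::uint64_t a, std::uint64_t b)
{
    const std::uint64_t mask = 0xFEFEFEFEFEFEFEFEULL;
    return (a & b) + (((a ^ b) & mask) >> 1);
}

// Reference version: the same result computed one byte at a time.
std::uint64_t AverageBytes8Scalar(std::uint64_t a, std::uint64_t b)
{
    std::uint64_t result = 0;
    for (int i = 0; i < 8; ++i) {
        const std::uint64_t x = (a >> (8 * i)) & 0xFF;
        const std::uint64_t y = (b >> (8 * i)) & 0xFF;
        result |= ((x + y) / 2) << (8 * i);
    }
    assert(result == AverageBytes8(a, b)); // both versions must agree
    return result;
}
```

The packed version does in a handful of operations what the scalar loop does in eight iterations, which is the kind of gain being discussed here.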
~Steve
EDIT: I have read that compiler intrinsics are slower than assembly, so perhaps in that case it is true that MMX is slower than assembly.
Last edited by diehardii; April 27th, 2005 at 04:02 PM.
-
April 27th, 2005, 06:13 PM
#22
Re: StretchBlt slow, can I use directx somehow?
MMX does things in parallel using registers that are aliased onto those of the floating point processor (the FPU).
The 'emms' instruction marks the FPU's registers as empty again after MMX instructions, so that floating point code can use them.
However, the overhead of using the floating point processor is outweighed by the multiple instruction pipelines in modern processors, provided the source assembly is designed to take advantage of them. Apparently, according to my research at any rate.
If you're using DirectX then you're using bitmap manipulation capabilities on the graphics card which will of course be very much faster.
However, a word of warning. As I believe I've already said, DirectX9 assumes that the bitmap stretching and other functions are provided by the hardware and doesn't have a software fallback.
If you use DirectX9 you must be aware that anyone using a graphics card which doesn't support hardware stretching (say earlier than an NVidia 64MB card) will not be able to run your application.
Darwen.
Last edited by darwen; April 27th, 2005 at 06:17 PM.