-
February 19th, 2013, 10:23 AM
#1
Bilinear Resampling Problem
Hi all, I hope that someone here can help me out, as I'm racking my brain over this. Something is telling me that it should be a relatively simple problem to solve, but I just can't see the wood for the trees.
In short, I'm converting a floating point bilinear image resampling routine into one that only uses fixed point arithmetic. I've gotten rid of nearly all the floats now, in fact all but one, and the results at the moment are indistinguishable from the floating point version. It's a maths issue really, so I hope that there's a whizz on here who can help. Some pseudocode goes like this.
Code:
for( int xx=0; xx<ow; xx++ )
{
    int_center = (ccx >> 16);
    int temp = xx * 2;
    for (j = int_center; j <= int_center + 1; j++)
    {
        t_weight = abs((int)(float)(((int)ccx - (j << 16)) * FILTER_FACTOR));
        int_weight = t_weight < 65536 ? 65536 - t_weight : 0;
        hv_pixel[temp + index] = j;
        int_hv_weight[temp + index] = int_weight;
        ++index;
    }/* j */
    ccx += dx;
    index = 0;
}
Where ccx is an integer error accumulator that gives me a scaled (16.16 fixed point) integer. Shifting it down by 16 gives me the source pixel I need to be working on. The first line inside the inner loop is where I have the last remaining float. FILTER_FACTOR is essentially a percentage by which I scale the error accumulator to the correct amount.
For example.
ccx = 98304, which is a value of 1.5 when shifted down by 16 bits. Obviously I can't just shift it, because it will truncate and I lose the precision. Let's say FILTER_FACTOR is 39321, which is roughly 60% of 1 (65536). So what I'd like to know is: is it possible to use FILTER_FACTOR as an integer and do some fancy integer math to scale the result of (ccx - (j<<16)) by the amount that FILTER_FACTOR represents, in this example 60%, effectively getting 60% of (ccx - (j<<16))? At the moment FILTER_FACTOR is still a float, and therefore 0.6, which of course works just fine.
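One common way to do this kind of scaling is to store the factor as a 16.16 fixed point integer, multiply in a 64-bit intermediate, and shift the product back down by 16. The sketch below is only an illustration of that idea, not the routine from this thread: the names kOne, ScaledDistance and TentWeight are made up for the example, and it assumes ccx and the factor are both non-negative 16.16 values. Taking the absolute value before the multiply means the shift only ever sees a non-negative number, which matches the truncate-then-abs behaviour of the float version.

```cpp
#include <cstdint>

// 1.0 in 16.16 fixed point (hypothetical helper constant).
const std::int32_t kOne = 1 << 16;

// |ccx - j| scaled by filter_factor, all in 16.16 fixed point.
// filter_factor is e.g. 39321, which is roughly 0.6 * 65536.
inline std::int32_t ScaledDistance(std::int32_t ccx, std::int32_t j,
                                   std::int32_t filter_factor) {
    // Widen first so neither the subtraction nor the product can overflow.
    std::int64_t diff = static_cast<std::int64_t>(ccx) -
                        static_cast<std::int64_t>(j) * kOne;
    if (diff < 0) diff = -diff;
    // 16.16 * 16.16 gives 32.32; shifting by 16 returns to 16.16.
    return static_cast<std::int32_t>((diff * filter_factor) >> 16);
}

// Triangle (tent) weight: 1.0 at distance 0, falling to 0 at distance 1.0.
inline std::int32_t TentWeight(std::int32_t ccx, std::int32_t j,
                               std::int32_t filter_factor) {
    const std::int32_t t = ScaledDistance(ccx, j, filter_factor);
    return t < kOne ? kOne - t : 0;
}
```

With ccx = 98304 (1.5), j = 1 and a factor of 39321, the scaled distance is 19660, which is about 0.3 in 16.16, so the weight comes out at 45876, roughly 0.7.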
Many thanks for all the replies in advance.
-
February 19th, 2013, 02:36 PM
#2
Re: Bilinear Resampling Problem
Quote:
In short I'm converting a floating point bilinear image resampling routine into one that only uses fixed point arithmetic
How are you doing that when you're casting to float?
Code:
abs((int)(float)(((int)ccx - (j << 16)) * FILTER_FACTOR));
You've introduced floating point as soon as you cast.
Regards,
Paul McKenzie
-
February 20th, 2013, 08:25 AM
#3
Re: Bilinear Resampling Problem
1) WHY would you convert from floating point to fixed point integer? On a PC at least, the FPU version will likely be faster.
2) When done right, you can do this with SSE and work on several pixels at a time.
Fixed point has its uses, but only if you can guarantee that all values and all intermediates will be "reasonably close" to 0.0 (or whatever actual value happens to be your zero-point) and you have enough precision/bits in both the integer part and the fractional part. I have seen more than one implementation of fixed point that failed because the programmers did not account for the fact that intermediate numbers in the calculation overflowed/underflowed.
You want your scaling to be a power of 2 so you can use shifts to scale instead of multiplications/divides.
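To make the intermediate overflow point concrete, here is a small sketch with a power-of-2 scale of 65536 (16.16). The function names MulNarrow and MulWide are invented for this illustration. Multiplying two 16.16 values produces a result with 32 fractional bits, so a 32-bit intermediate silently loses the high bits as soon as the true product reaches 1.0 or more, while a 64-bit intermediate keeps them until the shift brings the value back to 16.16. The example assumes a platform where unsigned int is 32 bits wide.

```cpp
#include <cstdint>

// 16.16 multiply with a 32-bit intermediate. The product wraps modulo 2^32
// before the shift, so this is only correct while a * b fits in 32 bits.
inline std::uint32_t MulNarrow(std::uint32_t a, std::uint32_t b) {
    return (a * b) >> 16;
}

// Same multiply with a 64-bit intermediate. The shift by 16 is the
// power-of-2 rescale, so no division is needed.
inline std::uint32_t MulWide(std::uint32_t a, std::uint32_t b) {
    return static_cast<std::uint32_t>(
        (static_cast<std::uint64_t>(a) * b) >> 16);
}
```

For 0.5 * 0.5 (32768 * 32768) both versions return 16384, which is 0.25. For 3.0 * 2.0 (196608 * 131072) the wide version returns 393216, which is 6.0, while the narrow version returns 0 because the whole product wrapped away.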
-
February 21st, 2013, 11:50 AM
#4
Re: Bilinear Resampling Problem
Hey again, all finished with a bit of time spent. I think the result is quite acceptable given that it's all integer arithmetic. I'm not an asm programmer, but I had some fun trying to port it to MMX. It's a bit all over the place as I was just playing around, so it's not production-worthy code; however, I thought I'd post it here for people to have a look at. In my tests it can upsample a 32 x 32 pixel image to 240 x 240 100 times in 0.08 seconds. That beats GDI Plus and the quality is acceptable, well, for my needs anyway. Here it is, have fun...
Code:
void InternalResample(unsigned int * src, int w1, int h1, unsigned int * dest, int w2, int h2)
{
    const unsigned int dx = (w1 << 16) / w2;    // Horizontal source step, 16.16 fixed point.
    const unsigned int dy = (h1 << 16) / h2;    // Vertical source step, 16.16 fixed point.
    const unsigned int RIGHT_BOUNDS = w1 - 1;
    const unsigned int BOTTOM_BOUNDS = h1 - 1;
    unsigned int ccy = 0;
    unsigned int ccx = 0;
    unsigned int ty = 0;
    unsigned int ad1 = 0;       // Index of X  (top left).
    unsigned int ad2 = 1;       // Index of X1 (top right).
    unsigned int ad3 = w1;      // Index of Y  (bottom left).
    unsigned int ad4 = w1 + 1;  // Index of Y1 (bottom right).
    bool switch34 = false;
    unsigned int *p_src = src;
    unsigned int *m_pdest = dest;
    __asm
    {
        mov esi, [p_src];
        pxor mm2, mm2;
        mov edi, [m_pdest];
        pxor mm7, mm7;
    }
    for( int yy=0; yy<h2; ++yy )
    {
        ty = (ccy >> 16) * w1;
        ccx = 0;
        // On the last source row, reuse that row instead of reading past the image.
        switch34 = ((ccy >> 16) >= BOTTOM_BOUNDS);
        __asm
        {
            movd mm5, [ccy];
            psrlw mm5, 8;
            pshufw mm5, mm5, 0;         // MM5 = DDV (top 8 bits of the y fraction)
        }
        for( int xx=0; xx<w2; ++xx )
        {
            // Work out the four source indices before sampling, so that the
            // first pixel of every row uses addresses for that row.
            ad1 = ty + (ccx >> 16);
            ad2 = ad1 + 1;
            ad3 = ad1 + w1;
            ad4 = ad2 + w1;
            if ((ccx >> 16) >= RIGHT_BOUNDS)
            {
                --ad4;
                --ad2;
            }
            if (switch34)
            {
                ad4 -= w1;
                ad3 -= w1;
            }
            __asm
            {
                movd mm4, [ccx];
                psrlw mm4, 8;
                pshufw mm4, mm4, 0;     // MM4 = DDU (top 8 bits of the x fraction)
                // Top row. MM6 = colour copy register.
                mov eax, [ad1];
                movd mm0, [esi + eax * 4];  // X
                mov eax, [ad2];
                movd mm3, [esi + eax * 4];  // X1
                punpcklbw mm0, mm7;     // X to words AA-RR-GG-BB
                punpcklbw mm3, mm2;     // X1 to words AA-RR-GG-BB
                pshufw mm6, mm0, 0e4h;  // Copy of X = XC
                pmullw mm3, mm4;        // X1 = (X1 * DDU)
                pmullw mm6, mm4;        // XC = (XC * DDU)
                psubw mm3, mm6;         // Scaled X1 - scaled XC
                psrlw mm3, 8;           // (Scaled X1 - scaled XC) / 256
                paddb mm0, mm3;         // X = X + ((X1 - XC) * DDU) / 256
                // Bottom row. MM6 = colour copy register.
                mov eax, [ad3];
                movd mm1, [esi + eax * 4];  // Y
                mov eax, [ad4];
                movd mm3, [esi + eax * 4];  // Y1
                punpcklbw mm1, mm7;     // Y to words AA-RR-GG-BB
                punpcklbw mm3, mm2;     // Y1 to words AA-RR-GG-BB
                pshufw mm6, mm1, 0e4h;  // Copy of Y = YC
                pmullw mm3, mm4;        // Y1 = (Y1 * DDU)
                pmullw mm6, mm4;        // YC = (YC * DDU)
                psubw mm3, mm6;
                psrlw mm3, 8;
                paddb mm1, mm3;         // Y = Y + ((Y1 - YC) * DDU) / 256
                // Blend the two rows vertically.
                pshufw mm6, mm0, 0e4h;  // Copy of XI = XIC
                pmullw mm1, mm5;        // YI = YI * DDV
                pmullw mm6, mm5;        // XIC = XIC * DDV
                psubw mm1, mm6;
                psrlw mm1, 8;
                paddb mm0, mm1;         // XI = XI + ((YI - XIC) * DDV) / 256
                packuswb mm0, mm7;      // Pack interpolated pixel back to bytes.
                movd [edi], mm0;        // Write interpolated pixel to memory.
                add edi, 4;
            }
            ccx += dx;
        } // xx
        ccy += dy;
    } // yy
    __asm emms;
}
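For anyone who wants to check the blend arithmetic without MMX, the per-pixel math can be sketched in portable C++ as below. This is an illustrative reference only, not part of the posted routine: the function names are made up, and it assumes 32-bit pixels with four 8-bit channels and blend weights u and v in the range 0..255, which would correspond to the top 8 bits of the fractional parts of ccx and ccy, i.e. (ccx >> 8) & 0xFF.

```cpp
#include <cstdint>

// Blend one 8-bit channel: a + ((b - a) * w) / 256, rounded down.
// Written as a weighted sum so every intermediate stays non-negative.
inline std::uint32_t LerpChannel(std::uint32_t a, std::uint32_t b,
                                 std::uint32_t w) {
    return (a * (256u - w) + b * w) >> 8;
}

// Blend two packed 4 x 8-bit pixels channel by channel.
inline std::uint32_t LerpPixel(std::uint32_t p, std::uint32_t q,
                               std::uint32_t w) {
    std::uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        const std::uint32_t a = (p >> shift) & 0xFFu;
        const std::uint32_t b = (q >> shift) & 0xFFu;
        out |= LerpChannel(a, b, w) << shift;
    }
    return out;
}

// Bilinear sample: blend the top pair and the bottom pair horizontally
// by u, then blend the two results vertically by v.
inline std::uint32_t BilinearSample(std::uint32_t top_left,
                                    std::uint32_t top_right,
                                    std::uint32_t bottom_left,
                                    std::uint32_t bottom_right,
                                    std::uint32_t u, std::uint32_t v) {
    const std::uint32_t top = LerpPixel(top_left, top_right, u);
    const std::uint32_t bottom = LerpPixel(bottom_left, bottom_right, u);
    return LerpPixel(top, bottom, v);
}
```

Because the weights only have 8 bits, a weight of 128 between 0 and 255 gives 127 rather than 128, which is the kind of small error that is usually invisible in the output image.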
Last edited by vis781; February 21st, 2013 at 11:59 AM.
-
February 21st, 2013, 11:53 AM
#5
Re: Bilinear Resampling Problem
Apologies, I meant to say that it can upsample a 32x32 image to 240x240 100 times in 0.08 secs.