Bilinear Resampling Problem

**vis781** · February 19th, 2013, 10:23 AM

Hi all, I hope that someone here can help me out, as I'm racking my brain over this. Something tells me it should be a relatively simple problem to solve, but I just can't see the wood for the trees.

In short, I'm converting a floating point bilinear image resampling routine into one that only uses fixed point arithmetic. I've gotten rid of nearly all the floats now, in fact all but one, and the results at the moment are indistinguishable from the floating point version. It's a maths issue really, so I hope there's a whizz on here who can help. Some pseudocode goes like this.

Code:

for( int xx=0; xx<ow; xx++ )
{
   int_center = (ccx >> 16);
   int temp = xx * 2;

   for (j = int_center; j <= int_center + 1; j++)
   {
     t_weight = abs((int)(float)(((int)ccx - (j << 16))  * FILTER_FACTOR));
     int_weight = t_weight < 65536 ? 65536 - t_weight : 0;

     hv_pixel[temp + index] = j;
     int_hv_weight[temp + index] = int_weight;
     ++index;
   }/* j */

   ccx += dx;
   index = 0;
}

Where ccx is an integer error accumulator that gives me a scaled integer. Shifting it down by 16 gives me the relative pixel I need to be working on. The first line inside the inner loop is where I have the last remaining float. FILTER_FACTOR is essentially a percentage by which I scale the error accumulator to the correct amount.
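To make the accumulator idea concrete, here is a minimal sketch of how a 16.16 value splits into a pixel index and a fractional part. The names `fixed_step`, `split` and `SamplePos` are made up for illustration and are not from the routine above.

```cpp
#include <cassert>
#include <cstdint>

// 16.16 fixed point: high 16 bits are the integer part, low 16 bits the fraction.
constexpr int kFracBits = 16;
constexpr std::uint32_t kOne = 1u << kFracBits;  // 1.0 == 65536

struct SamplePos {
    std::uint32_t pixel;     // index of the source pixel to the left
    std::uint32_t fraction;  // distance past that pixel, in 1/65536 units
};

// Step through the source per destination pixel, as a 16.16 value.
// Assumes src_size is below 65536 so the shift cannot overflow 32 bits.
inline std::uint32_t fixed_step(std::uint32_t src_size, std::uint32_t dst_size) {
    assert(dst_size > 0 && src_size < kOne);
    return (src_size << kFracBits) / dst_size;
}

// Split an accumulator into integer pixel index and fractional remainder.
inline SamplePos split(std::uint32_t acc) {
    return { acc >> kFracBits, acc & (kOne - 1) };
}
```

The fraction is what the shift throws away, so keeping it alongside the pixel index is what lets the weights stay in integers.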

For example.

ccx = 98303, which is roughly 1.5 when shifted down by 16 bits. Obviously I can't just shift it, because the fraction gets truncated and I lose the precision. Let's say FILTER_FACTOR is 39321, which is 60% of 1 (65536 at this scale). So what I'd like to know is: is it possible to use FILTER_FACTOR as an integer and do some fancy integer math to scale the result of (ccx - (j<<16)) by the amount that FILTER_FACTOR represents, in this example 60%? Effectively getting 40% of (ccx - (j<<16)). At the moment FILTER_FACTOR is still a float, and therefore 0.6, which of course works just fine.
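One common way to do this kind of scaling, sketched here under the assumption that FILTER_FACTOR is stored as a 16.16 integer (39321 for roughly 0.6), is to multiply in 64 bits and shift the product back down by 16. The function name `tent_weight` is hypothetical; it mirrors the two weight lines of the pseudocode rather than reproducing the original routine.

```cpp
#include <cassert>
#include <cstdint>

constexpr int kFracBits = 16;
constexpr std::int64_t kOne = std::int64_t{1} << kFracBits;  // 1.0 == 65536

// Integer-only version of:
//   t_weight   = abs((ccx - (j << 16)) * FILTER_FACTOR)
//   int_weight = t_weight < 65536 ? 65536 - t_weight : 0
// filter_factor is a 16.16 value (assumption), e.g. 39321 for about 0.6.
inline std::int32_t tent_weight(std::int32_t ccx, std::int32_t j,
                                std::int32_t filter_factor) {
    assert(filter_factor >= 0);
    // Distance from the sample position to pixel j, still in 16.16.
    std::int64_t distance = static_cast<std::int64_t>(ccx) - j * kOne;
    if (distance < 0) distance = -distance;
    // 16.16 * 16.16 gives 32.32; widen to 64 bits, then drop 16 fraction bits.
    std::int64_t scaled = (distance * filter_factor) >> kFracBits;
    return scaled < kOne ? static_cast<std::int32_t>(kOne - scaled) : 0;
}
```

Taking the absolute value before the shift keeps every shifted value non-negative, which sidesteps any question about how right shifts of negative numbers behave.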

Many thanks for all the replies in advance.

**Paul McKenzie** · February 19th, 2013, 02:36 PM

In short I'm converting a floating point bilinear image resampling routine into one that only uses fixed point arithmetic

How are you doing that when you're casting to float?

Code:

abs((int)(float)(((int)ccx - (j << 16))  * FILTER_FACTOR));

You've introduced floating point as soon as you cast.

Regards,

Paul McKenzie

**OReubens** · February 20th, 2013, 08:25 AM

1) WHY would you convert from floating point to fixed point integer? On a PC at least, the FPU version will likely be faster.
2) When done right, you can do this with SSE and work on several pixels at a time.

Fixed point has its uses, but only if you can guarantee that all values and all intermediates will be "reasonably close" to 0.0 (or whatever actual value happens to be your zero-point) and you have enough precision/bits in both the integer part and the fractional part. I have seen more than one implementation of fixed point that failed because the programmers did not account for intermediate numbers in the calculation overflowing or underflowing.

You want your scaling to be a power of 2 so you can use shifts to scale instead of multiplications/divides.
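As a small illustration of both points (a sketch, with made-up function names `mul_narrow` and `mul_wide`): with a power-of-2 scale the rescale after a multiply is a single shift, but the intermediate product needs twice the bits of the operands.

```cpp
#include <cassert>
#include <cstdint>

constexpr int kFracBits = 16;  // scale factor 2^16, so rescaling is a shift

// Product kept in 32 bits: silently wraps once a * b reaches 2^32.
inline std::uint32_t mul_narrow(std::uint32_t a, std::uint32_t b) {
    return (a * b) >> kFracBits;
}

// Product widened to 64 bits before the shift: the intermediate cannot wrap.
inline std::uint32_t mul_wide(std::uint32_t a, std::uint32_t b) {
    std::uint64_t product = static_cast<std::uint64_t>(a) * b;
    std::uint64_t result = product >> kFracBits;
    assert(result <= UINT32_MAX);  // the final value must still fit in 16.16
    return static_cast<std::uint32_t>(result);
}
```

For example, 1.5 * 0.5 gives 0.75 in both versions, while 2.0 * 2.0 needs 35 bits for the intermediate, so only the widened version returns 4.0.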

**vis781** · February 21st, 2013, 11:50 AM

Hey again, all finished with a bit of time spent. I think the result is quite acceptable given that it's all integer arithmetic. I'm not an asm programmer, but I had some fun trying to port it to MMX. It's a bit all over the place as I was just playing around, so it's not production-worthy code; however, I thought I'd post it here for people to have a look at. In my tests it can upsample a 32 x 32 pixel image to 240 x 240 100 times in 0.08 seconds. That beats GDI Plus and the quality is acceptable, well, for my needs anyway. Here it is, have fun...

Code:

void InternalResample(unsigned int * src, int w1, int h1, unsigned int * dest, int w2, int h2)
{
  const unsigned int dx = (w1 << 16) / w2;
  const unsigned int dy = (h1 << 16) / h2;

  const unsigned int RIGHT_BOUNDS = w1 - 1;
  const unsigned int BOTTOM_BOUNDS = h1 - 1;

  unsigned int ccy = 0;
  unsigned int ccx = 0;
  unsigned int ty = 0;

  unsigned int ad1 = 0;
  unsigned int ad2 = 1;
  unsigned int ad3 = w1;
  unsigned int ad4 = w1 + 1;

  bool switch34 = false;
  

  unsigned int *p_src = src;
  unsigned int *m_pdest = dest;


  __asm 
  {
	  mov esi, [p_src];
	  pxor mm2, mm2;

	  mov edi, [m_pdest];
	  pxor mm7, mm7;
  }

  for( int yy=0; yy<h2; ++yy )
  {
	  ty = (ccy >> 16) * w1;
	  ccx = 0;

	  __asm
	  {
		  movd		mm5, [ccy];
		  psrlw		mm5, 8;
		  pshufw	mm5, mm5, 0;		// MM5 = DDV
	  }

	  for( int xx=0; xx<w2; ++xx )
	  {  
		  __asm
		  {
			  
			  movd		mm4, [ccx];
			  psrlw		mm4, 8;	

			  pshufw	mm4, mm4, 0;		// MM4 = DDU
			  //////////////////////////////////////////////////////////////////
			  // MM6 = Color copy register;
			  ////////////////////////////////////////////////////////////////////
			  

			  mov eax, [ad1];
			  movd		mm0, [esi + eax * 4];	// X

			  mov eax, [ad2];
			  movd		mm3, [esi + eax * 4];	// X1

			  punpcklbw mm0, mm7;				// X to	  Words			AA-RR-GG-BB
			  punpcklbw mm3, mm2;				// X1 to Words			AA-RR-GG-BB

			  pshufw    mm6, mm0, 0e4h;			// Copy of X - XC.	
			  pmullw	mm3, mm4;				// X1 = (X1 * DDU)
			  
			  pmullw	mm6, mm4;				// XC =  (XC * DDU)	
			  psubw		mm3, mm6;				// Scaled X1 - Scaled XC

			  psrlw		mm3, 8					// (Scaled X1 - Scaled XC) / 256
			  paddb		mm0, mm3;				// X = (X + ((X1 * DDU) - (XC * DDU) / 256)		


			  /////////////////////////////////////////////////////////////////
			  // MM6 = Color copy register
			  ///////////////////////////////////////////////////////////////////

			  mov eax, [ad3]
			  movd		mm1, [esi + eax * 4];		// Y

			  mov eax, [ad4];
			  movd		mm3, [esi + eax * 4];	// Y1 **

			  punpcklbw mm1, mm7;				// Y to WORDS		AA-RR-GG-BB
			  punpcklbw mm3, mm2;				// Y1 to WORDS		AA-RR-GG-BB **

			  pshufw    mm6, mm1, 0e4h;			// Copy of Y - YC
			  pmullw	mm3, mm4;				// Y1 = (Y1 * DDU) **

			  pmullw	mm6, mm4;				// YC = (YC * DDU)
			  psubw		mm3, mm6;

			  psrlw		mm3, 8;
			  paddb		mm1, mm3;				// Y = (Y + ((Y1 * DDU) / 256) - ((YC * DDU) / 256)


			  ////////////////////////////////////////////////////////

			  pshufw    mm6, mm0, 0e4h;			// Copy of XI = XIC
			  pmullw	mm1, mm5;				// YI = YI * DDV
			  
			  pmullw	mm6, mm5;				// XIC = XIC * DDV
			  psubw		mm1, mm6;

			  psrlw		mm1, 8;
			  paddb		mm0, mm1;				// XI = XI - XIC

			  packuswb	mm0, mm7;				// Pack Interpolated X.

			  movd		[edi], mm0;				// Write X Interpolated to Memory.
			  add		edi, 4;
		  }
		  ccx += dx;

		  ad1 = ty + (ccx >> 16);
		  ad2 = ad1 + 1;
		  ad3 = ad1 + w1;
		  ad4 = ad2 + w1;

		  if ((ccx >> 16) >= RIGHT_BOUNDS)
		  {
			  --ad4;
			  --ad2;
		  }

		  if (switch34)
		  {
			  ad4 -= w1;
			  ad3 -= w1;
		  }
		 
		} // xx
	  
	  ccy += dy;

	  switch34 = ((ccy >> 16) >= BOTTOM_BOUNDS);
		
		
  } // yy
  __asm emms;

}
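For anyone who wants to check the MMX routine against something easier to read, here is a plain scalar sketch of the same idea: a 16.16 accumulator per axis, the top 8 bits of the fraction used as the blend weight, two horizontal blends and then one vertical blend. It is not a translation of the assembly and will not necessarily match it bit for bit; `lerp_pixel` and `resample_bilinear` are names invented for this sketch, and edge pixels are simply clamped.

```cpp
#include <cassert>
#include <cstdint>

// Blend two packed 8-bit-per-channel pixels; w is in [0, 255], 0 returns p.
inline std::uint32_t lerp_pixel(std::uint32_t p, std::uint32_t q, unsigned w) {
    std::uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        unsigned a = (p >> shift) & 0xFF;
        unsigned b = (q >> shift) & 0xFF;
        // (a * (256 - w) + b * w) / 256, all operands non-negative.
        out |= (((a * (256 - w) + b * w) >> 8) & 0xFF) << shift;
    }
    return out;
}

// Scalar bilinear resample; assumes w1 and h1 are below 65536.
void resample_bilinear(const std::uint32_t* src, int w1, int h1,
                       std::uint32_t* dst, int w2, int h2) {
    assert(w1 > 0 && h1 > 0 && w2 > 0 && h2 > 0);
    const std::uint32_t dx = (static_cast<std::uint32_t>(w1) << 16) / static_cast<std::uint32_t>(w2);
    const std::uint32_t dy = (static_cast<std::uint32_t>(h1) << 16) / static_cast<std::uint32_t>(h2);

    std::uint32_t ccy = 0;
    for (int yy = 0; yy < h2; ++yy, ccy += dy) {
        const int y0 = static_cast<int>(ccy >> 16);
        const int y1 = (y0 + 1 < h1) ? y0 + 1 : y0;   // clamp at the bottom edge
        const unsigned wy = (ccy >> 8) & 0xFF;        // top 8 bits of the fraction

        std::uint32_t ccx = 0;
        for (int xx = 0; xx < w2; ++xx, ccx += dx) {
            const int x0 = static_cast<int>(ccx >> 16);
            const int x1 = (x0 + 1 < w1) ? x0 + 1 : x0;  // clamp at the right edge
            const unsigned wx = (ccx >> 8) & 0xFF;

            const std::uint32_t top = lerp_pixel(src[y0 * w1 + x0], src[y0 * w1 + x1], wx);
            const std::uint32_t bot = lerp_pixel(src[y1 * w1 + x0], src[y1 * w1 + x1], wx);
            dst[yy * w2 + xx] = lerp_pixel(top, bot, wy);
        }
    }
}
```

A quick sanity check is to resample a solid-colour image: every output pixel should come back as exactly the input colour.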

**vis781** · February 21st, 2013, 11:53 AM

Apologies, I meant to say that it can upsample a 32x32 image to 240x240 100 times in 0.08 secs.
