performance in switch

**asivakumar** · March 2nd, 2004, 04:20 AM

Hello ,

I have a switch case statement .. as below

switch(data)
{
case 1:
.....
break;
case 2:
.....
break;
case 3:
.....
break;
default:
.....
break;
}

my data will be 2 ,70 % of times
or 1, 20 % of times
or 3 , 10 % of times ,

so if i change the order as below ,will it increase the performance ??

switch(data)
{
case 2:
.....
break;
case 3:
.....
break;
case 1:
.....
break;
default:
.....
break;
}

**nsh123** · March 2nd, 2004, 04:30 AM

Hi,
I dont think there is some standard for how the code is generated for a switch case stmt.

If you have a if else if block and if you keep the more probable conditions ahead then it surely helps.

Hope this helps

**AvDav** · March 2nd, 2004, 04:31 AM

I think yes, if your compiler supports asm code generation, see the asm file, (VC++ 6 has this feature)...

**gstercken** · March 2nd, 2004, 05:10 AM

Originally posted by asivakumar
so if i change the order as below ,will it increase the performance ??

That depends completely on the compiler implementation. The standard says nothing about how a switch statement will be compiled. One compiler might always generate a sequence of tests and branches, while most compilers will optimize a switch statement by using the case values for calculated jumps if possible - or a combination of both. So it is really compiler dependent, and you shouldn't rely on it.

**Fandu_Nagesh** · March 2nd, 2004, 05:39 AM

Hi,
I agree with gstercken. I feel that its mostly dependant on the compiler.
There are 2 different type of assemblies generated:

1. Most of the compiler generates IF-Else type assembly for Switch-Case statement.
In that case, you might get performance benifit, if you are able to put more probable "Case" at first in switch.
I mean if Case 3;s probability is more than others then put it first. Like.
Switch(...)
{
case 3:
case 2:
case 1:
}

2. If compiler generates offset based assembly for switch case statement. Then there will be same effort to reach to any case. (as direct address calculation happens).

I found that VC++ 6 compiler generates offset based assembly. So there will be same effort to reach to any case. i.e. no benifit to specific cases.

Hope so this helps
Regards
Nagesh

**gstercken** · March 2nd, 2004, 06:35 AM

Originally posted by Fandu_Nagesh
I found that VC++ 6 compiler generates offset based assembly. So there will be same effort to reach to any case. i.e. no benifit to specific cases.

Most compilers do that - but not generally. It depends on the case values. For example, for code like

Code:

switch(val)
{
  case 0:
  ...
  case 1:
  ...
  case 2:
  ...
  case 999:
  ...
  case 9999:
  ...
}

compilers a likely to generate a calculated jump for the first three branches, and handle the other two with tests and conditional branches.

**gstercken** · March 2nd, 2004, 07:30 AM

I did a quick test (VC++ 6.0, same results with and without optimizations turned on):

A switch with up to three consecutive case values like

Code:

case 0:
case 1:
case 2:

leads to individual tests for the single values:

Code:

0040184D   mov         ecx,dword ptr [ebp-4]
00401850   mov         dword ptr [ebp-0Ch],ecx
00401853   cmp         dword ptr [ebp-0Ch],0
00401857   je          main+47h (00401867)
00401859   cmp         dword ptr [ebp-0Ch],1
0040185D   je          main+50h (00401870)
0040185F   cmp         dword ptr [ebp-0Ch],2
00401863   je          main+59h (00401879)
00401865   jmp         main+60h (00401880)

With more than three consecutive case values, a jump table is generated, e.g.

Code:

case 0:
case 1:
case 2:
case 3:

compiles to:

Code:

0040184D   mov         ecx,dword ptr [ebp-4]
00401850   mov         dword ptr [ebp-0Ch],ecx
00401853   cmp         dword ptr [ebp-0Ch],3
00401857   ja          $L7389+7 (00401885)
00401859   mov         edx,dword ptr [ebp-0Ch]
0040185C   jmp         dword ptr [edx*4+401899h]

Now when we add two case values which are "out of range", individual tests are again generated for all of the cases:

Code:

case 0:
case 1:
case 2:
case 3:
case 999:
case 9999:

compiles to

Code:

0040184D   mov         ecx,dword ptr [ebp-4]
00401850   mov         dword ptr [ebp-0Ch],ecx
00401853   cmp         dword ptr [ebp-0Ch],3
00401857   jg          main+53h (00401873)
00401859   cmp         dword ptr [ebp-0Ch],3
0040185D   je          main+82h (004018a2)
0040185F   cmp         dword ptr [ebp-0Ch],0
00401863   je          main+67h (00401887)
00401865   cmp         dword ptr [ebp-0Ch],1
00401869   je          main+70h (00401890)
0040186B   cmp         dword ptr [ebp-0Ch],2
0040186F   je          main+79h (00401899)
00401871   jmp         main+9Bh (004018bb)
00401873   cmp         dword ptr [ebp-0Ch],3E7h
0040187A   je          main+8Bh (004018ab)
0040187C   cmp         dword ptr [ebp-0Ch],270Fh
00401883   je          main+94h (004018b4)
00401885   jmp         main+9Bh (004018bb)

Note however that the tests are already split up into two parts: For values greater than 3, and for the rest. Neitherless, no jump table is produced for the 4 consecutive values. Only if we add a fifth consecutive value, a jump table is again generated for 0..4, and the other two (999 and 9999) are tested individually:

Code:

case 0:
case 1:
case 2:
case 3:
case 4:
case 999:
case 9999:

compiles to

Code:

0040184D   mov         ecx,dword ptr [ebp-4]
00401850   mov         dword ptr [ebp-0Ch],ecx
00401853   cmp         dword ptr [ebp-0Ch],3E7h
0040185A   jg          main+55h (00401875)
0040185C   cmp         dword ptr [ebp-0Ch],3E7h
00401863   je          $L7390+9 (004018ad)
00401865   cmp         dword ptr [ebp-0Ch],4
00401869   ja          $L7390+19h (004018bd)
0040186B   mov         edx,dword ptr [ebp-0Ch]
0040186E   jmp         dword ptr [edx*4+4018D1h]
00401875   cmp         dword ptr [ebp-0Ch],270Fh
0040187C   je          $L7390+12h (004018b6)
0040187E   jmp         $L7390+19h (004018bd)

So you see, what the compiler actually generates depends on many factors, like the number of consecutive values, the "gaps" between the values etc. And of course, every compiler will employ its own strategy for optimizing switch statements.

Thread: performance in switch

Thread Tools

Display

performance in switch

Re: performance in switch

Switch Case Assembly

Re: Switch Case Assembly

Posting Permissions