Click to See Complete Forum and Search --> : How can I get a bit's position in the int?


roadrunner17
September 12th, 2002, 09:45 AM
Is there a way to get a bit's position in a datatype? For example, if int var1 = 64 how can I know that it is in position 7, or if var1 = 16 that it is in position 5, or any position in the 32 bit word (there will only be 1 bit set at a time)?

I would like to get this preferably without calling a squareroot function (hopefully using just bit operators). Computing this index needs to be as fast as possible as I will have to do this countless times on 384 bits at a time.

Thanks for your help,

Yves M
September 12th, 2002, 10:18 AM
Is there always only one bit set in your int ?

A kind of easy way to do this would be to use the following (it is probably not the fastest, but can be OK if the compiler optimizes properly).


int var1;
int i;

i = 0;
while (var1 != 0) {
var1>>= 1;
i++;
}


Then i will be the bit position + 1 or 0 if var1 was zero already.

There is definitely a way to do this faster in assembler, but I can't remember all the instructions :/
(something like sar ax, 1 and then test for overflow)

roadrunner17
September 12th, 2002, 10:37 AM
Thanks for the reply, I thought about doing this but it involves potentially looping 32 times for each int.

In this particular function I'm trying to write, I like to get the position of all the bits set with an int, i perform a bit operation that will isolate each bit 1 at a time. The loop that does this will execute once for each bit set.

I guess shifting in a loop is the only way to do this. Is there a way to use a lookup table, other than creating an array of 4294967295 values. :-)


Thanks again.

Yves M
September 12th, 2002, 10:44 AM
Ok, here is some "optimized" assembler code for doing this (x86):
eax contains var1 and ecx contains i and hence the result. [I updated this to eliminate one test]

xor ecx, ecx
mov eax, var1
test eax, eax
je label_end
label_loop:
inc ecx
sar eax, 1
jnz label_loop
label_end:
mov i, ecx


What are you trying to do with the bits that are set ? Most of the answer will depend on this....

Say you want to set booleans... Then you can use the & operator and a lot of if statements (or triadics)...

bBitOne = (var1 & 0x00000001 ? TRUE : FALSE);
bBitTwo = (var1 & 0x00000002 ? TRUE : FALSE);
...


Or if you have the booleans in an array, you could use a loop

int i;
unsigned long ulbitpos:

ulbitpos = 1;
for (i = 0; i < 32; i ++) {
bBit[i] = (var1 & ulbitpos ? TRUE : FALSE);
ulbitpos <<= 1;
}

Yves M
September 12th, 2002, 10:48 AM
Originally posted by roadrunner17
Thanks for the reply, I thought about doing this but it involves potentially looping 32 times for each int.


Well that would not really be that slow :/

The assembler version takes 5785 microseconds for 65536 times on my machine.

Unrolling the loop brings that down to 4010 microseconds :)

Nikolai Borissov
September 12th, 2002, 06:03 PM
Here is a solution without loop. Hope it will be faster, but you should verify anyway.

inline int get_int_1_pos(int i)
{ int n=-1;
if(i&0xffff0000) {n+=16;i>>=16;};
if(i&0x0000ff00) {n+=8;i>>=8;};
if(i&0x000000f0) {n+=4;i>>=4;};
if(i&0x0000000c) {n+=2;i>>=2;};
return n+i;
}

Yves M
September 12th, 2002, 06:05 PM
sneaky, the old divider and conquer trick ;)

Oh, and yes it will definitely be much faster than my loops ;)

AnthonyMai
September 13th, 2002, 01:16 AM
Yves M:

Do NOT show off any assembly code if you don't know assembly code. This won't help you pass any job interview :-)

Your "assembly" code easily takes 100 times more CPU cycles than it needs to get the result the OP wants. Then, what's your point in writting assembly code?

To find position of the first none zero bit of a 32 bits integer, counting either from left hand (MSB) or right hand(LSB), there are two single assembly instruction for this purpose:

BSF
or BSR

Now, as a homework assignment, go figure out how to do it using BSF or BSR. Keep in mind, no more than 3 lines of code.

cup
September 13th, 2002, 05:58 AM
BSF/BSR/BTC/BTR only exist from 386 onwards. Those of us, like myself who use geriatric 286s on DOS are unable to use these instructions.:)

Yves M
September 13th, 2002, 06:09 AM
AnthonyMai,

DO NOT CRITICIZE when you don't have a valid reason for doing it. I have not stated that I'm an assembly guru.

The point you make about using BSF or BSR is valid, so just say that. Anyways, it's no likely that I'll ever want a job as an assembly programmer anyways ;-)

cup, if I were using a 286, I couldn't use eax et al ;) (I guess, or is a 286 also 32 bit ?)

cup
September 13th, 2002, 06:19 AM
Yes EAX is also 32 bit and 386 upwards.

286s are strange beasties. The PCs only ever used them in 8086 mode so the 286 mode was never exploited except in RMX, which is an Intel OS.

386 added a lot to the instruction set: bit arrays, modifers etc. It even has stuff for adding breakpoints in ROM.

Yves M
September 13th, 2002, 07:27 AM
And just exactly why do have to keep on using a 286 ? :confused:

cup
September 13th, 2002, 09:35 AM
Home project - it is a control system that waters the garden and controls the air flow in the greenhouse. Nobody will pinch an old 286: you can leave it in the shed for years and know it is perfectly safe.

Yves M
September 13th, 2002, 09:38 AM
Good point :)

[edit :

For roadrunner, here is the assembler code that finds the least significant bit set in var1


BSF ecx, var1
jnz label_found
mov ecx, 0xFFFFFFFF
label_found:
mov i, ecx


Sorry, Anthony, you'll have to show me how to do it in three lines of code though ;)

I guess you are thinking about something along the lines of :

mov ecx, 0xFFFFFFFF
BSF ecx, var1
mov i, ecx

But according to my assembler reference, if var1 is zero, then the contents of ecx are undefined :/ On my processor it seems to work though (Pentium III mobile), but I'm not sure that you can rely on the fact that ecx is not modified.

Oh, and yes, timing this gives me 270 microseconds for 65536 times. So that's a decent improvement :)
]

AnthonyMai
September 13th, 2002, 10:47 AM
DO NOT CRITICIZE when you don't have a valid reason for doing it. I have not stated that I'm an assembly guru.

The point you make about using BSF or BSR is valid, so just say that. Anyways, it's no likely that I'll ever want a job as an assembly programmer anyways ;-)


Sorry for being a bit hash on you, but had you not criticize me before in no good way for no good reason, I would have said things nicer. Compare with other people, at least you are willing to write SOME assembly code some times.

I have not stated I am an assembly guru either. And I surely do not want a job as an assembly programmer. And surely I don't think there are still any pure assembly programming job around. But every C++/C programmer needs to know the correct way of how to speed things up here and there using the right assembly, when the needs raise. And raises often the needs indeed.

Controling garden hoses and air conditioning like CPU do does not require much CPU power so a 286 is more than needed. But since the OP wants somethiing as fast as possible I would assuming he is no longer using 286. I even have a pretty good guess exactly what his application is and why he needs speed.


But according to my assembler reference, if var1 is zero, then the contents of ecx are undefined :/ On my processor it seems to work though (Pentium III mobile), but I'm not sure that you can rely on the fact that ecx is not modified.


"Undefined" means it's never touched so it was left at an undefined value, meaning whatever ecx was before, it was left unchanged. So, yes, you can count on it that ecx is not modified if bsf or bsr find no set bit.

The original OP wants to determine the bit in a 384 bit data structure, not just a 32 bit integer. You should be able to do it in 6 lines of assembly code. I will let you pass the job interview if you do this homework correctly :-)

Paul McKenzie
September 13th, 2002, 11:20 AM
Compare with other people, at least you are willing to write SOME assembly code some timesI'll forgive Yves for posting the original code, but I can't let your statements go without being answered.

In the context of a C++ forum, assembly language would be off-topic. The last thing you need is a C++ forum turning into an Intel assembly language forum. When the moderators start an "Assembly Language" forum, then showing assembly code would be "on-topic".
But every C++/C programmer needs to know the correct way of how to speed things up here and there using the right assembly, when the needs raise. And raises often the needs indeed.Ever hear of cross-platform? Yeah, my little speed up using 80x86 will do wonders on the iMac. Note that this is just your opinion. Also, the OP didn't mention what processor they are using. It could be a Motorola 68K for all you know.

It doesn't really matter -- turning the forum over to assembly language is what you would like to see, and hopefully the moderators will step in and make sure it doesn't happen.

Regards,

Paul McKenzie

PaulWendt
September 13th, 2002, 02:25 PM
Originally posted by AnthonyMai
But every C++/C programmer needs to know the correct way of how to speed things up here and there using the right assembly, when the needs raise. And raises often the needs indeed.


****, Mai. You and I agree, but not completely. I, too, will resort
to assembly if I have to. I'm not good at it, but I can sometimes
make some very time-critical code faster because of it. So I do
what a lot of people do and hide that code in a function or
whatever. That works all well and good ... for my x86 platform.

What about my other platforms, though? We have to support
hp-ux, aix, sco, osf/digital unix [or whatever they call themselves],
redhat linux, and we're winding up support for VMS and MPE.
With these many platforms to support, you may notice diminishing
returns when trying to optimize parts of your code. That's why
I'm a strong advocate of McKenzie's "if it can be sped up by using
the standard, then it's a great solution" philosophy.

You're great when it comes to writing performance-critical code.
That's only a part of the puzzle, though. You lose points with
everyone around here because you get up on the soap box and
preach about how performance is king. I'm not about to go out
and learn assembly for all of these platforms. Granted some share
processors, but assembly is only one way to optimize outside the
standard. Using os-specific functions is another way.

So, you're great in your niche, but don't be too preachy to others
because not everyone is in the same position that you are.

--Paul

AnthonyMai
September 13th, 2002, 02:33 PM
Paul:

What's your point? This is a C++ board, but this board does not exactly censor or ban the _asm keyword. Beside it is part of the standard C++ to provide the inline assembly through the _asm keyword. A compiler has to be able to deal with the _asm keyword to be considered C++ complaint.

And what do you have to add to this thread? The OP wanted a solution and he wanted it as fast as possible. I provided the solution. And what do you have to provide? Tell the OP to do it using template?

Not every one needs cross-platform. And frankly, you shouldn't even started to talk about porting it to one million other platforms, when you don't even have a working solution for the current platform you work with.

You should postpone worry about porting it to iMac until the time you really need to port to iMac.

Andreas Masur
September 13th, 2002, 02:50 PM
Originally posted by AnthonyMai
And what do you have to add to this thread? The OP wanted a solution and he wanted it as fast as possible. I provided the solution. And what do you have to provide? Tell the OP to do it using template?
Well...then I must have missed something...because in none of your postings to this thread you provided the solution. You threw in some nice words like 'BSF' and 'BSR' but that was all...the solutions were provided by others...

Blame nobody else for something that you did not even fulfilled yourself...

AnthonyMai
September 13th, 2002, 02:56 PM
What about my other platforms, though? We have to support
hp-ux, aix, sco, osf/digital unix [or whatever they call themselves],
redhat linux, and we're winding up support for VMS and MPE.
With these many platforms to support, you may notice diminishing
returns when trying to optimize parts of your code. That's why
I'm a strong advocate of McKenzie's "if it can be sped up by using
the standard, then it's a great solution" philosophy.


I believe optimization is necessary only when it is necessary. When performance doesn't matter, it doesn't matter and there is no point optimizing it using assembly.

But I also believe with the current trend of the IT industry, more and more software development projects are becoming performance critical, and bottlenecks must be optimized to the best performance possible. If it can be optimized within C++ and it is good enough, you stay with C++, if that is not good enough, you go to C, if the best solution in C is still not good enough, you do optimization using assembly, if assembly is still not enough, you hand tune it using special instructions like MMX etc. If the best assembly code is still not good enough, you try to optimize it in hardware, i.e., invent special purpose processors which can do certain things much faster than regular, general purpose processors.

If you have many platforms to support, you have to sopport that many platforms. If performance is critical in each platform, you have to do your best to optimize for each platform. If you need assembly to optimize a MIPS based platform to get the desired performance, and no one in your group knows MIPS, then you've got to learn or hire some one who knows. If you can't hire any one and no one is willing to learn, then you have to drop your support for MIPS based system. It is that simple.

It's business. Business goes where money can be made. Business do NOT stop at where people lack the knowledge or are not willing to learn. Knowledgeable and skillful people can always be hired and people lacking the necessary skill can always be fired. That's how business operates.

Yves M
September 13th, 2002, 03:07 PM
Paul,

Thanks for letting my assembly code pass ;) I guess the point in this example is that it would be very inefficient on every machine to make this code portable. Many "portable" programs include a few functions which are platform dependent, so I don't think everything necessarily has to be platform independent.

I wonder how you would solve the OP's question in a portable way that's fast :confused: Nikolai's way is OK, but it's not quite as fast as using the capaibilities of the machine.

[the best I could come up with for the 384 bits is this :

lea edx, var1
sub edx, 4
mov ecx, 12
xor eax, eax
label_comp:
add edx, 4
BSF eax, [edx]
loopz label_comp
jz label_end
mov ebx, 11
sub ebx, ecx
shl ebx, 5
add eax, ebx
label_end:
mov i, eax

]

AnthonyMai
September 13th, 2002, 03:21 PM
Yves:
I can rate your homework as 60, out of 100 point scale.
Try this for 384 bits:


#include <stdio.h>
#include <memory.h>

typedef struct MYSTRUCT
{
unsigned int data[12];
} MYSTRUCT;

int main(int argc, char* argv[])
{
MYSTRUCT myData;
int nBitPos = 0;

memset(&myData, 0, sizeof(myData));

myData.data[11] = 2;

_asm {
mov eax, -1
mov ecx, 12
loop_bits:
bsf eax, [myData+ecx*4-4]
loopz loop_bits
shl ecx, 5
add eax, ecx
mov nBitPos, eax
}
// nBitsPos is now the bit position. -1 if bit never found.
}

Paul McKenzie
September 13th, 2002, 03:32 PM
Originally posted by AnthonyMai
Paul:

What's your point? This is a C++ board, but this board does not exactly censor or ban the _asm keyword. Beside it is part of the standard C++ to provide the inline assembly through the _asm keyword. A compiler has to be able to deal with the _asm keyword to be considered C++ complaint.

And what do you have to add to this thread? The OP wanted a solution and he wanted it as fast as possible. I provided the solution. And what do you have to provide? Tell the OP to do it using template?

Not every one needs cross-platform. And frankly, you shouldn't even started to talk about porting it to one million other platforms, when you don't even have a working solution for the current platform you work with.

You should postpone worry about porting it to iMac until the time you really need to port to iMac. What a roundabout load of nonsense this is. The forum discussses C++, not assembly language. __asm is not assembly language.

If you intend to make a C++ forum an assembly language forum, you are not abiding by what this forum is all about. If you don't like it, go to the moderators and have them start an assembly language forum. One thing myself and hopefully others will not let you get away with, and that is hijacking a C++ forum and posting assembly language as solutions.

Regards,

Paul McKenzie

AnthonyMai
September 13th, 2002, 04:32 PM
One thing myself and hopefully others will not let you get away with, and that is hijacking a C++ forum and posting assembly language as solutions.


posting assembly language as solutions is not the issue. Obviously when some one ELSE post assembly code, it was never an issue with you. And it only becomes an issue when I do the same. Paul, you have an agenda against me. That is the issue. You sweared countless times to ignore me but every time you jump right back in to argue with me, even when you have nothing else to add to the solution.

I normally do NOT post assembly code unless assembly code is the only solution. Back when I posted Qsort, did I use any assembly code so that I can boast a faster speed?

roadrunner17
September 13th, 2002, 04:38 PM
Originally posted by Yves M

What are you trying to do with the bits that are set ? Most of the answer will depend on this....


I'm writing a bit class that will hold a position of items in a 2d array. It contains an array of 12 integers. Within another class there will be an array of 238 of this bitset class. I plan to use the normal set operations (union, intersection, diference) on various combinations of these arrays of sets and then traverse through the bits in each set (which represents a location in a 2d array).

The 32 bit word can have more than 1 bit set, but I perform an operation that will isolate each bit (it will loop once for each bit set in the int) and couldn't figure out an effecient way to determine it's position. It seems that using BSF is the fastest way.

I'm a little rusty on PC assembler, I have a question... you don't have to save the contents of register EAX before mov(ing) a value to it, or will the compiler take care of saving EAX if it needs to?

Thanks again for all the responses,

Paul McKenzie
September 13th, 2002, 04:46 PM
Originally posted by AnthonyMai


posting assembly language as solutions is not the issue. Obviously when some one ELSE post assembly code, it was never an issue with you. And it only becomes an issue when I do the same. Paul, you have an agenda against me. That is the issue. You sweared countless times to ignore me but every time you jump right back in to argue with me, even when you have nothing else to add to the solution.

I normally do NOT post assembly code unless assembly code is the only solution. Back when I posted Qsort, did I use any assembly code so that I can boast a faster speed? You brought this on yourself:
Compare with other people, at least you are willing to write SOME assembly code some timesI responded that in the context of a C++ forum, assembly language answers are off-topic. When a forum allows too many "off-topic" answers, it ceases to become a forum.

Regards,

Paul McKenzie

stober
September 13th, 2002, 06:09 PM
In all your postings, did anyone think of doing it the easy way, like this:

struct bits
{
unsigned long bit_1:1;
unsigned long bit_2:1;
unsigned long bit_3:1;
unsigned long bit_4:1;
unsigned long bit_5:1;
unsigned long bit_6:1;
unsigned long bit_7:1;
unsigned long bit_8:1;
unsigned long bit_9:1;
unsigned long bit_10:1;
unsigned long bit_11:1;
unsigned long bit_12:1;
unsigned long bit_13:1;
unsigned long bit_14:1;
unsigned long bit_15:1;
unsigned long bit_16:1;
unsigned long bit_17:1;
unsigned long bit_18:1;
unsigned long bit_19:1;
unsigned long bit_21:1;
unsigned long bit_22:1;
unsigned long bit_23:1;
unsigned long bit_24:1;
unsigned long bit_25:1;
unsigned long bit_26:1;
unsigned long bit_27:1;
unsigned long bit_28:1;
unsigned long bit_29:1;
unsigned long bit_30:1;
unsigned long bit_31:1;
unsigned long bit_32:1;
};

typedef union
{

struct bits bits;
unsigned long word;
} WORDBITS;


int main(int argc, char* argv[])
{
WORDBITS something;
something.word = 0;
something.bits.bit_10 = 1;
return 0;
}