Re: how to detect NULL bytes in a char array ?
If you only have to find out whether there is a NULL,
you could do something like the following:
char buffer[1000000 + 1] = {'\0'}; // one spare byte, so strlen always finds a terminator
....
bool detected = strlen(buffer) < 1000000;
I think that initialization is not a costly affair.
Re: how to detect NULL bytes in a char array ?
Whichever method you choose, under the hood it's going to look at one character at a time looking for NULL. There's no other way to do it. Your original for loop is going to be just as efficient as anything else.
Re: how to detect NULL bytes in a char array ?
Quote:
Originally Posted by
JohnW@Wessex
Point taken, but I worked on the assumption that this is a very simplified example to demonstrate the algorithm for finding nulls, not robust file handling.
Fair enough - your solution did answer what the OP was asking for. I just thought I'd point it out though... just for completeness.
Re: how to detect NULL bytes in a char array ?
Quote:
Originally Posted by
GCDEF
Whichever method you choose, under the hood it's going to look at one character at a time looking for NULL. There's no other way to do it. Your original for loop is going to be just as efficient as anything else.
Oh, I'm sure I could come up with something less efficient if I tried....
Re: how to detect NULL bytes in a char array ?
Quote:
Originally Posted by
Lindley
Oh, I'm sure I could come up with something less efficient if I tried....
I'm sure you could, but I doubt any of the other methods mentioned would be any more efficient than the OP's for loop, which IIRC is what is being asked here.
Re: how to detect NULL bytes in a char array ?
Seriously though, while probably no better in terms of straight efficiency, there are good arguments for preferring std::find() to a hand-written loop. For one thing, it more explicitly self-documents what you're trying to do.
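For reference, a minimal sketch of the std::find() version. The helper name contains_null is made up for illustration, and the buffer is assumed to be a plain char array of known size:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper (name not from the thread): returns true if any of
// the first `size` bytes of `buffer` is '\0'.
bool contains_null(const char* buffer, std::size_t size)
{
    assert(buffer != nullptr || size == 0);
    const char* end = buffer + size;
    // std::find returns `end` when nothing matched; otherwise it stops at
    // the first match and never reads the rest of the buffer.
    return std::find(buffer, end, '\0') != end;
}
```

Called as contains_null(buffer, sizeof buffer) on an array, it reads much like the sentence "does this range contain a null?".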
Re: how to detect NULL bytes in a char array ?
Quote:
Originally Posted by
Lindley
Seriously though, while probably no better in terms of straight efficiency, there are good arguments for preferring std::find() to a hand-written loop. For one thing, it more explicitly self-documents what you're trying to do.
I guess, although the original loop is about as basic as it gets.
Re: how to detect NULL bytes in a char array ?
Quote:
Originally Posted by
Lindley
Seriously though, while probably no better in terms of straight efficiency, there are good arguments for preferring std::find() to a hand-written loop. For one thing, it more explicitly self-documents what you're trying to do.
Well, what with the upcoming new micro-threading libraries, there are also better chances for a "find" to be threaded than a simple for loop.
There are so many little optimizations possible under the hood with stl algorithms, it is really a good idea to use them.
For example:
copy will typically call memmove if objects are POD
inplace_merge will attempt to create a buffer for moving stuff around
sort... will just plain be efficient.
etc.
Even if your for loop gets as dumb as it gets, there are always micro (and not-so-micro) gains.
In the OP's case, the loop would have read the ENTIRE buffer, even after found had been set to true. find would have stopped at the first null, at no extra branching cost.
Re: how to detect NULL bytes in a char array ?
Quote:
Originally Posted by
GCDEF
I'm sure you could, but I doubt any of the other methods mentioned would be any more efficient than the OP's for loop, which IIRC is what is being asked here.
Right, the methods mentioned in this thread might not be more efficient.
And for just 1,000,000 bytes there might be no point in talking about efficiency at all.
But with 200,000,000 we are getting into “measurable” time – about 125ms.
Looking at this as a sport (or academic challenge), we can save 10% by replacing byte compare with int bitmask tests:
Code:
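// Assumes pBuffer is int-aligned, int is 4 bytes, and arr_size is a multiple of 4;
// any trailing bytes would need a separate byte-wise pass.
const int* p = (const int*)pBuffer;
//count the NULL characters in the buffer, one int (4 bytes) at a time
for(int i = 0; i < arr_size/4; i++, p++)
{
    if(!(*p & 0xFF000000))
        count++;
    if(!(*p & 0x00FF0000))
        count++;
    if(!(*p & 0x0000FF00))
        count++;
    if(!(*p & 0x000000FF))
        count++;
}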
Further micro-optimization (like loop unrolling) made no measurable difference. Looks like the optimizing compiler somehow knew that it wouldn’t :)
There are still two ways this code can be improved (in terms of efficiency): use multiple threads and utilize SSE.
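As a side note on the word-at-a-time idea, here is a hedged sketch of a related, well-known bit trick that tests a whole 32-bit word for a zero byte in one expression. It only detects a null rather than counting them, so it is not a drop-in replacement for the loop above. The function names are made up for illustration, and memcpy is used instead of a pointer cast to sidestep alignment and aliasing questions:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// True if at least one of the four bytes of v is zero.
// Subtracting 1 from each byte borrows into bit 7 only for bytes that were
// zero (or had bit 7 set already, which "& ~v" filters out).
inline bool word_has_zero_byte(std::uint32_t v)
{
    return ((v - 0x01010101u) & ~v & 0x80808080u) != 0;
}

// Hypothetical helper: detects (does not count) a '\0' in the buffer,
// testing four bytes per iteration and finishing the tail byte by byte.
bool buffer_has_null(const char* buffer, std::size_t size)
{
    assert(buffer != nullptr || size == 0);
    std::size_t i = 0;
    for (; i + 4 <= size; i += 4) {
        std::uint32_t word;
        std::memcpy(&word, buffer + i, sizeof word); // safe for any alignment
        if (word_has_zero_byte(word))
            return true;
    }
    for (; i < size; ++i) {
        if (buffer[i] == '\0')
            return true;
    }
    return false;
}
```

Whether this beats a plain loop or memchr() on a given compiler and machine would have to be measured; it is shown only to illustrate the technique.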
Re: how to detect NULL bytes in a char array ?
Quote:
Originally Posted by
VladimirF
Right, the methods mentioned in this thread might not be more efficient.
And for just 1,000,000 bytes there might be no point in talking about efficiency at all.
But with 200,000,000 we are getting into “measurable” time – about 125ms.
Looking at this as a sport (or academic challenge), we can save 10% by replacing byte compare with int bitmask tests:
Code:
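// Assumes pBuffer is int-aligned, int is 4 bytes, and arr_size is a multiple of 4;
// any trailing bytes would need a separate byte-wise pass.
const int* p = (const int*)pBuffer;
//count the NULL characters in the buffer, one int (4 bytes) at a time
for(int i = 0; i < arr_size/4; i++, p++)
{
    if(!(*p & 0xFF000000))
        count++;
    if(!(*p & 0x00FF0000))
        count++;
    if(!(*p & 0x0000FF00))
        count++;
    if(!(*p & 0x000000FF))
        count++;
}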
Further micro-optimization (like loop unrolling) made no measurable difference. Looks like the optimizing compiler somehow knew that it wouldn’t :)
There are still two ways this code can be improved (in terms of efficiency): use multiple threads and utilize SSE.
Pretty sure it is IO bound anyways, given the origin is supposed to be an fstream. No matter what you do, it'll be starved for data.
If all the data is placed in memory before starting... Then I still think the program is IO bound to the slow memory (compared to what the processor does).
Maybe if the buffer was on the stack, but in this case, you can't have it big enough to be interesting.
My personal conclusion is that a good ol' loop (or std::find/std::count) will get the job done at 99.99% of max efficiency.
Re: how to detect NULL bytes in a char array ?
Quote:
Originally Posted by
monarch_dodra
My personal conclusion is that a good ol' loop (or std::find/std::count) will get the job done at 99.99% of max efficiency.
Not if you compare it with SSE. Check out this link, for example.
In my quick-and-dirty test I got this zeros counting in half the time (on in-memory data, of course).
Re: how to detect NULL bytes in a char array ?
That link is rather intimidating....
Still, good call on the memchr() function, I'd forgotten about that one. It does precisely what the OP wants, no fuss.
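A minimal sketch of the memchr() approach, for anyone who wants to see it spelled out. The wrapper name first_null is made up for illustration; memchr itself is the standard C library function:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical wrapper: returns a pointer to the first '\0' within the
// first `size` bytes of `buffer`, or nullptr if there is none.
const char* first_null(const char* buffer, std::size_t size)
{
    assert(buffer != nullptr);
    // memchr takes an explicit length, so unlike strlen it never reads
    // past the end of the buffer when no null is present.
    return static_cast<const char*>(std::memchr(buffer, '\0', size));
}
```

A non-null result answers the OP's yes/no question, and subtracting buffer from it gives the offset of the first null.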
Re: how to detect NULL bytes in a char array ?
Quote:
Originally Posted by
GCDEF
Whichever method you choose, under the hood it's going to look at one character at a time looking for NULL. There's no other way to do it. Your original for loop is going to be just as efficient as anything else.
That's not quite true. It's possible to look for many characters at a time. Most processors today have multiple cores so it's possible to perform true concurrent searches.
Re: how to detect NULL bytes in a char array ?
Quote:
Originally Posted by
nuzzle
That's not quite true. It's possible to look for many characters at a time. Most processors today have multiple cores so it's possible to perform true concurrent searches.
Exactly. One thread per core will get you the best result.
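A rough sketch of what "one thread per core" could look like with std::thread, assuming the whole buffer is already in memory. The function name count_nulls_parallel and the simple equal-chunk split are illustrative choices, not something posted in this thread, and whether it actually beats a single-threaded pass depends on memory bandwidth:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: counts '\0' bytes using one worker thread per
// hardware thread, each scanning its own contiguous chunk.
std::size_t count_nulls_parallel(const char* buffer, std::size_t size)
{
    assert(buffer != nullptr || size == 0);
    unsigned hw = std::thread::hardware_concurrency(); // may report 0 if unknown
    std::size_t num_threads = hw ? hw : 1;
    // Never start more threads than there are bytes to scan.
    num_threads = std::min<std::size_t>(num_threads, std::max<std::size_t>(size, 1));

    std::vector<std::size_t> partial(num_threads, 0);
    std::vector<std::thread> workers;
    const std::size_t chunk = size / num_threads;

    for (std::size_t t = 0; t < num_threads; ++t) {
        const char* first = buffer + t * chunk;
        // The last worker also takes the remainder of the division.
        const char* last = (t + 1 == num_threads) ? buffer + size : first + chunk;
        workers.emplace_back([first, last, &partial, t] {
            partial[t] = static_cast<std::size_t>(std::count(first, last, '\0'));
        });
    }
    for (std::thread& w : workers)
        w.join();

    std::size_t total = 0;
    for (std::size_t p : partial)
        total += p;
    return total;
}
```

Each worker writes only its own slot of partial, so no locking is needed; the results are summed after all threads have been joined.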