counting consecutive null characters

Printable View

Show 50 post(s) from this thread on one page

April 15th, 2010, 09:13 AM
superbonzo

Re: counting consecutive null characters

taken by a sudden spirit of rebellion I reread the paragraph on for_each and it (implicitly) allows side effects on its functor argument... mah... :confused:

therefore the following code should be standard correct

Code:

int main() { char data[] = {0, 1, 2, 3, 0, 0, 4 ,5 ,0, 0, 0, 6, 7}; int count = 0; int largest = 0; std::for_each( data, data + 13, [&]( char value ) { if( value ) count = 0; else largest = std::max( largest, ++count ); } ); }

a presumably correct alternative with accumulate would be

Code:

int main() { char data[] = {0, 1, 2, 3, 0, 0, 4 ,5 ,0, 0, 0, 6, 7}; struct MyCounter { MyCounter(): value(0), count(0) {} MyCounter( int value, int count ): value(value), count(count) {} MyCounter operator+( char a_char ) { if( a_char ) count = 0; else ++count; return MyCounter( std::max( value, count ), count ); } int value; int count; }; MyCounter count = std::accumulate( data, data + 13, MyCounter() ); }
April 15th, 2010, 09:38 AM
hoxsiew

Re: counting consecutive null characters

Why all the hoopla for such simple task:

Code:

int main() { #define BUFSZ 13 int CurRun=0,MaxRun=0; unsigned char buf[BUFSZ]={0,1,2,3,0,0,4,5,0,0,0,6,7}; for(size_t x=0;x<BUFSZ;x++){ buf[x]?(CurRun>MaxRun?MaxRun=CurRun:0,CurRun=0):CurRun++; } printf("Max: %d\n",MaxRun); return 0; }

(off the top of my head. It could probably be reduced even more).
April 15th, 2010, 09:52 AM
laserlight

Re: counting consecutive null characters

Quote:

Originally Posted by hoxsiew

Why all the hoopla for such simple task:

That looks like roughly the same algorithm as what JohnW@Wessex proposed in post #5, except that your implementation is more obfuscated. The posts in between have been attempts to answer (and discuss answers to) the original question by proposing things in the standard library beyond std::string and std::max.
April 15th, 2010, 02:13 PM
VladimirF

Re: counting consecutive null characters

Quote:

Originally Posted by aryan1

Performance is my first priority while implementing this task.

Looks like this thread departed from stated “first priority”.
There are some obvious algorithmic optimizations to this process.
For example, in JohnW@Wessex’s code when you find a beginning of sequence of zeroes, you can first check if an element at *(begin + maxCount) position is 0; if it is – keep checking for all zeroes, but if it isn’t – you can adjust your current position to (begin + maxCount) skipping maxCount elements.
Then, for larger maxCount, you could replace byte-by-byte compare with 4-bytes int compare to 0 (taking care of alignment, of course). Depending on expected maxCount, this could be very beneficial.
If the performance's priority is really high, we could keep going…
April 15th, 2010, 03:03 PM
monarch_dodra

Re: counting consecutive null characters

Quote:

Originally Posted by VladimirF

Looks like this thread departed from stated “first priority”.
There are some obvious algorithmic optimizations to this process.
For example, in JohnW@Wessex’s code when you find a beginning of sequence of zeroes, you can first check if an element at *(begin + maxCount) position is 0; if it is – keep checking for all zeroes, but if it isn’t – you can adjust your current position to (begin + maxCount) skipping maxCount elements.
Then, for larger maxCount, you could replace byte-by-byte compare with 4-bytes int compare to 0 (taking care of alignment, of course). Depending on expected maxCount, this could be very beneficial.
If the performance's priority is really high, we could keep going…

A binary search implementation of the "find the longest chain of a item" algorithm. Sounds like a challenge!
April 15th, 2010, 03:11 PM
nuzzle

Re: counting consecutive null characters

Quote:

Originally Posted by aryan1

what I want is to obtain the size of the "largest" sequence of consecutive NULL bytes,

Isn't an optimization possible here?

Once you've found a null sequence of length N you only have to check whether every N+1'th byte is null after that (but when you do find a null you need to look both backwards and forwards to establish the length of that particular null sequence). Still the longer the null sequence you already have the faster it becomes to check the rest of the bytes because you can increase the stride.
April 15th, 2010, 03:17 PM
Lindley

Re: counting consecutive null characters

Quote:

Originally Posted by JohnW@Wessex

That looks a bit over complicated. How about...

Code:

#include <string> #include <algorithm> int checkConNullBytes(const std::string &dataStream) { int maxCount = 0; int count = 0; std::string::const_iterator begin = dataStream.begin(); std::string::const_iterator end = dataStream.end(); while (begin != end) { if (*begin == 0) { ++count; maxCount = std::max(maxCount, count); } else { count = 0; } ++begin; } return maxCount; } int main() { char data[] = {0, 1, 2, 3, 0, 0, 4 ,5 ,0, 0, 0, 6, 7}; std::string dataStream(data, data + 13); int largest = checkConNullBytes(dataStream); }

If you expect the string to have mostly 0s, you should put the std::max call in the else rather than the if (prior to resetting count). If you expect the 0s to be fairly rare, it'll be faster as-is.
April 15th, 2010, 03:21 PM
laserlight

Re: counting consecutive null characters

Quote:

Originally Posted by nuzzle

Once you've found a null sequence of size N you only have to check whether every N+1'th byte is null after that (but when you find a null you need to look both backwards and forwards to establish the size of that particular null sequence). Still the larger the null sequence you already have the faster it becomes to check the rest of the bytes because you can increase the stride.

That has the taste of Boyer-Moore :)
April 15th, 2010, 03:26 PM
VladimirF

Re: counting consecutive null characters

Quote:

Originally Posted by monarch_dodra

A binary search implementation of the "find the longest chain of a item" algorithm. Sounds like a challenge!

I keep hinting to a programming challenge for awhile, with no support. Maybe to subtle?
If we only had sizeable data file with bunch of zeroes in it for benchmark… Would anybody accept such challenge?
April 15th, 2010, 03:29 PM
VladimirF

Re: counting consecutive null characters

Quote:

Originally Posted by nuzzle

Isn't an optimization possible here?

Once you've found a null sequence of length N you only have to check whether every N+1'th byte is null after that (but when you find a null you need to look both backwards and forwards to establish the length of that particular null sequence). Still the longer the null sequence you already have the faster it becomes to check the rest of the bytes because you can increase the stride.

I am trying to find the difference between your suggestion an my post # 19 above. Don’t see it :)
April 15th, 2010, 03:39 PM
monarch_dodra

Re: counting consecutive null characters

Quote:

Originally Posted by VladimirF

I keep hinting to a programming challenge for awhile, with no support. Maybe to subtle?
If we only had sizeable data file with bunch of zeroes in it for benchmark… Would anybody accept such challenge?

Why not? I'm already doing it anyways.

Actually, I'm implementing a different solution for each iterators type. You wouldn't want to skip too far ahead if you have input iterators...
April 15th, 2010, 04:27 PM
nuzzle

Re: counting consecutive null characters

Quote:

Originally Posted by VladimirF

I am trying to find the difference between your suggestion an my post # 19 above. Don’t see it :)

There were so many replies I just browsed them before I replied. But now when I look at your suggestion I can see it's based on the same idea.

One difference thought seems to be that you make a longer jump forward only after you've found a zero. In my version you switch to N+1 increments for the remainder of the search when you've found a sequence of length N. If you have found a null sequence of 10 bytes you only need to check every 11'th byte ever after.
April 15th, 2010, 04:37 PM
Lindley

Re: counting consecutive null characters

Hmm, except that if you *do* find a null on one of those jumps, you'll need to look backward to find the start of the sequence. Not a problem for bidirectional iterators, but for forward iterators it would be.
April 15th, 2010, 04:55 PM
nuzzle

Re: counting consecutive null characters

Quote:

Originally Posted by Lindley

Hmm, except that if you *do* find a null on one of those jumps, you'll need to look backward to find the start of the sequence.

Yes I mentioned that in #21.

Quote:

Not a problem for bidirectional iterators, but for forward iterators it would be.

With a little effort you can make almost everything not work.
April 15th, 2010, 04:58 PM
Lindley

Re: counting consecutive null characters

I suppose for forward iterators you're stuck looking at everything anyway, so you may as well do a simpler algorithm. Bidirectional would depend on the relative speed of conditionals versus traversal. It's a good optimization for random access iterators though.

Show 50 post(s) from this thread on one page

All times are GMT -5. The time now is 10:22 AM.