
  1. #1
    Join Date
    Nov 2013
    Posts
    3

    A better algorithm

    Hi,

    Sorry if this question has been asked before, but I need some help; I have been stuck on this problem for more than a year. The problem is as follows:
    I have three arrays A, B, C of the same size, say n, and I need to find all triples (i, j, k) such that A[i] + B[j] = C[k], in O(n^2) time. I found one solution:
    Step 1:
    Insert every sum A[i] + B[j] into a hash table, with the sum as the key and (i, j) as the value. This gives n^2 keys and values.
    Step 2:
    Traverse all C[k] and probe the hash table. If a match is found, (i, j, k) is one triple, and so on.
    This looks like an O(n^2) solution (sketched below), but can it be done better, or at least solved in O(n^2) with a different approach?
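    A minimal sketch of the two steps above (my own illustration, assuming int values; a multimap is used so that equal sums from different (i, j) pairs are all kept):

    Code:
    #include <tuple>
    #include <unordered_map>
    #include <utility>
    #include <vector>

    // Step 1: map every sum A[i] + B[j] to the index pair (i, j) that produced it.
    // Step 2: probe the map with every C[k].
    std::vector<std::tuple<int, int, int>>
    findTriples(const std::vector<int>& A, const std::vector<int>& B,
                const std::vector<int>& C)
    {
        std::unordered_multimap<int, std::pair<int, int>> sums;
        for (int i = 0; i < (int)A.size(); ++i)
            for (int j = 0; j < (int)B.size(); ++j)
                sums.emplace(A[i] + B[j], std::make_pair(i, j));

        std::vector<std::tuple<int, int, int>> result;
        for (int k = 0; k < (int)C.size(); ++k) {
            auto range = sums.equal_range(C[k]);
            for (auto it = range.first; it != range.second; ++it)
                result.emplace_back(it->second.first, it->second.second, k);
        }
        return result;
    }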

    Regards,
    Arka

  2. #2
    Join Date
    Jul 2013
    Posts
    576

    Re: A better algorithm

    Quote Originally Posted by arka.sharma View Post
    This looks like an O(n^2) solution.
    Say you instead insert the C values into the hash table. Then you generate the A+B sums in a pair of nested loops and check which sums are in the table. That's also O(n^2) but with a much smaller hash table.

    An additional advantage is that you can now chip away at the O(n^2) work by running the outer loop in parallel. With n cores the running time would come down to roughly O(n). A sketch of this variant follows at the end of this post.

    Maybe sorting the arrays could lower the complexity, but I'm not certain. At least it's a lead.
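    A rough sketch of what I mean (my own illustration, assuming int values and distinct C values; the OpenMP pragmas are just one way to parallelize the outer loop and are ignored if OpenMP isn't enabled):

    Code:
    #include <tuple>
    #include <unordered_map>
    #include <vector>

    // Hash the n values of C instead of the n^2 sums, then scan all sums.
    std::vector<std::tuple<int, int, int>>
    findTriplesSmallTable(const std::vector<int>& A, const std::vector<int>& B,
                          const std::vector<int>& C)
    {
        std::unordered_map<int, int> cIndex;   // value -> k (C assumed duplicate-free)
        for (int k = 0; k < (int)C.size(); ++k)
            cIndex[C[k]] = k;

        std::vector<std::tuple<int, int, int>> result;
        #pragma omp parallel for
        for (int i = 0; i < (int)A.size(); ++i) {
            for (int j = 0; j < (int)B.size(); ++j) {
                auto it = cIndex.find(A[i] + B[j]);
                if (it != cIndex.end()) {
                    #pragma omp critical       // naive guard on the shared result vector
                    result.emplace_back(i, j, it->second);
                }
            }
        }
        return result;
    }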
    Last edited by razzle; November 30th, 2013 at 06:45 PM.

  3. #3
    Join Date
    Nov 2013
    Posts
    3

    Re: A better algorithm

    Thanks for your reply, and inserting the C values into the hash table is a nice suggestion since it requires a much smaller table. But by "better" I mean less than O(n^2), say O(n log n); your solution is also O(n^2).

    Regards,
    Arka

  4. #4
    Join Date
    Jul 2013
    Posts
    576

    Re: A better algorithm

    Quote Originally Posted by arka.sharma View Post
    as your solution is also O(n^2)
    I know that and I said so. Still my solution gives a much smaller hash table and is much easier to parallelize, and that's not too bad.

    I've given this more thought and considered sorting the arrays as a way to reduce complexity, but unfortunately I couldn't come up with anything. Sorting may improve speed, but how much depends on the values stored in A, B and C, so it's not a complexity reduction.

    In principle, sorting makes it possible to discard in advance all A+B sums that lie outside the range of the C values. This takes O(n log n) to set up. But if all A+B sums fall within the C range, nothing is gained. A sketch of that pruning step is below.

    So the algorithm is still O(n*n), and to me it very much seems like that's the best that can be accomplished. I'll have to leave it at that.
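    Roughly the setup stage I have in mind (a sketch only; it just needs the min/max of B and C, and it assumes non-empty arrays of ints):

    Code:
    #include <algorithm>
    #include <vector>

    // Drop every A[i] that cannot take part in any sum A[i] + B[j]
    // falling inside [min(C), max(C)]; B can be pruned the same way.
    std::vector<int> pruneA(std::vector<int> A, const std::vector<int>& B,
                            const std::vector<int>& C)
    {
        const auto [bMinIt, bMaxIt] = std::minmax_element(B.begin(), B.end());
        const auto [cMinIt, cMaxIt] = std::minmax_element(C.begin(), C.end());
        const int bMin = *bMinIt, bMax = *bMaxIt;
        const int cMin = *cMinIt, cMax = *cMaxIt;

        // A[i] is useless if even the smallest B[j] pushes the sum above max(C),
        // or even the largest B[j] leaves it below min(C).
        A.erase(std::remove_if(A.begin(), A.end(),
                               [&](int a) { return a + bMin > cMax || a + bMax < cMin; }),
                A.end());
        return A;
    }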
    Last edited by razzle; December 5th, 2013 at 10:36 PM.

  5. #5
    Join Date
    Nov 2013
    Posts
    3

    Re: A better algorithm

    One more advantage of inserting C[k] into the hash table is that there will be no duplicate keys, since none of the arrays contains duplicates, whereas A[i] + B[j] for different (i, j) pairs might produce the same value and collide when inserted into the hash table. So it's a nice improvement. Thanks razzle.

  6. #6
    Join Date
    Jul 2013
    Posts
    576

    Re: A better algorithm

    ---
    Last edited by razzle; December 9th, 2013 at 08:45 AM.

  7. #7
    Join Date
    Oct 2008
    Posts
    1,456

    Re: A better algorithm

    Regarding the optimal algorithm complexity: clearly you can always find a worst-case input where the number of (i, j, k) such that C[k] = A[i] + B[j] scales as n^2, so their mere enumeration implies an algorithm no better than O(n^2). Regarding the average case, you would have to specify the probability distribution of the input more precisely in order to give an answer.

    Anyway, assuming A and B are sorted, non-negative and without duplicates, an algorithm that doesn't require a hash table consists of traversing, for every k and starting from the largest A[i] <= C[k], the strip of sums A[i] + B[j] closest to C[k]; this can be done in O(n), hence again giving an overall complexity of O(n^2). A sketch is below.
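    My reading of that traversal, as a two-pointer sketch (A and B assumed sorted ascending, duplicate-free ints):

    Code:
    #include <tuple>
    #include <vector>

    // For each C[k], walk one pointer down A (starting from the largest A[i])
    // and one pointer up B, emitting every pair whose sum equals C[k].
    std::vector<std::tuple<int, int, int>>
    findTriplesSorted(const std::vector<int>& A, const std::vector<int>& B,
                      const std::vector<int>& C)
    {
        std::vector<std::tuple<int, int, int>> result;
        for (int k = 0; k < (int)C.size(); ++k) {
            int i = (int)A.size() - 1;
            int j = 0;
            while (i >= 0 && j < (int)B.size()) {
                const int sum = A[i] + B[j];
                if (sum == C[k]) { result.emplace_back(i, j, k); --i; ++j; }
                else if (sum > C[k]) --i;   // sum too big: move to a smaller A[i]
                else ++j;                   // sum too small: move to a bigger B[j]
            }
        }
        return result;
    }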

    BTW, if you're just interested in knowing, for every k, how many pairs (i, j) satisfy C[k] = A[i] + B[j], and you're willing to trade space for speed, then this can be computed via a convolution which, in turn, can be computed in O(n log n) with the help of the FFT. This information could also be used to speed up the full O(n^2) algorithm when n becomes very big...
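    A sketch of that counting idea (my own illustration: histogram the values of A and B, convolve the histograms with a textbook FFT, then read off the count for each C[k]; it assumes non-negative integer values, and the cost actually scales with the value range rather than with n):

    Code:
    #include <algorithm>
    #include <cmath>
    #include <complex>
    #include <vector>

    using cd = std::complex<double>;

    // Textbook recursive Cooley-Tukey FFT; a.size() must be a power of two.
    void fft(std::vector<cd>& a, bool invert)
    {
        const std::size_t n = a.size();
        if (n == 1) return;
        std::vector<cd> even(n / 2), odd(n / 2);
        for (std::size_t i = 0; 2 * i < n; ++i) { even[i] = a[2 * i]; odd[i] = a[2 * i + 1]; }
        fft(even, invert);
        fft(odd, invert);
        const double pi = std::acos(-1.0);
        const double ang = 2 * pi / n * (invert ? -1 : 1);
        cd w(1), wn(std::cos(ang), std::sin(ang));
        for (std::size_t i = 0; 2 * i < n; ++i) {
            a[i] = even[i] + w * odd[i];
            a[i + n / 2] = even[i] - w * odd[i];
            if (invert) { a[i] /= 2; a[i + n / 2] /= 2; }
            w *= wn;
        }
    }

    // counts[k] = number of pairs (i, j) with A[i] + B[j] == C[k].
    std::vector<long long> countPairs(const std::vector<int>& A,
                                      const std::vector<int>& B,
                                      const std::vector<int>& C)
    {
        // Histograms of the (non-negative) values in A and B.
        std::vector<cd> ha(*std::max_element(A.begin(), A.end()) + 1);
        std::vector<cd> hb(*std::max_element(B.begin(), B.end()) + 1);
        for (int a : A) ha[a] += 1;
        for (int b : B) hb[b] += 1;

        // Pad to a power of two large enough to hold the full convolution.
        std::size_t sz = 1;
        while (sz < ha.size() + hb.size()) sz <<= 1;
        ha.resize(sz);
        hb.resize(sz);

        // Convolution: after the inverse FFT, ha[s] counts pairs with sum s.
        fft(ha, false);
        fft(hb, false);
        for (std::size_t i = 0; i < sz; ++i) ha[i] *= hb[i];
        fft(ha, true);

        std::vector<long long> counts;
        counts.reserve(C.size());
        for (int c : C)
            counts.push_back(c >= 0 && c < (int)sz ? std::llround(ha[c].real()) : 0);
        return counts;
    }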

  8. #8
    Join Date
    Apr 2000
    Location
    Belgium (Europe)
    Posts
    4,626

    Re: A better algorithm

    If you are allowed to sort A, B and C then yes, you can maybe find this in fewer steps by eliminating impossible values in A, B and C.
    It's technically possible without sorting as well, but the elimination phase will be more complex.

    Additionally, with the tables sorted, you can bail out early at each possible step.

    This will be more complex, so it will only pay off for sufficiently large values of n.

  9. #9
    Join Date
    Apr 2000
    Location
    Belgium (Europe)
    Posts
    4,626

    Re: A better algorithm

    You can reduce O(n²) to O((n-m)²) (with m in [0..n]) with a setup stage. With a large value of n, even a small value of m can make a huge difference in runtime.


    The problem's complexity can't be reduced any further, but you can bring the actual work down to less than a flat n² (or (n-m)²) operations.
    Technically, since you can't guarantee that any reduction is even possible, the complexity of this problem remains the original O(n²).


    Just because a problem is O(anything) doesn't mean its runtime is necessarily proportional to that 'anything'. It gives you a rough guide/estimate of the worst-case execution time.

  10. #10
    Join Date
    Oct 2008
    Posts
    1,456

    Re: A better algorithm

    Flames and weak arguments... this is how you react to any form of debate. I genuinely asked you to explain your idea of "average" and why it should supersede its more common statistical meaning, but your aversion to any form of perceived authority sadly won again.
    Last edited by superbonzo; December 16th, 2013 at 06:06 AM.

  11. #11
    Join Date
    Jul 2013
    Posts
    576

    Re: A better algorithm

    Quote Originally Posted by superbonzo View Post
    Flames and weak arguments... this is how you react to any form of debate. I genuinely asked you to explain your idea of "average" and why it should supersede its more common statistical meaning, but your aversion to any form of perceived authority sadly won again.
    Come on superbonzo.

    I just don't buy your single-minded idea that probability theory is reality. It isn't. It's just a mathematical tool, and it's not even always called for.

    That became obvious in the "escape the waterfront" discussion we had. It can be solved perfectly fine without any probability theory at all. When you explain why it cannot, and why probability theory is necessary, I'll listen to you. And please leave that compass out of the explanation, will you.

    And even in this thread your strict adherence to probability theory has led you wrong.

    YOU CLAIMED THE SPECIFIC INPUT OF THE OP WOULD REVEAL THE AVERAGE COMPLEXITY OF THE ALGORITHM.

    This is wrong regardless of the definition of average. You are seriously wrong here superbonzo. Simple as that.

    Furthermore, what you call "modern" probability theory is just the same old probability theory, only it has overcome some of its shortcomings. Please notify me when it has overcome all of its shortcomings, will you!
    Last edited by razzle; December 27th, 2013 at 08:41 PM.

  12. #12
    Join Date
    Oct 2008
    Posts
    1,456

    Re: A better algorithm

    Just for curiosity's sake, where did I even vaguely suggest that "probability theory is reality" (whatever that means)? Where did I write that the "specific input reveals the average complexity" (whatever that means)? Or that "'modern' probability theory [is not/is/?] the same old probability theory" (whatever that means)? And above all, why is all this nonsense relevant to this discussion?

    And no, using capital letters will not make your arguments any less weak. Frankly, I asked you twice to explain what your enlightened definition of "average" complexity is, and how and why it could or should be made independent of any kind of probabilistic analysis, and the only answer I received has been a rhetorical misstatement of my own words... if this is how you realize yourself then fine, no problem man, and happy new year!
