A better algorithm
Page 1 of 2
Results 1 to 15 of 21

Thread: A better algorithm

  1. #1
    Join Date
    Nov 2013
    Posts
    3

    A better algorithm

    Hi,

    Sorry if this question has been asked before, but I need some help: I have been stuck on this for more than a year. The problem is as follows.
    I have three arrays A, B and C of the same size, say n, and I need to find all (i,j,k) such that A[i]+B[j]=C[k]. We
    need to do this in O(n^2) time, for which I found the following solution:
    Step 1:
    Insert every A[i]+B[j] as a key, with (i,j) as its value, into a hash table. This gives n^2 key/value entries.
    Step 2:
    Traverse all C[k] and probe the hash table. If a match is found, (i,j,k) is one triple, and so on.
    This looks like an O(n^2) solution. But can it be done better, or at least in O(n^2) with a different approach?
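The two steps above can be sketched roughly as follows. This is only an illustration under assumptions: the function name `triples_via_sum_table` is made up here, and each key stores a list of (i, j) pairs because different pairs can produce the same sum.

```python
from collections import defaultdict


def triples_via_sum_table(A, B, C):
    """Return all (i, j, k) with A[i] + B[j] == C[k] using a table of pair sums."""
    # Step 1: map every sum A[i] + B[j] to the index pairs that produce it.
    # This builds up to n^2 entries.
    pairs_by_sum = defaultdict(list)
    for i, a in enumerate(A):
        for j, b in enumerate(B):
            pairs_by_sum[a + b].append((i, j))

    # Step 2: probe the table once for every C[k].
    result = []
    for k, c in enumerate(C):
        for i, j in pairs_by_sum.get(c, ()):
            result.append((i, j, k))
    return result
```

The table dominates the memory use here, since it holds one entry per (i, j) pair.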

    Regards,
    Arka

  2. #2
    Join Date
    Jul 2013
    Posts
    168

    Re: A better algorithm

    Quote (originally posted by arka.sharma):
    This looks like an O(n^2) solution.
    Say you instead insert the C values into the hash table. Then you generate the A+B sums in a pair of nested loops and check which sums are in the table. That's also O(n^2) but with a much smaller hash table.
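A minimal sketch of this variant, assuming C contains no duplicate values (the function name `triples_via_c_table` is hypothetical):

```python
def triples_via_c_table(A, B, C):
    """Return all (i, j, k) with A[i] + B[j] == C[k], hashing only the C values."""
    # Table of n entries: value -> index k. Assumes C has no duplicates;
    # with duplicates, a later index would overwrite an earlier one.
    index_of = {c: k for k, c in enumerate(C)}

    result = []
    for i, a in enumerate(A):
        for j, b in enumerate(B):
            k = index_of.get(a + b)
            if k is not None:
                result.append((i, j, k))
    return result
```

The nested loops still perform n^2 probes, but the table holds only n entries instead of n^2.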

    It has the additional advantage that you can now eat away at the O(n^2) running time somewhat by executing the outermost loop in parallel. With n cores the running time would drop to O(n), although the total work done stays O(n^2).
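One way the outer loop might be split across workers is sketched below. The names `matches_for_row` and `triples_parallel` and the `workers` parameter are assumptions for illustration. In CPython, threads will not actually speed up this CPU-bound loop because of the global interpreter lock, so the sketch only shows the decomposition: each row i is independent and the C table is read-only.

```python
from concurrent.futures import ThreadPoolExecutor


def matches_for_row(i, A, B, index_of):
    """All (i, j, k) for one fixed i; touches only read-only shared data."""
    a = A[i]
    return [(i, j, index_of[a + b]) for j, b in enumerate(B) if a + b in index_of]


def triples_parallel(A, B, C, workers=4):
    """Distribute the iterations of the outer loop over a pool of workers."""
    index_of = {c: k for k, c in enumerate(C)}  # assumes C has no duplicates
    with ThreadPoolExecutor(max_workers=workers) as pool:
        rows = pool.map(lambda i: matches_for_row(i, A, B, index_of), range(len(A)))
        # Flatten the per-row results; map() preserves the order of i.
        return [triple for row in rows for triple in row]
```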

    Maybe sorting the arrays could lower the complexity, but I'm not certain. At least it's a lead.
    Last edited by razzle; November 30th, 2013 at 05:45 PM.

  3. #3
    Join Date
    Nov 2013
    Posts
    3

    Re: A better algorithm

    Thanks for your reply and the nice suggestion to insert the C values into the hash table, as this requires a smaller hash table. But by "better" I mean less than O(n^2), let's say O(n*log n), since your solution is also O(n^2).

    Regards,
    Arka

  4. #4
    Join Date
    Apr 2000
    Location
    Belgium (Europe)
    Posts
    3,641

    Re: A better algorithm

    If you are allowed to sort A, B and C, then yes, you may be able to find this in fewer steps by eliminating impossible values in A, B and C.
    It is technically possible without sorting as well, but the elimination phase will be more complex.

    Additionally, by having the tables sorted, you can early-out at each possible step.

    This will be more complex, so it will only pay off for sufficiently large values of n.

  5. #5
    Join Date
    Jul 2013
    Posts
    168

    Re: A better algorithm

    Quote (originally posted by arka.sharma):
    as your solution is also O(n^2)
    I know that and I said so. Still, my solution gives a much smaller hash table and is much easier to parallelize, and that's not too bad.

    I've given this another thought and considered sorting the arrays as a way to reduce complexity, but unfortunately I couldn't come up with anything. Sorting may improve speed, but by how much will be data dependent. It will depend on the values stored in A, B and C, so it's not a complexity reduction.

    In principle, sorting makes it possible to discard in advance all A+B values that lie outside the range of the C values. This takes O(n * log n) to set up. But if all A+B values fall within the C range, nothing is gained.
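A rough sketch of that range pruning, with the assumptions that only B is sorted, that C has no duplicates, and that a binary search per A[i] selects the B values that can land inside [min(C), max(C)]. The function name `triples_with_range_pruning` is hypothetical.

```python
from bisect import bisect_left, bisect_right


def triples_with_range_pruning(A, B, C):
    """Skip every A[i] + B[j] that falls outside the range of the C values."""
    if not C:
        return []
    index_of = {c: k for k, c in enumerate(C)}  # assumes C has no duplicates
    c_min, c_max = min(C), max(C)

    # O(n log n) setup: sort B but remember the original positions.
    order = sorted(range(len(B)), key=lambda j: B[j])
    sorted_b = [B[j] for j in order]

    result = []
    for i, a in enumerate(A):
        # Only B values in [c_min - a, c_max - a] can produce a sum inside the C range.
        lo = bisect_left(sorted_b, c_min - a)
        hi = bisect_right(sorted_b, c_max - a)
        for pos in range(lo, hi):
            k = index_of.get(a + sorted_b[pos])
            if k is not None:
                result.append((i, order[pos], k))
    return result
```

When every sum already lies inside the C range, the inner loop still visits all n values of B for each A[i], so the worst case remains quadratic.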

    So the algorithm still is O(n*n) and to me it seems very much like that's the best that can be accomplished. I'll have to leave it at that.
    Last edited by razzle; December 5th, 2013 at 09:36 PM.

  6. #6
    Join Date
    Nov 2013
    Posts
    3

    Re: A better algorithm

    One more advantage of inserting the C[k] values into the hash table is that there will be no duplicate keys, since none of the arrays contains duplicates. On the other hand, a[i] + b[j] can produce the same value for different (i,j) pairs, which results in duplicate keys when inserting into the hash table. So it's a nice improvement. Thanks, razzle.

  7. #7
    Join Date
    Jul 2013
    Posts
    168

    Re: A better algorithm

    ---
    Last edited by razzle; December 9th, 2013 at 07:45 AM.

  8. #8
    Join Date
    Oct 2008
    Posts
    1,077

    Re: A better algorithm

    Regarding the optimal algorithm complexity: clearly you can always find a worst-case input where the number of (i,j,k) such that ck=ai+bj scales as n^2, so their mere enumeration implies an algorithm no better than O(n^2). Regarding the average case instead, you would need to specify the probability distribution of the input more precisely in order to give an answer.
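To make the worst-case claim concrete, here is one input family (an assumption chosen for illustration, not taken from the thread) where the number of matching triples grows quadratically even though no array contains duplicates: A = B = 0..n-1 and C holds n consecutive sums from the middle of the range.

```python
def count_triples(A, B, C):
    """Count (i, j, k) with A[i] + B[j] == C[k]; assumes C has no duplicates."""
    c_values = set(C)
    return sum(1 for a in A for b in B if a + b in c_values)


def dense_instance(n):
    """Hypothetical input family with many matches and no duplicate values."""
    A = list(range(n))
    B = list(range(n))
    C = list(range(n // 2, n // 2 + n))  # n consecutive sums around n - 1
    return A, B, C


for n in (4, 8, 100):
    print(n, count_triples(*dense_instance(n)), n * n)
```

For even n in this family, three quarters of all n^2 pairs produce a sum that appears in C, so any algorithm that lists every triple must do work proportional to n^2.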

    Anyway, assuming A and B are sorted, non-negative and without duplicates, an algorithm that doesn't require a hash table consists of traversing, for every k and starting from the biggest A[i]<=C[k], the strip of A[i]+B[j] closest to C[k]. This can be done in O(N) per k, hence again giving an overall complexity of O(N^2).
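A sketch of a closely related two-pointer traversal. It is a common variant rather than the exact walk described above: it moves upward through A and downward through B, so it does not need the non-negativity assumption. It does assume A and B are sorted ascending and duplicate-free, and the function name is made up.

```python
def triples_two_pointer(A, B, C):
    """A and B must be sorted ascending and contain no duplicates."""
    result = []
    for k, c in enumerate(C):
        i, j = 0, len(B) - 1
        # Each step moves i up or j down, so this loop takes at most 2n steps per k.
        while i < len(A) and j >= 0:
            s = A[i] + B[j]
            if s == c:
                result.append((i, j, k))
                # Without duplicates, neither A[i] nor B[j] can match again for this k.
                i += 1
                j -= 1
            elif s < c:
                i += 1  # sum too small: try a larger A value
            else:
                j -= 1  # sum too large: try a smaller B value
    return result
```

No hash table is needed, and the total cost is O(n) per C[k], or O(n^2) overall.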

    BTW, if you're just interested in knowing, for every k, how many (i,j) there are such that ck=ai+bj, and you're willing to trade space for speed, then this can be computed via a convolution that, in turn, can be computed in O(n log n) with the help of the FFT. This information could also be used to speed up the full O(N^2) algorithm when N becomes very big ...
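A small sketch of the counting idea, under strong assumptions not stated in the thread: the values are small non-negative integers, so each array can be turned into an indicator vector indexed by value. The cost is then governed by the size of the value range rather than by n alone. The textbook recursive FFT below is for illustration only, and all function names are hypothetical.

```python
import cmath


def fft(values, invert=False):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = 1 if invert else -1
    out = [0] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out


def count_pairs_per_target(A, B, C):
    """For each C[k], count the pairs (i, j) with A[i] + B[j] == C[k]."""
    if not A or not B:
        return [0] * len(C)
    size = 1
    while size < max(A) + max(B) + 1:  # room for every possible sum
        size *= 2
    fa = [0] * size
    fb = [0] * size
    for a in A:
        fa[a] += 1  # indicator vector of A, indexed by value
    for b in B:
        fb[b] += 1
    # Pointwise product in the frequency domain equals convolution of fa and fb.
    product = [x * y for x, y in zip(fft(fa), fft(fb))]
    conv = fft(product, invert=True)
    counts = [round(v.real / size) for v in conv]  # the inverse needs the 1/size factor
    return [counts[c] if 0 <= c < size else 0 for c in C]
```

Entry s of the convolution is the number of pairs whose sum is s, so the answer for each C[k] is a single lookup.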

  9. #9
    Join Date
    Jul 2013
    Posts
    168

    Re: A better algorithm

    Quote (originally posted by superbonzo):
    Regarding the average case instead, you would need to specify the probability distribution of the input more precisely in order to give an answer.
    That's not how you establish the average complexity of an algorithm. You do that by assuming average input.

  10. #10
    Join Date
    Oct 2008
    Posts
    1,077

    Re: A better algorithm

    Quote (originally posted by razzle):
    That's not how you establish the average complexity of an algorithm. You do that by assuming average input.
    ... and how do you define the "average input"? Whether it's done explicitly or not, it always boils down to the choice of a family of prior probability distributions.

  11. #11
    Join Date
    Apr 2000
    Location
    Belgium (Europe)
    Posts
    3,641

    Re: A better algorithm

    You can reduce O(n^2) to O((n-m)^2) (with m in [0..n]) with a setup stage. With a large value of n, even a small value of m can make a huge difference in runtime.


    The problem's complexity can't be reduced any further, but you can bring the actual running time below the flat square of n (or of n-m).
    Technically, since you can't guarantee that a reduction is even possible, the complexity of this problem remains the original O(n^2).


    Just because a problem is O(anything) doesn't mean its runtime is necessarily linear in that 'anything'. It gives you a rough guide/estimate of the worst-case execution time.

  12. #12
    Join Date
    Jul 2013
    Posts
    168

    Re: A better algorithm

    Quote (originally posted by superbonzo):
    ... and how do you define the "average input"? Whether it's done explicitly or not, it always boils down to the choice of a family of prior probability distributions.
    Your approach will measure the average performance of the algorithm for a certain input. It will not determine the average-case complexity of the algorithm in general.
    Last edited by razzle; December 14th, 2013 at 03:06 AM.

  13. #13
    Join Date
    Apr 2000
    Location
    Belgium (Europe)
    Posts
    3,641

    Re: A better algorithm

    Quote (originally posted by razzle):
    Your approach will measure the average performance of the algorithm for a certain input. It will not determine the average-case complexity of the algorithm in general.
    No no no no no....

    O(n) complexity means that you can estimate a WORST CASE running time for an input of size n (this isn't the same as the complexity for a worst-case dataset!).

    The O-complexity typically gives an "average case complexity" (= average datasets), but for many algorithms you get separate best-case/average-case/worst-case complexity notations.

    Take quicksort.
    This has an O(n^2) worst-case complexity, and thus, assuming the worst possible datasets:
    if you can sort 10 items in 100 (10^2) seconds, you can "guesstimate" that 100 items will sort in no more than 10000 (100^2) seconds. It's entirely possible that some internal optimisation or shortcut makes it run considerably faster, but it shouldn't take significantly more than 10000.


    The "average case" scenario for quicksort is O(n log(n)).
    Again, this means that if we have an "average dataset" we can guesstimate a worst-case runtime for an n-input set.
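The quadratic worst case can be observed by counting comparisons instead of seconds. The sketch below assumes a naive quicksort that always picks the first element as the pivot (one specific variant chosen for illustration), for which already-sorted input is a worst case.

```python
def quicksort_comparisons(items):
    """Sort a copy with first-element-pivot quicksort; return (sorted_list, comparisons)."""
    comparisons = 0

    def sort(xs):
        nonlocal comparisons
        if len(xs) <= 1:
            return xs
        pivot, rest = xs[0], xs[1:]
        comparisons += len(rest)  # every remaining element is compared with the pivot once
        smaller = [x for x in rest if x < pivot]
        larger = [x for x in rest if x >= pivot]
        return sort(smaller) + [pivot] + sort(larger)

    return sort(list(items)), comparisons


for n in (10, 100):
    _, count = quicksort_comparisons(range(n))  # sorted input: worst case for this pivot rule
    print(n, count, n * (n - 1) // 2)
```

On sorted input this variant performs exactly n(n-1)/2 pivot comparisons, so ten times as many items cost roughly a hundred times as many comparisons.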


    Optimisations that can "early out"/"shortcut out" on specific data conditions don't/can't influence the complexity formula because they're data dependent, but they can make code run in considerably less time than the O-complexity would indicate.

  14. #14
    Join Date
    Jul 2013
    Posts
    168

    Re: A better algorithm

    Quote (originally posted by OReubens):
    No no no no no....
    You seem to agree with me in principle, so would you please be more specific about what you are opposed to?

    In short, my point is that there are two kinds of complexity averages. One is in response to one specific dataset (say a certain probability distribution) and the other is in response to any possible dataset (any length, any distribution, any everything). They may coincide, but one cannot assume that.

    In my view it's the second kind that's the proper average-case complexity measure. It's because it measures the inherent rather than the relative complexity of an algorithm.

  15. #15
    Join Date
    Oct 2008
    Posts
    1,077

    Re: A better algorithm

    Quote (originally posted by razzle):
    One is in response to one specific dataset (say a certain probability distribution) and the other is in response to any possible dataset (any length, any distribution, any everything).
    An average "with respect to everything" is either nonsense or it can be reduced to the choice of a (possibly degenerate, and/or a family of) probability space. Yes, one can define probability theory in terms of averages only (this can even be very useful, both technically and conceptually, C* algebras used in quantum mechanics being notable examples), but conceptually, probabilities and averages are intrinsically bound concepts (probabilities are averages of "counting" observables, averages are functionals on random variables, aka measurable functions between probability spaces). Of course, in order to define an O() asymptotics one needs to single out an independent variable "n" somewhere with respect to which the probabilities are parametrized, but there are many ways of doing such a thing, and certainly none of them qualifies as "the inherent average complexity" of an algorithm (unless we're speaking of a randomized algorithm, but this is a different story (*)) ...

    (*) BTW, speaking of quicksort and randomized algorithms: in some academic-level algorithm books, authors even refuse to compute a loosely defined average complexity (stressing its poor usefulness in real-world scenarios). Instead, they directly compute the complexity of its randomized version, where the input is randomly permuted before the non-randomized algorithm is invoked, which allows an average to be computed for each input size 'n'. But this is not the "inherent" average complexity; it's just a way of presenting the result with respect to a (still restricted) family of input distributions in disguise.
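The randomized version mentioned in the footnote might look like the sketch below: the input is shuffled first, then handed to a deterministic quicksort. The first-element pivot rule, the comparison counter and the `rng` parameter are assumptions made for illustration.

```python
import random


def quicksort_comparisons(items):
    """Deterministic first-element-pivot quicksort; returns (sorted_list, comparisons)."""
    comparisons = 0

    def sort(xs):
        nonlocal comparisons
        if len(xs) <= 1:
            return xs
        pivot, rest = xs[0], xs[1:]
        comparisons += len(rest)  # one comparison with the pivot per remaining element
        smaller = [x for x in rest if x < pivot]
        larger = [x for x in rest if x >= pivot]
        return sort(smaller) + [pivot] + sort(larger)

    return sort(list(items)), comparisons


def shuffled_quicksort(items, rng=None):
    """Randomly permute the input, then run the deterministic algorithm on it."""
    rng = rng or random.Random()
    permuted = list(items)
    rng.shuffle(permuted)
    return quicksort_comparisons(permuted)


n = 200
_, plain = quicksort_comparisons(range(n))
_, shuffled = shuffled_quicksort(range(n), random.Random(0))
print(plain, shuffled)
```

After the shuffle, the comparison count for distinct values depends on the random permutation rather than on the order in which the caller supplied the data, which is why an average can be stated for every input size n.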
