Using a linear median algorithm to find the largest subset for some given sum

Suppose an array A of n numbers is given along with an integer M. Assume that an algorithm which finds the median of an unsorted array in linear time is also given. Design a O(n) algorithm that finds the largest subset S = {1, 2, · · · , n} satisfying ∑(i belonging to S) A[i] <= M.

∑ is sum symbol. I can find a algorithm like search to satisfy the time complexity.

Anyone have some simple way to fix it?

Re: Using a linear median algorithm to find the largest subset for some given sum

Quote:

Originally Posted by

**Roderick_Slug**
I can find a algorithm like search to satisfy the time complexity.

Anyone have some simple way to fix it?

One way would be to use the median finding algorithm to split the array into two equal halves and then reorder it so that all elements smaller than the median go into the left partition and the rest into the right. During the reordering also the sum of the left partition is calculated. If this sum >M the splitting continues recursively to the left otherwise to the right, just like a binary search. Eventually no more splitting is possible and the whole array to the left of this final pivot position will contain all elements that are smaller than the pivot element and their sum total will be as close to <=M as possible.

Is this the "like search" algorithm you already have? And what do you mean by "way to fix it"?

Are you looking for an even simpler algorithm or do you just want someone to implement it for you?

Re: Using a linear median algorithm to find the largest subset for some given sum

Thanks a lot. I do have thought about put the smaller elements on left and larger on right. But I forget that I can record them and compare with M. The idea I have thought before leading me into a weird and complicated way. You are genius!

Re: Using a linear median algorithm to find the largest subset for some given sum

Quote:

Originally Posted by

**Roderick_Slug**
But I forget that I can record them and compare with M.

Yes you need to somehow keep track of all left partition sums because it's the sum of all elements to the left of the current pivot element you should compare with M (as the recursive splitting and reordering of elements proceeds with smaller and smaller partitions).

Also note that you shouldn't modify the A array since the S set should hold indexes. I suggest you use a B array which initially holds all indexes from 1 to n (or maybe 0 to n-1). Then you move around these indexes rather than the elements of A. When the algorithm finishes you have the largest subset (of indexes) to the left of the final pivot position in B. These indexes correspond to the actual elements in A which remain at their initial positions throughout.

This algorithm should be O(N) since it's not a complete ordering (it's not a sort which is O(N * logN)). The successive halving gives rise to a series of element (or rather index) moves that are proportional to 1/2 + 1/4 + 1/8 + 1/16 etcetera which has an upper limit of 1 in infinity. This means that the logN part of the full sort in this case instead is a constant like O(N * k) = O(N).

Good luck!

Re: Using a linear median algorithm to find the largest subset for some given sum

There is a linear time algorithm for finding the K largest numbers - http://en.wikipedia.org/wiki/Selection_algorithm. Of course what you want is just enough largest numbers to sum up to at least K.

In the standard selection algorithm, you take a random pivot and then look to see how many numbers fall on each side of it. You then either accept or reject one half, and keep working on the other half. You have just looked at each number in each half, in turn - the cost of each pivot stage is linear, but the amount of data considered at each stage reduces fast enough that the total cost is still only linear.

The cost of a pivot stage will still be only linear if you take the sum of all the numbers above the pivot. Using this, you can work out if accepting all these numbers, together with any numbers previously selected, would give you a collection of numbers that add up to at least K. If it does, you can ditch the other numbers and use the numbers above the pivot for the next pass. If it does not, you can accept all the numbers above the pivot, and use the numbers below the pivot for the next pass. As with the selection algorithm, the pivot itself and any ties give you a few special cases and the possibility of finding an exact answer early.

(So I think you can do this in (randomized) linear time using a modified version of the selection algorithm in which you look at the sum of the numbers above the pivot, instead of how many numbers are above the pivot.

Re: Using a linear median algorithm to find the largest subset for some given sum

Quote:

Originally Posted by

**bestellen**
There is a linear time algorithm ...

I have noticed how your writing style and technical insight differ a lot between posts. And lo and behold I found your above post verbatim here (as first reply),

http://stackoverflow.com/questions/1...-less-than-key

To take credit for work of others is plagiarism and not only ethically wrong but may even be illegal.