|
-
June 29th, 2009, 08:30 PM
#1
grouping subsets to minimize variance
Hi,
Hopefully someone can help me with this problem... it's driving me crazy:
I have n sets of values, each containing some arbitrary number of integers.
I need to create a new set of groups, by adding elements within each original set, so that the variance of the new groups is minimized.
Example
S_1 = {1, 2}
S_2 = {3}
S_3 = {1, 3, 4}
Variance here is calculated over {1, 2, 3, 1, 3, 4}
New groups that minimize variance:
G_1 = {3}
G_2 = {3}
G_3 = {4, 4}
Variance here is calculated over {3, 3, 4, 4}
Any idea how to approach this problem? Note that n can be quite large, and the number of elements within each set can be quite large as well.
Thank you!
Lz
-
June 30th, 2009, 12:53 AM
#2
Re: grouping subsets to minimize variance
Is this supposed to be solvable in polynomial time?
-
June 30th, 2009, 01:02 AM
#3
Re: grouping subsets to minimize variance
Not sure. I've been trying to find a polynomial solution, but failed so far.
If you can think of a good approximation method, that can be useful as well.
-
June 30th, 2009, 01:05 AM
#4
Re: grouping subsets to minimize variance
There are only M-N+1 possible averages and their corresponding minimums (I think it is not monotonic -- so cannot binary search)... so possibly solve the problem in one set given an average...? The latter I still do not have a polynomial time solution... would need to think a bit more.
-
June 30th, 2009, 01:15 AM
#5
Re: grouping subsets to minimize variance
I was thinking of a similar approach -- and I believe I could try all possible averages.
The problem, as you point out, is finding the solution within each set in a reasonable amount of time. For a set of 50 elements, for example, trying all possible partitions yields 1.8572e+047 subsets... (and I don't have that much time!)
-
June 30th, 2009, 01:15 AM
#6
Re: grouping subsets to minimize variance
Oops... forgot I also brought in the condition that the final number of numbers is determined... dynamic programming? Then we would have to solve the case where the number of numbers in each set is determined...
-
June 30th, 2009, 01:25 AM
#7
Re: grouping subsets to minimize variance
Dynamic programming would be great, but i am not sure how to do it for this case.
This sounds like a multiple knapsack problem... except that the objective becomes filling the knapsacks to get as close as possible to the capacity...
-
June 30th, 2009, 01:33 AM
#8
Re: grouping subsets to minimize variance
I expanded the equation for one set (given a fixed average A and the final number of numbers in the set)... and it seems that we only need to minimize the following:
(given that the numbers are divided like so | a_1+a_2+a_3 | a_4+a_5 | ... (an example))
sum_pairs(first partition)+sum_pairs(second partition)+...
( a_1 a_2 + a_2 a_3 + a_3 a_1 ) + ( a_4 a_5 ) + ...
... which seems a bit complicated (maybe it's the wrong track).
-
June 30th, 2009, 01:35 AM
#9
Re: grouping subsets to minimize variance
it seems independent on the average!?
-
June 30th, 2009, 01:36 AM
#10
Re: grouping subsets to minimize variance
Sorry: *independent of
Are we allowed to edit our posts??
-
June 30th, 2009, 01:50 AM
#11
Re: grouping subsets to minimize variance
what do you mean by 'independent of the average'? i am not following what you are saying...
-
June 30th, 2009, 01:59 AM
#12
Re: grouping subsets to minimize variance
That means whatever the average is, as long as the n numbers in a set are going to become n' numbers, an optimal way is to partition it in a certain fixed way. (Unless I did my calculations wrongly?).
-
June 30th, 2009, 02:42 AM
#13
Re: grouping subsets to minimize variance
you don't know n' in advance...
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|