CodeGuru Home VC++ / MFC / C++ .NET / C# Visual Basic VB Forums Developer.com
Results 1 to 13 of 13
  1. #1
    Join Date
    Nov 2008
    Posts
    9

    grouping subsets to minimize variance

    Hi,

    Hopefully someone can help me with this problem... it's driving me crazy:

    I have n sets of values, each containing some arbitrary number of integers.
    I need to create a new set of groups, by adding elements within each original set, so that the variance of the new groups is minimized.

    Example
    S_1 = {1, 2}
    S_2 = {3}
    S_3 = {1, 3, 4}
    Variance here is calculated over {1, 2, 3, 1, 3, 4}

    New groups that minimize variance:
    G_1 = {3}
    G_2 = {3}
    G_3 = {4, 4}
    Variance here is calculated over {3, 3, 4, 4}


    Any idea how to approach this problem? Note that n can be quite large, and the number of elements within each set can be quite large as well.

    Thank you!


    Lz

  2. #2
    Join Date
    Jun 2009
    Posts
    19

    Re: grouping subsets to minimize variance

    Is this supposed to be solvable in polynomial time?

  3. #3
    Join Date
    Nov 2008
    Posts
    9

    Re: grouping subsets to minimize variance

    Not sure. I've been trying to find a polynomial solution, but failed so far.

    If you can think of a good approximation method, that can be useful as well.

  4. #4
    Join Date
    Jun 2009
    Posts
    19

    Re: grouping subsets to minimize variance

    There are only M-N+1 possible averages and their corresponding minimums (I think it is not monotonic -- so cannot binary search)... so possibly solve the problem in one set given an average...? The latter I still do not have a polynomial time solution... would need to think a bit more.

  5. #5
    Join Date
    Nov 2008
    Posts
    9

    Re: grouping subsets to minimize variance

    I was thinking of a similar approach -- and I believe I could try all possible averages.

    The problem, as you point out, is finding the solution within each set in a reasonable amount of time. For a set of 50 elements, for example, trying all possible partitions yields 1.8572e+047 subsets... (and I don't have that much time!)

  6. #6
    Join Date
    Jun 2009
    Posts
    19

    Re: grouping subsets to minimize variance

    Oops... forgot I also brought in the condition that the final number of numbers is determined... dynamic programming? Then we would have to solve the case where the number of numbers in each set is determined...

  7. #7
    Join Date
    Nov 2008
    Posts
    9

    Re: grouping subsets to minimize variance

    Dynamic programming would be great, but i am not sure how to do it for this case.

    This sounds like a multiple knapsack problem... except that the objective becomes filling the knapsacks to get as close as possible to the capacity...

  8. #8
    Join Date
    Jun 2009
    Posts
    19

    Re: grouping subsets to minimize variance

    I expanded the equation for one set (given a fixed average A and the final number of numbers in the set)... and it seems that we only need to minimize the following:
    (given that the numbers are divided like so | a_1+a_2+a_3 | a_4+a_5 | ... (an example))
    sum_pairs(first partition)+sum_pairs(second partition)+...
    ( a_1 a_2 + a_2 a_3 + a_3 a_1 ) + ( a_4 a_5 ) + ...

    ... which seems a bit complicated (maybe it's the wrong track).

  9. #9
    Join Date
    Jun 2009
    Posts
    19

    Re: grouping subsets to minimize variance

    it seems independent on the average!?

  10. #10
    Join Date
    Jun 2009
    Posts
    19

    Re: grouping subsets to minimize variance

    Sorry: *independent of
    Are we allowed to edit our posts??

  11. #11
    Join Date
    Nov 2008
    Posts
    9

    Re: grouping subsets to minimize variance

    what do you mean by 'independent of the average'? i am not following what you are saying...

  12. #12
    Join Date
    Jun 2009
    Posts
    19

    Re: grouping subsets to minimize variance

    That means whatever the average is, as long as the n numbers in a set are going to become n' numbers, an optimal way is to partition it in a certain fixed way. (Unless I did my calculations wrongly?).

  13. #13
    Join Date
    Nov 2008
    Posts
    9

    Re: grouping subsets to minimize variance

    you don't know n' in advance...

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  





Click Here to Expand Forum to Full Width

Featured