CodeGuru Home VC++ / MFC / C++ .NET / C# Visual Basic VB Forums Developer.com

1. Junior Member Join Date
Jan 2015
Posts
2

Sorting for maximum distance

I have a set of numbers and I have to order them for maximum MSSD (mean successive squared difference). For example, if I have the ordered set
(1,2,3,4,5,6)
this would give me an MSSD of (1^2+1^2+1^2+1^2+1^2+1^2)/5=5/5=1
This is obviously not the best order. The maximum is reached at
(3,5,1,6,2,4)
Here one has MSSD=(2^2+4^2+5^2+4^2+2^2)/5.

I have a simple heuristic in mind for this problem. Start with an empty set and then always add the elements that will enlarge the sum of squared differences the most. With this example it would give the following:
(1,6)
(5,1,6)
(5,1,6,2)
and then 3 and 4 (both add 2^2):
(3,5,1,6,2,4)

Using simulations, this simple algorithm indeed always gives the best solution. Moreover this solution seems to be unique (except for the reverse order, (4,2,6,1,5,3), which obviously gives the same result). However, I can't proof this. Is this a known problem? Can somebody help me to proof that this heuristic gives indeed the optimal order?

Thanks!  Reply With Quote

2. Member +  Join Date
Jul 2013
Posts
576

Re: Sorting for maximum distance Originally Posted by ThaScoon Is this a known problem? Can somebody help me to proof that this heuristic gives indeed the optimal order?
Your problem can be re-formulated as an equivalent graph problem.

Each of your numbers is represented by a vertex in a graph. All vertexes are connected by edges forming a complete graph. The length of each edge is the squared difference of the two vertex numbers the edge connects. Finally there's a super-vertex which connects with a zero length edge to each of all other vertexes.

The problem now becomes to find the longest round-trip from the super-vertex and back again visiting all other vertexes in the graph once and only once. This is the version of the Travelling Salesman Problem (TSP) where the longest route is sought rather than the shortest.

Your heuristics is a known approximate greedy algorithm for the TSP,

http://en.wikipedia.org/wiki/Nearest...bour_algorithm

This problem is NP-hard unless its special structure can be exploited somehow, maybe even render the heuristics algorithm exact indeed. I cannot tell but I wouldn't be too optimistic. On the other hand the heuristics is very efficient and seems to be close to exact for this problem and that's at least some comfort. Last edited by razzle; January 21st, 2015 at 03:33 AM.  Reply With Quote

3. Junior Member Join Date
Jan 2015
Posts
2

Re: Sorting for maximum distance Originally Posted by razzle This problem is NP-hard unless its special structure can be exploited somehow, maybe even render the heuristics algorithm exact indeed. I cannot tell but I wouldn't be too optimistic. On the other hand the heuristics is very efficient and seems to be close to exact for this problem and that's at least some comfort. Thanks for your reply. I indeed fear that this heuristic does not always find the optimal order. Finding an example which proofs my heuristic to be wrong would obviously be as helpful as a full proof for this heuristic. When I am working with squared differences, this heuristic seems to work for all my simulations (I tested it on 1000 sets of 10 random numbers). When I work with the square root of the absolute distance, the heuristic fails. With squared differences, bigger differences add more to the MSSD (mean squared successive difference) as smaller ones.

For example an absolute distance of 4 and one of 2 (e.g in {2,6,8}) adds more to the MSSD than two absolute distances of 3 (e.g in {2,5,8}):
42+22>32+32.
This is why bigger distances can be prioritized. With the square root of the absolute distances this is not longer the case:
41/2+21/2>31/2+31/2
The heuristic breaks down. Because of this special structure, this properties of the squared distance, I still have hope to find a proof for the heuristic.  Reply With Quote

4. Member +  Join Date
Jul 2013
Posts
576

Re: Sorting for maximum distance Originally Posted by ThaScoon I still have hope to find a proof for the heuristic.
Well, your problem is special in that the graph is complete (every vertex is connected with every other vertex) and that the edge lengths are interdependent (because of the way they are constructed from the vertex numbers). But it's still a variation of the TSP problem which is known to be in NP.

The TSP is well studied and if some specific graph structure would render it P-solvable this most certainly would be widely known. So although working very well, your heuristic most likely will be approximate (and not exact) regardless of how you construct the edge lengths, but to prove or disprove that is beyond me.

Good luck!
Last edited by razzle; January 27th, 2015 at 12:08 AM.  Reply With Quote

algorithm, mathematics, optimization, sorting  Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts
•