
January 20th, 2015, 06:46 AM
#1
Sorting for maximum distance
I have a set of numbers and I have to order them for maximum MSSD (mean successive squared difference). For example, if I have the ordered set
(1,2,3,4,5,6)
this would give me an MSSD of (1^2+1^2+1^2+1^2+1^2+1^2)/5=5/5=1
This is obviously not the best order. The maximum is reached at
(3,5,1,6,2,4)
Here one has MSSD=(2^2+4^2+5^2+4^2+2^2)/5.
I have a simple heuristic in mind for this problem. Start with an empty set and then always add the elements that will enlarge the sum of squared differences the most. With this example it would give the following:
Start with ()
Then add 1 and 6 (this adds 5^2):
(1,6)
then add 5 (this adds 4^2)
(5,1,6)
then 2 (also adds 4^2):
(5,1,6,2)
and then 3 and 4 (both add 2^2):
(3,5,1,6,2,4)
Using simulations, this simple algorithm indeed always gives the best solution. Moreover this solution seems to be unique (except for the reverse order, (4,2,6,1,5,3), which obviously gives the same result). However, I can't proof this. Is this a known problem? Can somebody help me to proof that this heuristic gives indeed the optimal order?
Thanks!

January 20th, 2015, 04:23 PM
#2
Re: Sorting for maximum distance
Originally Posted by ThaScoon
Is this a known problem? Can somebody help me to proof that this heuristic gives indeed the optimal order?
Your problem can be reformulated as an equivalent graph problem.
Each of your numbers is represented by a vertex in a graph. All vertexes are connected by edges forming a complete graph. The length of each edge is the squared difference of the two vertex numbers the edge connects. Finally there's a supervertex which connects with a zero length edge to each of all other vertexes.
The problem now becomes to find the longest roundtrip from the supervertex and back again visiting all other vertexes in the graph once and only once. This is the version of the Travelling Salesman Problem (TSP) where the longest route is sought rather than the shortest.
Your heuristics is a known approximate greedy algorithm for the TSP,
http://en.wikipedia.org/wiki/Nearest...bour_algorithm
(in your case it becomes the furthest neighbour algorithm instead)
This problem is NPhard unless its special structure can be exploited somehow, maybe even render the heuristics algorithm exact indeed. I cannot tell but I wouldn't be too optimistic. On the other hand the heuristics is very efficient and seems to be close to exact for this problem and that's at least some comfort.
Last edited by razzle; January 21st, 2015 at 03:33 AM.

January 21st, 2015, 06:00 AM
#3
Re: Sorting for maximum distance
Originally Posted by razzle
This problem is NPhard unless its special structure can be exploited somehow, maybe even render the heuristics algorithm exact indeed. I cannot tell but I wouldn't be too optimistic. On the other hand the heuristics is very efficient and seems to be close to exact for this problem and that's at least some comfort.
Thanks for your reply. I indeed fear that this heuristic does not always find the optimal order. Finding an example which proofs my heuristic to be wrong would obviously be as helpful as a full proof for this heuristic. When I am working with squared differences, this heuristic seems to work for all my simulations (I tested it on 1000 sets of 10 random numbers). When I work with the square root of the absolute distance, the heuristic fails. With squared differences, bigger differences add more to the MSSD (mean squared successive difference) as smaller ones.
For example an absolute distance of 4 and one of 2 (e.g in {2,6,8}) adds more to the MSSD than two absolute distances of 3 (e.g in {2,5,8}):
4^{2}+2^{2}>3^{2}+3^{2}.
This is why bigger distances can be prioritized. With the square root of the absolute distances this is not longer the case:
4^{1/2}+2^{1/2}>3^{1/2}+3^{1/2}
The heuristic breaks down. Because of this special structure, this properties of the squared distance, I still have hope to find a proof for the heuristic.

January 21st, 2015, 07:29 AM
#4
Re: Sorting for maximum distance
Originally Posted by ThaScoon
I still have hope to find a proof for the heuristic.
Well, your problem is special in that the graph is complete (every vertex is connected with every other vertex) and that the edge lengths are interdependent (because of the way they are constructed from the vertex numbers). But it's still a variation of the TSP problem which is known to be in NP.
The TSP is well studied and if some specific graph structure would render it Psolvable this most certainly would be widely known. So although working very well, your heuristic most likely will be approximate (and not exact) regardless of how you construct the edge lengths, but to prove or disprove that is beyond me.
Good luck!
Last edited by razzle; January 27th, 2015 at 12:08 AM.
Tags for this Thread
Posting Permissions
 You may not post new threads
 You may not post replies
 You may not post attachments
 You may not edit your posts

Forum Rules

Click Here to Expand Forum to Full Width
