Quote Originally Posted by nuzzle View Post
Well, two sequences are considered equal if you can rotate and renumerate them into each other.

I have a suggestion based on the observation that these two operations don't change the relative distances between equal letters (since the sequence is circular there are two distances really so lets use the shortest).

If you look at this sequence,

1 1 2 4 4 3 3 2

there are 4 letters and thus 4 distances which are 0, 0, 0, 2. Note that it doesn't matter which letter has which distance so they're presented in sorted order. That would be the canonical form of that sequence. Maybe characteristic form would be a better name so I use that instead. Regardless of how the sequence is rotated and/or renumerated like,

4 4 3 3 2 1 1 2 (rotated)
4 4 3 1 1 2 2 3 (renumerated)

the characteristic form stays 0,0,0,2.

Another example is 1 1 2 2 and 1 2 1 2. They have the characteristic form of 0,0 and 1,1 respectively which are different meaning they're not equivalent.

If my suggestion holds it would be possible to get an O(N) algorithm I think.
While this suggestion of using a sorted list of distances (i.e. 'characteristic form') will be invariant under rotations and renumerations, it is possible for two sequences which are not equivalent according to the OP's criteria to have identical 'characteristic forms'.

Here is an example:
121233445656 = 0,0,1,1,1,1
121233454566 = 0,0,1,1,1,1

So both have same characteristic form using the sorted distances, but the first sequence has two consecutive pairs of letters (3344) whereas the second one doesn't (it has two individual pairs - 33 and 66).