How to figure out the time complexity of an algorithm

Let's say you have to find a pair of matching socks. You could:

1) Grab a sock from the basket.

2) Grab another sock from the basket. If it doesn't match, throw it to the side. If it matches, go to line 4.

3) Repeat line 2.

4) You're done.

How would you go about figuring out the time complexity of this algorithm, or of the following one?

1) Grab a sock from the basket and lay it on the bed.

2) Grab another sock from the basket. If it doesn't match any of the socks already on the bed, lay it on the bed. If it matches one of them, go to line 4.

3) Repeat line 2.

4) You're done.

How would you work out the time complexity in these two cases?

Thanks in advance.

Jack

Re: How to figure out the time complexity of an algorithm

Quote:

Originally Posted by

**left1none**
How would you work out the time complexity in these two cases?

The first algorithm looks for **a specific pair** of socks in a sequence. It picks a first sock and then searches sequentially for a match.

The second algorithm looks for **any first pair** of socks in a sequence. It picks socks sequentially and compares them with all socks encountered so far, which are kept on a "bed". The simplest way to make this efficient is to use an array (or hash table). Checking whether a sock matches one on the bed then becomes a constant-time operation, that is O(1).

The complexity of the first algorithm is straightforward: it's O(N). Since socks are picked at random with equal probability, the time it takes to find a match is proportional to the number of socks.

The second algorithm is harder and I'm not quite sure. It cannot be worse than O(N), rather the opposite. The question is whether finding the first pair of socks depends linearly on the number of socks (as it did when finding a specific pair), or whether the dependence is perhaps logarithmic. In that case this algorithm would be O(log N). I'll be back.

Re: How to figure out the time complexity of an algorithm

Depends what you're after.

if you only want 1 match, then the 1st solution is faster.

the 2nd solution is slower since it will need more compares on average; however, it'll be faster if you need to find multiple matches.

the 2nd solution isn't allowing for the fact that there may be more socks than can fit on the bed while still keeping them distinguishable.

Something people often forget in sorting algos, since they tend to deal with integers...

most algos tend to revolve around doing the fewest swaps.

the time for a compare, however, may be the bottleneck rather than the time for a swap, so an algorithm that does fewer compares may be preferable to one that does fewer swaps. There are cases where insertion sort will be better than quicksort because of this.

Re: How to figure out the time complexity of an algorithm

Quote:

Originally Posted by

**OReubens**
the 2nd solution is slower since it will need more compares on average; however, it'll be faster if you need to find multiple matches.

Not if the "bed" is implemented using an array or hash table.

Re: How to figure out the time complexity of an algorithm

Quote:

Originally Posted by

**OReubens**
Depends what you're after.

if you only want 1 match, then the 1st solution is faster.

Yeah, for one sock the 1st is way faster. I implemented it in software to see. When the number of pairs of socks increases, the time increases quite a bit (maybe even exponentially). But I still wish I understood more about how to figure out time complexity.

Quote:

Originally Posted by

**OReubens**
the 2nd solution is slower since it will need more compares on average; however, it'll be faster if you need to find multiple matches.

the 2nd solution isn't allowing for the fact that there may be more socks than can fit on the bed while still keeping them distinguishable.

That is interesting, I never thought about the fact that matching all socks would be quicker this way. I may have to implement that next to see. Thanks for the reply, man.

Jack

Re: How to figure out the time complexity of an algorithm

Quote:

Originally Posted by

**dazzle**
I'll be back.

This is a continuation of my post #2.

The second algorithm finds the first pair of matching socks drawn from a basket containing N socks (N/2 pairs). Socks are picked one by one from the basket and compared for a match with all socks already drawn. To make the comparison efficient, all previously drawn socks are kept in a set which allows for insertion and membership testing in constant time, that is O(1). Such a set could be implemented using a simple array.

One can convince oneself that if n socks have been drawn already without a match, the probability p that the next sock will be a match is p(n) = n / (N - n): of the N - n socks left in the basket, exactly n are mates of socks already drawn. The complement 1 - p(n) is the probability of a no-match. As a plausibility check one may evaluate the extreme values p(0) and p(N/2), which turn out to be 0 and 1 respectively, as they should. The first means that the very first sock drawn cannot be a match. The second means that if one sock has been drawn from each and every pair, the next sock will be a match for certain.

The next step is to have a look at the probability distribution function. What's the probability that exactly the x'th sock drawn is the match? First a series of non-matching socks must be drawn, followed by the matching sock. This basically is a "first success" geometric distribution, with the twist that in this case the probability of success varies. Expressed in p(n) it becomes,

pdf(x) = [(1-p(0)) * (1-p(1)) * (1-p(2)) * ... * (1-p(x-2))] * p(x-1)

This can now be used to calculate the expectation, that is the number of socks that have to be drawn on average to get a match. Using the definition of expected value (the sum of the products of all possible outcomes times their probabilities of occurring), it becomes,

E(N) = 1*pdf(1) + 2*pdf(2) + 3*pdf(3) + ... + (N/2+1)*pdf(N/2+1)

Here's the code,

Code:

```cpp
#include <cmath>
#include <iostream>

double p(int n, int N) { // chance to draw a match after n socks (of N) have been drawn
    return (n <= 0) ? 0.0 : (n >= N / 2) ? 1.0 : double(n) / double(N - n);
}

double pdf(int x, int N) { // prob. distr. for sock number x (of N) to be the match
    double r = 1.0;
    for (int i = 0; i < x - 1; ++i)
        r *= (1.0 - p(i, N));
    return r * p(x - 1, N);
}

double E(int N) { // expected number of socks (of N) to draw to find a match
    double s = 0.0;
    for (int i = 0; i <= N / 2; i++)
        s += pdf(i + 1, N) * double(i + 1);
    return s;
}

void socks1() { // actual test program
    int N = 10;
    for (int i = 0; i <= N / 2; i++)
        std::cout << p(i, N) << " ";
    std::cout << std::endl;
    //
    for (int i = 0; i <= N / 2; i++)
        std::cout << pdf(i + 1, N) << " ";
    std::cout << std::endl;
    //
    std::cout << E(N) << std::endl;
    //
    std::cout << "---" << std::endl;
    for (int i = 1; i <= 25; ++i) {
        int n = i * 1000;
        double En = E(n);
        std::cout << n << ": " << En << " / " << En / std::sqrt(double(n)) << std::endl;
    }
}

int main() {
    socks1();
    return 0;
}
```

The test program first prints p, pdf and E for N=10 to check that it looks okay, which I think it does. The p probabilities go from 0 to 1. The pdf values sum up to 1. And E is about 4, which means that on average the match arrives after about 3 non-matching socks have been drawn. Since a match must come at the latest with the 6th sock drawn, this seems reasonable.

Then the actual test is performed. I print 25 E values, from N=1,000 to N=25,000, and I also print E(N)/sqrt(N). This quotient turns out to be essentially constant for all N, and that, ladies and gentlemen, indicates an O(sqrt N) algorithm.

O(sqrt N) is not quite as good as the O(log N) I anticipated, but it still beats the first algorithm, which is O(N). For N=25,000 the first algorithm will require 12,500 socks on average to be drawn before a match; this second algorithm only about 198. What a difference an array makes! :)