is it the best solution?

hi, i have C exam next week, so i started solving questions from previous years exams and run into this question, the answer i come up to is far far from having reasonable length, it's too complicated and long , but i can't come up anything shorter.

the question is: write a function that receives an array of letters , and a pointer to final result string that hasn't allocated yet, the function supposed to take the array of letters and count the appearance of every letter and make the final result string look like that:

received array of letters : "abbacdeefg"
result string: "a 2;b 2;c 1;d 1;e 2;f 1;g 1"
in the result it should be alphabetically printed

here is the code that i wrote, it just doesn't makes sense that this is the actual solution:

Code:

void sortarray(char str[],char * final) { int i = 1,j, extra = 0; char * letters, *lerrptr; int * count, * cerrptr; int size = 0, flag = 0; // allocate memory for one array storing letter and one for storing it's appearance letters = (char *) malloc(sizeof(char) * 1); if(letters == NULL) exit(1); count = (int *) malloc(sizeof(int) * 1); if(count == NULL) exit(1); // fill the first letter to array letters[0] = str[0]; count[0] = 1; size++; // scan the whole string while(str[i] != 0) { // initiate flag for new run flag = 0; for(j = 0 ; j < size ; j++) { // if the letter already appeared before just increment the count and raise the flag that no new allocating is necessary if(str[i] == letters[j]) { count[j]++; flag = 1; } } // if the flag of already appeared hasn't been raised make new allocation for new letter if(!flag) { flag = 0 ; size++; lerrptr = letters; letters = (char *) realloc(letters,sizeof(char) * size); if(letters == NULL) { free(lerrptr); free(count); exit(1); } cerrptr = count; count = (int *) realloc(count,sizeof(int) * size); if(letters == NULL) { free(cerrptr); free(letters); exit(1); } // insert new letter and 1 for the count letters[size-1] = str[i] ; count[size-1] = 1 ; } i++; } // bubble sort the letters for(i = 0 ; i < size - 1; i++) { for(j = 0 ; j < size - 1 - i ; j++) { if(letters[j] < letters[j - 1]) { letters[j] ^= letters[j+1]; letters[j+1] ^= letters[j]; letters[j] ^= letters[j+1]; count[j] ^= count[j+1]; count[j+1] ^= count[j]; count[j] ^= count[j+1]; } } } // check for more 9 appearances to reserve more letters for one more digit for(i = 0 ; i < size ; i++) if(count[i] > 9) extra++; // allocate the memory for the final array that is going to be printed final = (char *) malloc(sizeof(char) * (size * 4) + extra + 1); j = 0; for(i = 0 ; i < size ; i++) { // insert the letter final[i*4 + j] = letters[i]; // insert space final[i*4 + 1 + j] = ' '; // if more then one digit used for the count of appearances if(count[i] > 9) { // first digit final[i*4 + 2 + j] = count[i]/10 + 48; j++; // second final[i*4 + 2 + j] = count[i]%10 + 48 ; } else final[i*4 + 2 + j] = count[i] + 48; // print ; final[i*4 + 3 + j] = ';'; } // add terminating character final[i*4 + j] = 0; // print the result printf(" %s ", final); }

can anyone help me come up with more reasonable solution?

Re: is it the best solution?

Help has been given for this elsewhere.

Re: is it the best solution?

Quote:

Originally Posted by atikot

can anyone help me come up with more reasonable solution?

This really isn't a C question but a data structures & algorithms question.

The standard solution is to introduce a table of counters, one for each search item (letter). Generally a hash table is used but if the range of the search items (letters) is limited an array can be used instead.

This procedure will turn an O(N*N) algorithm into an O(N) one. A huge improvement. You can build a successful consulting carreer on this piece of knowledge alone.

In this particular case you need to sort the result afterwards because of the order requirement. Alternatively, and favourably, the table of counters is inherently sorted. Then no post processing will be needed. You just print the table from beginning to end.

Re: is it the best solution?

Quote:

Originally Posted by nuzzle

This really isn't a C question but a data structures & algorithms question.

The standard solution is to introduce a table of counters, one for each search item (letter). Generally a hash table is used but if the range of the search items (letters) is limited an array can be used instead.

This procedure will turn an O(N*N) algorithm into an O(N) one. A huge improvement. You can build a successful consulting carreer on this piece of knowledge alone.

In this particular case you need to sort the result afterwards because of the order requirement. Alternatively, and favourably, the table of counters is inherently sorted. Then no post processing will be needed. You just print the table from beginning to end.

Wouldnt it be easier in this case to just sort the string first and then browse the string to merge consequent equivalent characters to fit the requirement.

Basically it would transform this string "bhiihbBHIIHB" to this "bbBBhhHHiiII" and then to this "2b, 2B, 2h, 2H, 2i, 2I".

I think that at the very least this algorithm would be equally as efficient as what you suggested.

Re: is it the best solution?

Quote:

Originally Posted by blaurent

I think that at the very least this algorithm would be equally as efficient as what you suggested.

You solution will be O(N*logN) whereas the standard solution is just O(N).

The OP's problem is special due to the alphabetic ordering requirement. Generally it would require an extra sort step to be performed after the standard solution has finished which would result in a worst(*) total complexity of O(N*logN). But if an array is used as table and the letter entries appear in wanted sort order the sort step is avoided and the algorithm stays O(N).

So in summary the standard solution is the fastest. Even with a post-sort it's likely be faster and at the worst equally fast to what you suggested.

(*) The complexity actually is O(N) + O(K*logK) where K is the number of unique letters among the total input of N letters. This becomes O(N*logN) in the worst case when there are no repeated letters that is when K is equal to N.

Re: is it the best solution?

Quote:

Originally Posted by nuzzle

You solution will be O(N*logN) whereas the standard solution is just O(N).

The OP's problem is special due to the alphabetic ordering requirement. Generally it would require an extra sort step to be performed after the standard solution has finished which would result in a total complexity of O(N*logN). But if an array is used as table and the letter entries appear in wanted sort order the sort step is avoided and the algorithm stays O(N).

So in summary the standard solution is the fastest and even with a post-sort it's not slower than what you suggested.

Yes actually the sort step of your algorithm is just a supplement and not an integral part so you can always drop it off in case the requirement changes.

Still I believe that there's a way to optimize your algorithm, say with a preinitialized map which dictates the sort order, say charCounts{'a','b','c','d','e','f','g','h',....} which stores for each entry an integer. Then once you've initialized the counts for each entry, you just iterate once over the items and generate the result string. This would then give an approximate complexity of O(2*N).
With this you could just drop the entire sort feature. On the downside of course the charCounts array needs to be maintained....

Re: is it the best solution?

Quote:

Originally Posted by blaurent

Yes actually the sort step of your algorithm is just a supplement and not an integral part so you can always drop it off in case the requirement changes.

Still I believe that there's a way to optimize your algorithm, say with a preinitialized map which dictates the sort order, say charCounts{'a','b','c','d','e','f','g','h',....} which stores for each entry an integer. Then once you've initialized the counts for each entry, you just iterate once over the items and generate the result string. This would then give an approximate complexity of O(2*N).
With this you could just drop the entire sort feature. On the downside of course the charCounts array needs to be maintained....

In this solution you suggest using a map as table. This will slow down the algorithm to O(N*logN).

The reason is that in the standard solution accesses to the table is O(1). This is the case when the table is hash based or an array. But it's not the case when the table is a map as you suggest. A map is tree based and has O(logN) accesses. One O(N) pass with O(logN) accesses results in an O(N*logN) algorithm (and not O(2*N)).

So the standard solution is hard to beat here. It's O(N) and even with a post-sort step it will be O(N*logN) at the most (when all letters are different but the more repeated letters there are the more it will tend towards O(N)).

Re: is it the best solution?

Quote:

Originally Posted by nuzzle

In this solution you suggest using a map as table. This will slow down the algorithm to O(N*logN).

The reason is that in the standard solution accesses to the table is O(1). This is the case when the table is hash based or an array. But it's not the case when the table is a map as you suggest. A map is tree based and has O(logN) accesses. One O(N) pass with O(logN) accesses results in an O(N*logN) algorithm (and not O(2*N)).

So the standard solution is hard to beat here. It's O(N) and even with a post-sort step it will be O(N*logN) at the most (when all letters are different but the more repeated letters there are the more it will tend towards O(N)).

Ah yes of course! You're right.

Thanks for the explanation.