wael.salman
June 17th, 2009, 09:20 AM
Given that you have one string of length N and M small strings of length L . How do you efficiently find the occurrence of each small string in the larger one?
Solution suggested:
You can just sort the M strings in O(l*Mlog(M)).An additional l figures because comparison function of strings of length l is of complexity O(l).
Once these M strings are sorted, we can simply do a binary search on them for each of the N-l+1 continuous substrings of big string. The complexity of this search for each such substring is O(l*logM).
So the complexity of this procedure is O(l*MlogM)+O((N-l+1)*(l*logM)).
For N>>l this reduces to O(l*MlogM)+O(N*l*log(M).
This can be reduced to O((M+N)*l*log(M)).
I believe there is always better solutions... if you have one - just drop and explain it please.
Thank you
Solution suggested:
You can just sort the M strings in O(l*Mlog(M)).An additional l figures because comparison function of strings of length l is of complexity O(l).
Once these M strings are sorted, we can simply do a binary search on them for each of the N-l+1 continuous substrings of big string. The complexity of this search for each such substring is O(l*logM).
So the complexity of this procedure is O(l*MlogM)+O((N-l+1)*(l*logM)).
For N>>l this reduces to O(l*MlogM)+O(N*l*log(M).
This can be reduced to O((M+N)*l*log(M)).
I believe there is always better solutions... if you have one - just drop and explain it please.
Thank you