|
-
May 6th, 2008, 11:35 AM
#1
String conversion
Ok, I wanted I nice way to convert a string to lower case, so I thought to myself "Why reinvent the wheel? A quick google search will surely reveal something useful.", so I had a quick look and was truly alarmed to see what most people seem to be doing and suggesting to others. It is basically this:
Code:
void StringToLower(std::string& str)
{
std::transform(str.begin(), str.end(), str.begin(), tolower);
}
This looks bad to me, really bad! To my understanding it is attempting to overwrite the underlying const data buffer of the string in place! The version that I have implemented is as follows:
Code:
std::string StringToLower(std::string& str)
{
std::vector<char> vec(str.begin(), str.end());
std::transform(vec.begin(), vec.end(), vec.begin(), tolower);
return std::string(vec.begin(),vec.end());
}
but this is grossly inefficient. Do any of you know I nice way to convert a string to lower case?
-
May 6th, 2008, 11:50 AM
#2
Re: String conversion
Why do you think it is changing the underlying data buffer? That is only bad when you expose that buffer with c_str() and make changes directly to it. Here, tolower is called on each character in the string through the string's interface, so it is not a problem. You can use your first example.
-
May 6th, 2008, 12:07 PM
#3
Re: String conversion
 Originally Posted by jlou
Why do you think it is changing the underlying data buffer? That is only bad when you expose that buffer with c_str() and make changes directly to it. Here, tolower is called on each character in the string through the string's interface, so it is not a problem. You can use your first example.
I'm not convinced. The inner workings of transform under VS2005 are:
Code:
for (; _First != _Last; ++_First, ++_Dest)
*_Dest = _Func(*_First);
_Dest is the string iterator, which, if you go through the header code, either ends up as a char* or a char[] depending on the circumstances of the string. What this means is that transform is directly acting on the underlying buffer of the string, which seems wrong semantically if nothing else.
-
May 6th, 2008, 12:16 PM
#4
Re: String conversion
I'm not understanding why that's a problem.
-
May 6th, 2008, 12:19 PM
#5
Re: String conversion
 Originally Posted by PredicateNormative
I'm not convinced.
As long as you're accessing through the iterator you're ok. Of course it will end up being a char* and of course the underlying buffer of the string is changed, you are changing the string. The only problem you have is when you change the underlying buffer outside of the string's interface. Here you are doing it through iterators, which are part of the interface and therefore safe.
The string class has an operator[] as part of it's interface. Do you also believe this is unsafe:
Code:
std::string greeting("Hello World");
greeting[0] = "J";
What about this:
Code:
greeting = "Goodbye";
Each of these things is just changing the string through its interface, and all are perfectly safe and acceptable.
Last edited by jlou; May 6th, 2008 at 12:22 PM.
-
May 6th, 2008, 12:31 PM
#6
Re: String conversion
If you are going to use the second form, which might be necessary at times the input string is a const object or a literal, then make the argument be a reference to a *const* object.
Both forms are acceptable.
By the way, why do you use a vector<char> and then make out a std::string out of it? Why not use std::string to begin with?
Can you help me with my homework assignment?, Before you post!, Use code tags, How to post!, Codeguru technical FAQs, C++ FAQ Lite, Stroustrup: C++ Style and Technique FAQ, Guru of the Week, Comeau C and C++ FAQs, Comeau C++ Templates FAQs, CUJ @ DDJ, Spam threshold
My Blogs : Learning C++ is fun | Abnegator's reflections
Open Threads : C++ Aha! Moments | Nature of work in C++?
-
May 6th, 2008, 04:40 PM
#7
Re: String conversion
"This looks bad to me, really bad! To my understanding it is attempting to overwrite the underlying const data buffer of the string in place!"
I believe a non-const iterator is being used to change the data. The purpose of this operation is to change every character of the string. They each have to be modified somehow. The std:string owns it's own data buffer so why is it a problem that the std::string instance is able to change it's own data through an iterator?
I feel your pain though. When I first started writing code that uses the std::string class, it seemed peculiar that there was no tolower member function. I thought that was rather odd but eventually I just got used to it. although it works, it would have been nice if there was just a member function to do this for std::string. I think that would make code a bit easier to read and would just seem more intuitive to me; but I digress.
-
May 6th, 2008, 06:37 PM
#8
Re: String conversion
 Originally Posted by kempofighter
it seemed peculiar that there was no tolower member function.
That may be because in the world of spoken languages, the definition of "lower case" may differ from what English and similar languages believe it should be, and I guess the standards committee didn't want to get involved in creating a function to work for every locale.
Regards,
Paul McKenzie
-
May 6th, 2008, 06:50 PM
#9
Re: String conversion
 Originally Posted by PredicateNormative
Ok, I wanted I nice way to convert a string to lower case, so I thought to myself "Why reinvent the wheel? A quick google search will surely reveal something useful.", so I had a quick look and was truly alarmed to see what most people seem to be doing and suggesting to others. It is basically this:
Code:
void StringToLower(std::string& str)
{
std::transform(str.begin(), str.end(), str.begin(), tolower);
}
This looks bad to me, really bad! To my understanding it is attempting to overwrite the underlying const data buffer of the string in place!
Now that others have pointed out it doesn't do what you say it does, the only thing "bad" about it is that it will not work for various locales and languages. German (I believe) is the one where this method breaks down.
Regards,
Paul McKenzie
-
May 6th, 2008, 07:21 PM
#10
Re: String conversion
i believe you can change the locale and it will work
-
May 6th, 2008, 10:24 PM
#11
Re: String conversion
The problem is locales are not standardized. You will find different compilers supporting different set of locales and locale names. However, if you take some care it is not tough to implement the right algorithm for your compiler. Note - an algorithm - because it in no way really needs to be a member function of std::string. Here is a post that I had made around this, few weeks back - http://learningcppisfun.blogspot.com...omparison.html
Can you help me with my homework assignment?, Before you post!, Use code tags, How to post!, Codeguru technical FAQs, C++ FAQ Lite, Stroustrup: C++ Style and Technique FAQ, Guru of the Week, Comeau C and C++ FAQs, Comeau C++ Templates FAQs, CUJ @ DDJ, Spam threshold
My Blogs : Learning C++ is fun | Abnegator's reflections
Open Threads : C++ Aha! Moments | Nature of work in C++?
-
May 7th, 2008, 04:55 AM
#12
Re: String conversion
Ok, technically there is nothing wrong with it but I just don't like the idea of modifying the string data in that way, it just seems wrong to me. However, since everyone disagrees with me, I think I should bow to the majority ruling and adjust my ideas with regard to the wrongness of the operation. Thanks all.
-
May 7th, 2008, 10:34 AM
#13
Re: String conversion
I ended up implementing the following:
Code:
/** Convert a string of type T (char*, char[] or std::string) to lower case
* @param str T string to be converted.
* @note Usage: std::string var = StringToLower(str);
*/
template <typename T> std::string StringToLower(T& str)
{
std::string tmp = str;
std::transform(tmp.begin(), tmp.end(), tmp.begin(), tolower);
return tmp;
}
Any disagreements/improvements/comments are welcome.
-
May 7th, 2008, 11:09 AM
#14
Re: String conversion
 Originally Posted by PredicateNormative
I ended up implementing the following:
Code:
/** Convert a string of type T (char*, char[] or std::string) to lower case
* @param str T string to be converted.
* @note Usage: std::string var = StringToLower(str);
*/
template <typename T> std::string StringToLower(T& str)
{
std::string tmp = str;
std::transform(tmp.begin(), tmp.end(), tmp.begin(), tolower);
return tmp;
}
Any disagreements/improvements/comments are welcome.
I fail to see the advantage of this, even if you feel it is erroneous to modify a string through its iterators (which it isn't). You're still modifying tmp in the same way you would have modified str, and str is being passed by non-const reference for no reason.
If you really don't want to modify the string through its iterators, for some odd reason or another, you could do:
Code:
std::string StringToLower(const std::string & str)
{
std::string tmp;
std::transform(str.begin(), str.end(), std::back_insert_iterator<std::string>(tmp), tolower);
return tmp;
}
- Alon
-
May 7th, 2008, 11:39 AM
#15
Re: String conversion
Hi Hermit, my mistake, the interface should have been:
Code:
template <typename T> std::string StringToLower(const T& str);
I've conceded on the correctness of modifying the string through its iterators, I just didn't want the input string modified because it would impact other code. Thanks for the suggestion.
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|