  1. #1
    Join Date
    Feb 2005
    Location
    Denver
    Posts
    353

    std::string and contiguous memory?

    I need some help arguing a point. Here's the situation. I recently wrote some code that required making several calls into a COTS API library where the function calls required one of the arguments to be a non-const pointer to char. I learned a long time ago that you should never mess with the internal memory of a std::string. With that in mind, whenever I've run across situations like this I've always copied the std::string into a non-const vector of chars (std::vector<char>) and then passed the address of the first element (&v[0]) into the function. Using the vector as a buffer like this, I don't care whether the function modifies the argument or not.
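
    For readers following along, the vector-as-buffer approach described above might look roughly like the sketch below. The function legacy_api_call is a hypothetical stand-in for the COTS call (the real signature is not given in the thread), and the upper-casing assumes an ASCII-compatible character set.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the COTS function: it takes a writable,
// null-terminated buffer and upper-cases it in place (ASCII assumed).
void legacy_api_call(char* text)
{
    for (; *text != '\0'; ++text)
        if (*text >= 'a' && *text <= 'z')
            *text = static_cast<char>(*text - 'a' + 'A');
}

// Copy the string into a vector, let the C function work on the copy,
// then build a new string from whatever the function left behind.
std::string call_with_vector_buffer(const std::string& input)
{
    std::vector<char> buffer(input.begin(), input.end());
    buffer.push_back('\0');           // the C side expects a terminator
    assert(!buffer.empty());          // so &buffer[0] is always valid here

    legacy_api_call(&buffer[0]);

    return std::string(&buffer[0]);   // reads up to the first '\0'
}
```

    This builds as C++03, and the original std::string is never touched, so it does not matter whether the callee writes to the buffer.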


    In a recent code review, the so-called resident “expert” on our program indicated that there was no reason to use a vector. I should use the data() method of the std::string to obtain an internal pointer to the string's memory and pass that into the function since the returned memory of this call is guaranteed to be contiguous. Of course, that would require casting the constness away from the pointer. I argued that the implementation of a std::string is at the discretion of the compiler designers. Because of this, the internal memory of a std::string is not guaranteed to be contiguous, although most implementations probably do store it that way. I went on to say that messing with its contents has undefined behavior. A vector on the other hand is guaranteed to have contiguous memory. My first question is... am I off my rocker, or are these statements correct?



    He then went on to say that I was “assuming” that the implementation of taking the address of the first element of a vector is well defined. Well, isn't it?



    The gist of all this is that I am either completely out to lunch, or I need some hard evidence to back my claims. What I was wondering is if someone has access to the C++ standards (pre-C++11), could you provide some quoted statements from the document to help prove my point (assuming, of course, I'm not delirious). Any comments, statements, links, or just some simple quotes from reliable sources would be greatly appreciated. Thanks for your help.

  2. #2
    Join Date
    Jan 2006
    Location
    Singapore
    Posts
    6,765

    Re: std::string and contiguous memory?

    Quote Originally Posted by sszd
    I argued that the implementation of a std::string is at the discretion of the compiler designers. Because of this, the internal memory of a std::string is not guaranteed to be contiguous, although most implementations probably do store it that way. I went on to say that messing with its contents has undefined behavior.
    You were right. However, since the 2011 edition of the C++ standard, the internal storage of the contents of a std::string is guaranteed to be contiguous:
    Quote Originally Posted by C++11 Clause 21.4.1 Paragraph 5
    The char-like objects in a basic_string object shall be stored contiguously. That is, for any basic_string object s, the identity &*(s.begin() + n) == &*s.begin() + n shall hold for all values of n such that 0 <= n < s.size().
    That said, instead of calling data() and then casting away const-ness, I think that it would be better to use the &str[0] idiom (like how it is for std::vector), after checking that the string is not empty.
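
    A minimal sketch of the &str[0] idiom with the emptiness check, assuming a C++11 (or later) standard library. The function legacy_fill is a hypothetical C-style callee invented for illustration; it is not part of any real API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical C-style function that overwrites `length` characters.
void legacy_fill(char* buffer, std::size_t length)
{
    for (std::size_t i = 0; i < length; ++i)
        buffer[i] = 'x';
}

// Assumes a C++11 (or later) library, where the characters are contiguous.
void fill_in_place(std::string& s)
{
    if (s.empty())
        return;                        // nothing to pass to the callee

    char* first = &s[0];
    // The identity from the quoted paragraph, checked for the last element.
    assert(first + (s.size() - 1) == &s[s.size() - 1]);
    legacy_fill(first, s.size());
}
```

    The callee must stay within s.size() characters; writing past that range is not covered by the contiguity guarantee.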

    Quote Originally Posted by sszd
    He then went on to say that I was “assuming” that the implementation of taking the address of the first element of a vector is well defined. Well, isn't it?
    Yes, that has been guaranteed since the 2003 edition of the C++ standard.
    Last edited by laserlight; June 27th, 2012 at 09:58 PM.
    C + C++ Compiler: MinGW port of GCC
    Build + Version Control System: SCons + Bazaar

    Look up a C/C++ Reference and learn How To Ask Questions The Smart Way
    Kindly rate my posts if you found them useful

  3. #3
    Join Date
    Feb 2005
    Location
    Denver
    Posts
    353

    Re: std::string and contiguous memory?

    Thank you for the reply. However, I was actually looking for quotes from the standards previous to C++11 since we are using older GNU compilers and have no intention of upgrading any time soon. Since that's the case, would you still recommend using &s[0]?

  4. #4
    Join Date
    Jan 2006
    Location
    Singapore
    Posts
    6,765

    Re: std::string and contiguous memory?

    Quote Originally Posted by sszd
    However, I was actually looking for quotes from the standards previous to C++11 since we are using older GNU compilers and have no intention of upgrading any time soon. Since that's the case, would you still recommend using &s[0]?
    Since you are compiling using a known set of compilers, you could just check if the given standard library implementations store the contents of std::string contiguously. If one of them doesn't, or if you cannot determine this, then I would not recommend using &s[0] as it is better to be safe than sorry.
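
    One way to run such a check on an older compiler is a small probe written without C++11 features. This is only a sketch: a passing run shows what one implementation does for one string, not what the standard guarantees. The name stored_contiguously is made up for this example.

```cpp
#include <cassert>
#include <string>

// Returns true if the characters of s sit back to back in memory.
// Written without C++11 features so it also builds on older compilers.
bool stored_contiguously(const std::string& s)
{
    for (std::string::size_type n = 1; n < s.size(); ++n)
    {
        // Compare each element's address with "previous element + 1".
        if (&s[n - 1] + 1 != &s[n])
            return false;
    }
    return true;   // also true for empty and one-character strings
}
```

    It could be called from a unit test or wrapped in assert() at start-up, for example assert(stored_contiguously(std::string(1000, 'a'))).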

    Another thing to consider:
    Quote Originally Posted by sszd
    COTS API library where the function calls required one of the arguments to be a non-const pointer to char.
    If this is due to a legacy interface that is not const-correct, and it is documented that the contents of the array are not modified through that pointer to non-const char, then using data() and casting away const-ness is fine.
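
    A sketch of that read-only case. The function legacy_count_spaces is hypothetical and stands for a callee that is declared with char* but documented never to write through it. The sketch uses c_str() rather than data() as a deliberate substitution, on the assumption that the callee expects a terminator: before C++11 only c_str() promises one.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical legacy function: declared with char*, but it only reads.
std::size_t legacy_count_spaces(char* text)
{
    assert(text != 0);                 // precondition: a valid C string
    std::size_t count = 0;
    for (; *text != '\0'; ++text)
        if (*text == ' ')
            ++count;
    return count;
}

std::size_t count_spaces(const std::string& s)
{
    // The cast is acceptable only because the callee never writes
    // through the pointer; a write would be undefined behavior.
    return legacy_count_spaces(const_cast<char*>(s.c_str()));
}
```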

  5. #5
    Join Date
    Jun 2009
    Location
    France
    Posts
    2,513

    Re: std::string and contiguous memory?

    Does it really matter knowing if the string stores its internal contents contiguously? Last I checked, "c_str" is guaranteed to return a valid (const) C-string regardless of the version, and since C++11 so is "data" (before that, "data" is not required to append the terminator).

    If you use these, then you are 100% safe. You can't mutate though...

    Quote Originally Posted by laserlight View Post
    Yes, that has been guaranteed since the 2003 edition of the C++ standard.
    Really? I would have thought it be guaranteed since day 0.
    Is your question related to IO?
    Read this C++ FAQ article at parashift by Marshall Cline. In particular points 1-6.
    It will explain how to correctly deal with IO, how to validate input, and why you shouldn't count on "while(!in.eof())". And it always makes for excellent reading.

  6. #6
    Join Date
    Jan 2006
    Location
    Singapore
    Posts
    6,765

    Re: std::string and contiguous memory?

    Quote Originally Posted by monarch_dodra
    You can't mutate though...
    Which is the reason for this thread: the pointer being passed is a pointer to non-const char (though I note that sszd wrote "non-const pointer to char", but I doubt he/she meant "pointer, that is const, to non-const char" ).

    Quote Originally Posted by monarch_dodra
    Really? I would have thought it be guaranteed since day 0.
    I believe it was a defect in the original version, i.e., they forgot to require it. That said, I would bet no serious standard library implementation ever had std::vector store its contents in a non-contiguous fashion.

  7. #7
    Join Date
    Oct 2008
    Posts
    1,456

    Re: std::string and contiguous memory?

    ... and if you need a mutable char array, you could check memory contiguity at runtime, either only in debug mode or, in release builds, falling back to std::vector<char> when the check fails. Say, something like

    Code:
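    #include <algorithm>  // std::adjacent_find
    #include <string>
    #include <vector>

    // True if every character sits directly after its predecessor in memory.
    // (Needs C++11 for cbegin()/cend() and the lambda.)
    bool is_memory_contiguous( const std::string& s )
    {
        return std::adjacent_find( s.cbegin(), s.cend(), []( const char& l, const char& r ) { return &l + 1 != &r; } ) == s.cend();
    }
    
    // used as (debug-only check; _ASSERT is the MSVC macro from <crtdbg.h>)
    
    std::string s = ...;
    
    _ASSERT( is_memory_contiguous( s ) );
    
    if( !s.empty() )
        some_c_api_call( &s[0], s.size() );
    
    // or (runtime fallback; the temporary vector lives only until the end of
    // the call expression, so anything the callee writes into it is discarded)
    
    std::string s = ...;
    
    if( !s.empty() )
        some_c_api_call( is_memory_contiguous( s ) ? &s[0] : &std::vector<char>( s.begin(), s.end() )[0], s.size() );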
    Last edited by superbonzo; June 28th, 2012 at 11:32 AM. Reason: minor modification to code snippet

  8. #8
    Join Date
    Jul 2005
    Location
    Netherlands
    Posts
    2,042

    Re: std::string and contiguous memory?

    Quote Originally Posted by sszd View Post
    In a recent code review, the so-called resident “expert” on our program indicated that there was no reason to use a vector. I should use the data() method of the std::string to obtain an internal pointer to the string's memory and pass that into the function since the returned memory of this call is guaranteed to be contiguous. Of course, that would require casting the constness away from the pointer.
    The potential error is in the const-cast. If the memory pointed to by the (const-casted) pointer is not modified by the function, all is fine. But if it is modified, you could be looking at undefined behavior. The standard states that modifying the contents of a const object is undefined behavior. In principle, the string implementation could use a const object under the hood, e.g. for empty strings.
    Cheers, D Drmmr

    Please put [code][/code] tags around your code to preserve indentation and make it more readable.

    As long as man ascribes to himself what is merely a possibility, he will not work for the attainment of it. - P. D. Ouspensky

  9. #9
    Join Date
    Jun 2009
    Location
    France
    Posts
    2,513

    Re: std::string and contiguous memory?

    Quote Originally Posted by laserlight View Post
    Which is the reason for this thread: the pointer being passed is a pointer to non-const char (though I note that sszd wrote "non-const pointer to char", but I doubt he/she meant "pointer, that is const, to non-const char" ).
    Well, he did say the function being called guarantees that no mutation will occur, so data() followed by a const_cast should be fine.

  10. #10
    Join Date
    Jan 2006
    Location
    Singapore
    Posts
    6,765

    Re: std::string and contiguous memory?

    Quote Originally Posted by monarch_dodra
    Well, he did say the function being called guarantees that no mutation will occur, so data() followed by a const_cast should be fine.
    Hmm... could you quote that part? I can't seem to find it among what sszd wrote, and I looked over both posts carefully. Twice

  11. #11
    Join Date
    Apr 1999
    Posts
    27,449

    Re: std::string and contiguous memory?

    Quote Originally Posted by sszd View Post
    In a recent code review, the so-called resident “expert” on our program indicated that there was no reason to use a vector. I should use the data() method of the std::string to obtain an internal pointer to the string's memory and pass that into the function since the returned memory of this call is guaranteed to be contiguous.
    OK, but I don't understand this:
    Quote Originally Posted by sszd
    Of course, that would require casting the constness away from the pointer.
    Why is this necessary? What's the signature of the called function? If it is not const char*, then why isn't it const char*?
    Quote Originally Posted by sszd
    I went on to say that messing with its contents has undefined behavior.
    As of pre-2011 ANSI C++, yes, modifying the buffer that is returned by data() is undefined behavior.
    Quote Originally Posted by sszd
    A vector on the other hand is guaranteed to have contiguous memory. My first question is... am I off my rocker, or are these statements correct?
    A vector is guaranteed to be contiguous -- this is stated in the ANSI/ISO specification.

    Also, you can show him what one well-respected "resident expert", Scott Meyers, says in one of his books (I think it's Effective STL) -- a vector is guaranteed to be contiguous, and therefore it can be used in legacy C and C++ functions that require a pointer to a contiguous buffer. So who are you to believe, Scott Meyers or your code reviewer?

    Regards,

    Paul McKenzie

  12. #12
    Join Date
    Jan 2006
    Location
    Singapore
    Posts
    6,765

    Re: std::string and contiguous memory?

    Quote Originally Posted by Paul McKenzie
    As of pre-2011 ANSI C++, yes, modifying the buffer that is returned by data() is undefined behavior.
    It will still result in undefined behaviour. It is the &s[0] version that you're thinking of.

  13. #13
    Join Date
    Aug 2000
    Location
    West Virginia
    Posts
    7,725

    Re: std::string and contiguous memory?

    Quote Originally Posted by sszd View Post
    Thank you for the reply. However, I was actually looking for quotes from the standards previous to C++11 since we are using older GNU compilers and have no intention of upgrading any time soon. Since that's the case, would you still recommend using &s[0]?
    From the 2003 standard:

    23.2.4 Class template vector

    [1] A vector is a kind of sequence that supports random access iterators. In addition, it supports (amortized) constant time insert and erase operations at the end; insert and erase in the middle take linear time. Storage management is handled automatically, though hints can be given to improve efficiency. The elements of a vector are stored contiguously, meaning that if v is a vector<T, Allocator> where T is some type other than bool, then it obeys the identity &v[n] == &v[0] + n for all 0 <= n < v.size().
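
    The identity in that paragraph can be spelled out in code. The sketch below uses made-up helper names (obeys_contiguity_identity, writable_buffer) and builds as C++03. It is an illustration of what the paragraph promises, not something production code needs to run.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Checks &v[n] == &v[0] + n for every valid n, as the quoted text requires.
// Not usable with std::vector<bool>, which the standard excludes.
template <typename T>
bool obeys_contiguity_identity(const std::vector<T>& v)
{
    for (std::size_t n = 0; n < v.size(); ++n)
        if (&v[n] != &v[0] + n)
            return false;
    return true;
}

// Because of that identity, &v[0] can be handed to a C function as a
// pointer to v.size() writable characters. The vector must not be empty.
char* writable_buffer(std::vector<char>& v)
{
    assert(!v.empty());
    return &v[0];
}
```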

  14. #14
    Join Date
    Jun 2009
    Location
    France
    Posts
    2,513

    Re: std::string and contiguous memory?

    Quote Originally Posted by laserlight View Post
    Hmm... could you quote that part? I can't seem to find it among what sszd wrote, and I looked over both posts carefully. Twice
    My bad, I was reading laserlight's post.

  15. #15
    Join Date
    Feb 2005
    Location
    Denver
    Posts
    353

    Re: std::string and contiguous memory?

    Quote Originally Posted by laserlight
    Since you are compiling using a known set of compilers, you could just check if the given standard library implementations store the contents of std::string contiguously. If one of them doesn't, or if you cannot determine this, then I would not recommend using &s[0] as it is better to be safe than sorry.
    Well, unless we were using a compiler that supports C++11, I am reluctant to use that construct regardless. And yes, I agree it's better to be safe than sorry. That's why I moved the contents into a vector in the first place.
    Quote Originally Posted by laserlight
    If this is due to a legacy interface that is not const-correct, and it is documented that the contents of the array are not modified through that pointer to non-const char, then using data() and casting away const-ness is fine.
    It is not documented anywhere in the library that the contents are or are not modified. And again, it's better to be safe than sorry. One question though, is data() guaranteed to return a null-terminated string, or is that also implementation dependent? If not, this could also potentially cause a core dump by the library call.
    Quote Originally Posted by D_Drmmr
    If the memory pointed to by the (const-casted) pointer is not modified by the function, all is fine.
    Yes, but we don't know that. And all would be fine only if data() returned a null terminated string.
    Quote Originally Posted by Paul McKenzie
    What's the signature of the called function? If it is not const char*, then why isn't it const char*?
    Actually I mentioned that the signature of the COTS library requires a non-const pointer to char. What I should have said is a pointer to a non-const char, or more appropriately, a non-const pointer to a non-const char (i.e. char*), sorry. Anyway, I have no idea why the library calls were written that way. It is an extremely old COTS product that we are simply having to deal with. Oh, and thanks for the Scott Meyers quote!
    Quote Originally Posted by Philip Nicoletti
    From the 2003 standard:
    Thanks for the quote.
    ==========================================================
    So, after all of this, I have a few more questions.
    Does anyone have a quote from the standard that indicates std::string is implementation dependent and is not guaranteed to be in contiguous memory (prior to the C++11 standard of course)?
    Does data() return a char const* to memory that is independent of the memory where the actual string is stored, or is this also implementation dependent?
    Is the location of the data that data() returns guaranteed to be null terminated?
    The answers to these would indicate to me whether or not it’s safe to cast the const-ness away from the return from data(), regardless of whether or not the function is modifying the contents. But I usually steer away from dangerous (IMHO) stuff like this anyway and simply try to play it safe. That’s why I used a vector. My co-worker is very persistent though, so I just needed some proof.
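
    On the question of whether data() points at the string's own storage: as far as I know, C++03 leaves that to the implementation and only requires a terminator from c_str(), while C++11 ties both data() and c_str() to operator[]. A small probe like the one below (data_aliases_storage is a made-up name) can show what a particular library does. One passing run is an observation about that library, not a guarantee.

```cpp
#include <cassert>
#include <string>

// Reports whether data() points at the same memory as operator[]
// on this implementation. Requires a non-empty string.
bool data_aliases_storage(std::string& s)
{
    assert(!s.empty());
    const char* through_index = &s[0];    // non-const access first
    const char* through_data = s.data();
    return through_index == through_data;
}
```

    If the probe returns false on some library, casting away const on data() would give a pointer to a separate array, and writes through it would not reach the string even setting aside the undefined behavior.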
