CodeGuru Home VC++ / MFC / C++ .NET / C# Visual Basic VB Forums Developer.com
Results 1 to 8 of 8
  1. #1
    Join Date
    Jan 2012
    Posts
    2

    Memory corruption with std::map and malloc

    I'm having a problem with corrupted memory. The memory has previously been allocated by malloc() in order to hold an array of struct pollfds. A pointer to this array is stored in a member called 'poll_fds'. There is another member called 'buffers', which is an STL map container. This member is intended to store a number of strings, which are used as buffers, and is keyed on an int, which is the file descriptor from which input is being read. When I attempt to insert a new element into the map, it appears that the memory allocated by malloc() and used to store the struct pollfds is overwritten by at least some of the data (the key) that is inserted into the map.

    I'll give some quick background to this before providing a code snippet and output which demonstrates my problem. This module is part of a piece of software which listens on a network port for an incoming connection (usually only one at a time, but potentially more), accepts the connection and then reads lines of data (call logging records from a telephone system). This data is passed back from the IO_Net module into main(), and then another module (IO_DBase) is used to store this data in a database. In the course of debugging this issue I have cut out the majority of the code, leaving me with only the essential pieces that I need to isolate/demonstrate the issue (what I'm trying to say is that the complete code does properly handle reading data, closing connections, etc., but this has been cut as it is not required to demonstrate the problem...).

    Here's a snippet of code from the accept_conn() method, which accepts the incoming connection, adds a buffer for the new connection, and adds the fd to the array of fds being polled. This particular snippet just highlights the line where I see the corruption, with the surrounding debugging output code:
    Code:
        std::cerr << "pollfds during accept1:";
        // Check that the supplied fd does not already exist in our array
        for (poll_fd = this->poll_fds;
             poll_fd < (this->poll_fds + (this->poll_nfds * sizeof(struct pollfd)));
             poll_fd += sizeof(struct pollfd)) {
                std::cerr << " " << poll_fd->fd << " (";
                fprintf(stderr, "%p", poll_fd);
                std::cerr << ")";
        }
        std::cerr << std::endl;
    
        (this->buffers).insert(std::pair<int, std::string>(new_fd, ""));
    
        std::cerr << "pollfds during accept2:";
        // Check that the supplied fd does not already exist in our array
        for (poll_fd = this->poll_fds;
             poll_fd < (this->poll_fds + (this->poll_nfds * sizeof(struct pollfd)));
             poll_fd += sizeof(struct pollfd)) {
                std::cerr << " " << poll_fd->fd << " (";
                fprintf(stderr, "%p", poll_fd);
                std::cerr << ")";
        }
        std::cerr << std::endl;
    I'll attach a tarball of the source files and Makefile that I'm using. To reproduce, run loggerd, and then start a netcat session to localhost on port 5000 ('nc localhost 5000'). While this first connection is active, run a second netcat session also to localhost on port 5000. Note that on some runs get_callrec() loops infinitely after the first connection is made, presumably because data arrives on the connection that we do not read in this cut-down version of the code. At most a couple of runs are usually required before the problem can be reproduced.

    Here's the output that I get:
    Code:
    pollfds before:
    pollfds after: 3 (0xa4b060)
    polling... polled
    pollfds during accept1: 3 (0xa4b060)
    pollfds during accept2: 3 (0xa4b060)
    pollfds before: 3 (0xa4b060)
    pollfds after: 3 (0xa4b0c0) 4 (0xa4b100)
    polling... polled
    pollfds during accept1: 3 (0xa4b0c0) 4 (0xa4b100)
    pollfds during accept2: 3 (0xa4b0c0) 5 (0xa4b100)
    pollfds before: 3 (0xa4b0c0) 5 (0xa4b100)
    terminate called after throwing an instance of 'std::ios_base::failure'
      what():  Attempt to add already existing fd (5) to the poll_fds array
    Aborted
    The first occurrence of 'pollfds before' and 'pollfds after' is the call to pollfd_add() which adds the listener, the second occurrence is when the first connection is added, and on the third call to pollfd_add() an exception is raised because the fd already exists in the poll_fds array (as a result of memory corruption). Note the change in the poll_fds array between 'pollfds during accept1' and 'pollfds during accept2', just before the second fd is added (i.e. when the second concurrent connection is made). The only line of code between these output statements is the line which inserts a new element into the buffers map.

    I'm sure this is probably just something silly that I've done, but I can't see what it is. Any help would be most appreciated.
    Attached Files Attached Files

  2. #2
    VictorN's Avatar
    VictorN is offline Super Moderator Power Poster
    Join Date
    Jan 2003
    Location
    Hanover Germany
    Posts
    20,396

    Re: Memory corruption with std::map and malloc

    Don't use malloc/free in C++ code.
    If you really need to allocate something in the heap - use new/delete (or new[] / delete[]) instead.
    Victor Nijegorodov

  3. #3
    Join Date
    Oct 2008
    Posts
    1,456

    Re: Memory corruption with std::map and malloc

    this is strange: you write poll_fd = this->poll_fds; poll_fd < (this->poll_fds + (this->poll_nfds * sizeof(struct pollfd))); poll_fd += sizeof(struct pollfd)) as if poll_fd were a pointer to byte, but then you write poll_fd->fd as if it were a pointer to some struct ( pollfd? ) ... is this your intent ?

  4. #4
    Join Date
    Apr 1999
    Posts
    27,449

    Re: Memory corruption with std::map and malloc

    Quote Originally Posted by ChrisReeves View Post
    I'm having a problem with corrupted memory. The memory has previously been allocated by malloc() in order to hold an array of struct pollfds.
    Code:
    struct pollfd* poll_fds;
    //...
    sizeof(struct pollfd))
    In C++, there is no need to constantly state "struct" xxxx. The "struct" keyword is not necessary in the cases above.
    Code:
    pollfd* poll_fds;
    //...
    sizeof( pollfd );
    That is all you need to do.

    Regards,

    Paul McKenzie
    Last edited by Paul McKenzie; January 25th, 2012 at 10:29 AM.

  5. #5
    Join Date
    Apr 1999
    Posts
    27,449

    Re: Memory corruption with std::map and malloc

    Quote Originally Posted by ChrisReeves View Post
    I'm having a problem with corrupted memory. The memory has previously been allocated by malloc() in order to hold an array of struct pollfds. A pointer to this array is stored in a member called 'poll_fds'.
    All of that could be simplified by using std::vector.
    Code:
    #include <vector>
    
    class IO_Net {
      private:
        int listen_fd;                      // fd of the listener
    
        std::map<int, std::string> buffers; // buffers for current data connections
    
        std::vector<pollfd> poll_fds;            // structures for polling all fds
    
      public:
        IO_Net(void);
        ~IO_Net(void) {};
    
      private:
        IO_Net(const IO_Net&);
        IO_Net& operator=(const IO_Net&);
    
      public:
        void connect(void);
    
        std::string get_callrec(void);
    
      private:
        void accept_conn(void);
    
        void pollfd_add(int);
        short pollfd_getEvents(int);
    };
    There is now no need to call malloc() and free(), and no need to have a separate variable to hold the count since a vector knows its own size by using the vector::size() member function. Having a separate variable just makes maintenance more difficult, and bugs can easily creep into such a program if you forget to update that variable somewhere. The vector class has push_back(), resize(), and insert() functions to dynamically add elements to itself, so there is no need for complicated malloc() and realloc() calls.

    Anytime you have a C++ program, and you're using new[]/delete[] or malloc()/free() to create dynamic arrays, the std::vector class is practically a drop-in replacement for those operations. A vector is a wrapper for a contiguous array, no different if you were to use malloc() or new[] to create this array. The big difference being that vector is safer, easier to use, and knows how to clean up itself on destruction.

    Another example: Here is that complicated looking code you posted:
    Code:
    for (poll_fd = this->poll_fds;
             poll_fd < (this->poll_fds + (this->poll_nfds * sizeof(struct pollfd)));
             poll_fd += sizeof(struct pollfd)) {
                std::cerr << " " << poll_fd->fd << " (";
                fprintf(stderr, "&#37;p", poll_fd);
                std::cerr << ")";
    Simplified using vector:
    Code:
    void PrintPoll( const pollfd& thePoll)
    {
          std::cerr << " " << thePoll.fd << "\n";
    }
    //...
    for_each(poll_fds.begin(), poll_fds.end(), PrintPoll );
    You practically don't need any code to debug anyway, since the issue of malloc() and free() not acting correctly has been eliminated from the equation. The for_each() is an algorithm function that goes through the container (found in the <algorithm> header).

    I suggest you change your class to use vector, and then fix the compiler errors (since a vector is not a pointer) and remove malloc() and free() calls to create pollfd objects. With my experience, this takes little time to change, regardless of the code size. Once you do that, then your code will become safer and less of a headache to deal with.

    And also, if you are passing pollfd pointers to other functions, then the vector is compatible with those calls also. All you need to do is pass the address of the first element of the vector to be compatible with those functions.
    Code:
    void SomeFunction( pollfds* pPolls, int numPolls)
    {
    //...
    }
    
    std::vector<pollfds> SomeVector;
    SomeVector.push_back(pollfds());  // create a pollfds and place in vector
    
    // now call function
    SomeFunction( &SomeVector[0], SomeVector.size() );
    Since you are now using C++, now is a good time to really start using it, replacing the old 'C' code and using the much safer C++ constructs.

    Regards,

    Paul McKenzie
    Last edited by Paul McKenzie; January 25th, 2012 at 11:10 AM.

  6. #6
    Join Date
    Jan 2012
    Posts
    2

    Re: Memory corruption with std::map and malloc

    Quote Originally Posted by VictorN
    Don't use malloc/free in C++ code.
    If you really need to allocate something in the heap - use new/delete (or new[] / delete[]) instead.
    Thanks. Always good advice, however as you can see from my (commented out) code, I tried that and it made no difference to this particular issue.

    Quote Originally Posted by superbonzo
    this is strange: you write poll_fd = this->poll_fds; poll_fd < (this->poll_fds + (this->poll_nfds * sizeof(struct pollfd))); poll_fd += sizeof(struct pollfd)) as if poll_fd were a pointer to byte, but then you write poll_fd->fd as if it were a pointer to some struct ( pollfd? ) ... is this your intent ?
    No, and I think this is where the error has crept in. I've malloc()ed the correct amount of memory, but I'm (consistently) operating on memory outwith that range - memory which is later allocated to the 'buffer' map. Fixing this would probably have resolved the issue.

    Quote Originally Posted by Paul McKenzie
    In C++, there is no need to constantly state "struct" xxxx. The "struct" keyword is not necessary in the cases above.
    Thanks for this. In the course of debugging the issue I have tried to be as verbose as possible. Removing the struct keyword makes no difference to this particular issue.

    Quote Originally Posted by Paul McKenzie
    There is now no need to call malloc() and free(), and no need to have a separate variable to hold the count since a vector knows its own size by using the vector::size() member function. Having a separate variable just makes maintenance more difficult, and bugs can easily creep into such a program if you forget to update that variable somewhere. The vector class has push_back(), resize(), and insert() functions to dynamically add elements to itself, so there is no need for complicated malloc() and realloc() calls.
    Indeed, I use vectors for other members of the class and they make life much easier when dealing with arrays.

    Quote Originally Posted by Paul McKenzie
    Anytime you have a C++ program, and you're using new[]/delete[] or malloc()/free() to create dynamic arrays, the std::vector class is practically a drop-in replacement for those operations. A vector is a wrapper for a contiguous array, no different if you were to use malloc() or new[] to create this array. The big difference being that vector is safer, easier to use, and knows how to clean up itself on destruction.

    You practically don't need any code to debug anyway, since the issue of malloc() and free() not acting correctly has been eliminated from the equation. The for_each() is an algorithm function that goes through the container (found in the <algorithm> header).

    I suggest you change your class to use vector, and then fix the compiler errors (since a vector is not a pointer) and remove malloc() and free() calls to create pollfd objects. With my experience, this takes little time to change, regardless of the code size. Once you do that, then your code will become safer and less of a headache to deal with.

    And also, if you are passing pollfd pointers to other functions, then the vector is compatible with those calls also. All you need to do is pass the address of the first element of the vector to be compatible with those functions.
    And I think this last paragraph is the critical point here. I had originally thought of using vectors to store my pollfds, however there is the requirement to pass a pointer to the array in my call to poll(). I didn't realise that passing the address of the first element of the vector would be equivalent (i.e. that vector used a c-style array 'internally'). I've just tested my code after converting poll_fds to a vector instead of a pointer to an array and the issue seems to have been resolved.

    Can you just confirm that this behaviour conforms to the spec, and isn't just an artefact of the way vector was implemented? It sounds like it's probably part of the spec, but I just wanted to check as I wasn't aware of it until now and it could be very useful.

    Thanks to all for your help, it is much appreciated!

  7. #7
    Join Date
    Apr 1999
    Posts
    27,449

    Re: Memory corruption with std::map and malloc

    Quote Originally Posted by ChrisReeves View Post
    Can you just confirm that this behaviour conforms to the spec, and isn't just an artefact of the way vector was implemented? It sounds like it's probably part of the spec, but I just wanted to check as I wasn't aware of it until now and it could be very useful.
    Vectors are guaranteed to use contiguous memory, else they are not vectors, so there is nothing to worry about. It's the only STL container that is guaranteed to store its contents in contiguos memory, unlike a std:eque or std::list.

    One of Scott Meyer's books mentions this and the usage of vector with legacy functions requiring a pointer.

    The only thing you have to watch out for is if the vector is empty. Since there is no 0 element if the vector is empty, you can't readily use the address of something that doesn't exist. So always check for vector::empty() if in your program, there can be cases where there is no data in the vector.

    Regards,

    Paul McKenzie

  8. #8
    Join Date
    Aug 2002
    Location
    Madrid
    Posts
    4,588

    Re: Memory corruption with std::map and malloc

    I think you are probably having issues with re-entrant code. If I understand your design correctly, multiple connections can be active at any time, so your program is running listeners concurrently. If that is the case, then your caches have to be accessed in a proper sequence with Locks and semaphores etc. The STL containers are not thread-safe by the standard, and anyways, your locking mechanism should depend on your high-level access functions of the cache.
    Get this small utility to do basic syntax highlighting in vBulletin forums (like Codeguru) easily.
    Supports C++ and VB out of the box, but can be configured for other languages.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  





Click Here to Expand Forum to Full Width

Featured