Results 1 to 15 of 21
  1. #1
    Join Date
    Sep 2011
    Posts
    13

    Segmentation Fault ??

    Hi, I have a very weird fatal error that happens very rarely in my application (compiled with gcc on Ubuntu Linux).

    It seems that this function:

    Code:
    bool GetConnected( ) {return m_Connected;}
    causes a segmentation fault sometimes (not every time).

    No, the instance of the class is not deleted when I call that function.
    I tried declaring m_Connected as volatile bool m_Connected, but the error still happens.
    Several different threads also access the function that calls this function.

    Code:
    bool CPDC :: CBoolExpr(Clients *client)
    {
    	if(!client)
    	{
    		return false;
    	}
    	
    	if(client->m_Deleting) // before we delete our instance of Clients* we set this flag. the instance is deleted 240 seconds later.
    	{
    		
    		return false;
    	}
    	
    	if( !client->m_Socket )
    	{
    	    return false;
    	}
    	
    	cout << "CBE4" << endl;
    	if( !client->m_Socket->GetConnected() )
    	{
    		return false;
    	}
    
    	cout << "CBE5" << endl;
    	if(client->m_Socket->HasError() )
    	{
    		return false;
    	}
    
    	cout << "CBEE" << endl;
    	return true;
    }
    It always crashes right after it outputs CBE4.
    To make sure that another thread doesn't crash the application, I also tried creating the other threads at completely different times. It crashes every time on the same line, although that makes no sense at all.

    Did I miss anything?

  2. #2
    Join Date
    Apr 1999
    Posts
    27,449

    Re: Segmentation Fault ??

    Quote Originally Posted by JacobNax View Post
    Hi, I have a very weird fatal error that happens very rarely in my application (compiled with gcc on Ubuntu Linux).

    It seems that this function:

    Code:
    bool GetConnected( ) {return m_Connected;}
    causes a segmentation fault sometimes (not every time).
    There is nothing weird about these errors. When you corrupt memory, use pointers incorrectly, or make other errors that cause memory corruption, the behaviour of the application is undefined. It could work all the time, fail sometimes, or work on one machine and fail on another. So nothing is "weird".
    No the instance of the class is not deleted when i call that function.
    Great, but how can we, who know nothing about the rest of your code, the state of your program at the time of the crash, or what really happened before the problem occurred, confirm this? Too many times, we get people swearing that they are doing this or that, and when it comes time to actually see their code in action, they are not doing what they say they are doing.
    I tried declaring m_Connected as volatile bool m_Connected but still the error happens.
    several different threads also access the function that calls this function.
    So do you use proper synchronization in this multithreaded program? The usage of "volatile" is not a substitute for proper thread synchronization (mutexes, semaphores, etc.).
    to make sure that another thread doesnt crash the application i also tried creating the other threads in completely different timings. It crashes every time on the same line.
    First, does your program run correctly using one thread? If not, then you need to correct that first.

    After that, use proper synchronization objects in your code. Trying to outguess when to create a thread by using "different timings" (whatever that is), or using "volatile" as a poor man's synchronization object isn't going to get the job done.
    Although it makes no sense at all
    If this is indeed a threading problem, and you are not familiar with what parts of the code are not thread safe, then anything can happen that may not seem to make sense. Race conditions, re-entrancy issues, etc. are all things that a programmer must be aware of when writing or maintaining multithreaded applications, and little to no experience with these issues will turn a seemingly OK-looking multithreaded program into one that will fail.

    Regards,

    Paul McKenzie
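
    A minimal sketch of the point about volatile, assuming a C++11 compiler is available (the thread itself mentions boost, which offers similar facilities). ConnectionFlag and its member functions are hypothetical names, not the poster's socket class. The flag is a std::atomic<bool>, so concurrent reads and writes of the flag itself are well defined. It does nothing about the lifetime of the object that holds it.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the socket class; only the flag is modelled.
class ConnectionFlag
{
public:
    ConnectionFlag() : m_Connected(false) {}

    // Called by whichever thread changes the connection state.
    void SetConnected(bool connected)
    {
        m_Connected.store(connected, std::memory_order_release);
    }

    // Well defined from any thread, as long as the object is still alive.
    bool GetConnected() const
    {
        return m_Connected.load(std::memory_order_acquire);
    }

private:
    std::atomic<bool> m_Connected;
};

// Single-threaded check of the interface.
inline bool ConnectionFlagRoundTrip()
{
    ConnectionFlag flag;
    assert(!flag.GetConnected());
    flag.SetConnected(true);
    return flag.GetConnected();
}
```

    An atomic flag only removes the data race on the flag. If another thread can destroy the owning object, a mutex or shared ownership is still needed.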

  3. #3
    Join Date
    Sep 2011
    Posts
    13

    Re: Segmentation Fault ??

    I use multiple threads for mysql operations such as inserting rows, updating rows, deleting rows etc.

    i am aware of synchronization methods such as mutexes. i use boost::thread mutexes.
    but i'm keeping my code clean of object sharing because mutexes with mysql can lead to really bad performance (tested before)

    so instead i just use volatile booleans to check the state of something which worked just fine in most of my applications.


    again i'm not some expert in c++ , i taught myself over the past 2 years and it's my first programming language, but i have experimented a long time with this stuff and i never state anything if i haven't experienced it firsthand.
    the application fails at some point when i compile it in gcc (linux) but works just fine with the VC compiler on windows (based on a 24 hour runtime) so debugging all this has become a real pain in the arse as i have to wait at least 24 hours for it to ''crash'' with that segmentation fault.

    i will try to make the function non-virtual and run it again tonight for another 24 hours...
    just wondered if anyone experienced the same annoying problem.

  4. #4
    Join Date
    Apr 1999
    Posts
    27,449

    Re: Segmentation Fault ??

    Quote Originally Posted by JacobNax View Post
    the application fails at some point when i compile it in gcc (linux) but works just fine with the VC compiler on windows (based on a 24 hour runtime) so debugging all this has become a real pain in the arse as i have to wait at least 24 hours for it to ''crash'' with that segmentation fault.
    That is the issue with multithreaded programs. If you have a threading issue, it may never come up when you compile with one compiler as opposed to another. You can't even predict whether the error will show up if you change compiler settings in Visual C++. As a matter of fact, even if it isn't a threading issue, you shouldn't be fooled into thinking your program is OK if it runs when built with one compiler but fails when built with another. The compiler where it is not working is giving you the red flag that your program has a bug and you need to fix it.

    And as I stated earlier, any corruption bug leads to undefined behaviour. This means the application may seem to work, but as soon as you change compiler options, or add/remove code, or use another version or brand of compiler, or you run the program on a different machine, all sorts of "weird" bugs could crop up at runtime (I put "weird" in quotes, since as I stated, there are no weird bugs in C++, unless the compiler itself is producing broken code, and the possibility of that happening is fairly remote).
    just wondered if anyone experienced the same annoying problem.
    Yes, it's called "debugging multithreaded programs". Again, any issue with threading will occur at random times, times that you cannot predict. Every programmer who has worked with multithreaded programs has encountered this issue.
    I use multiple threads for mysql operations such as inserting rows, updating rows, deleting rows etc. i am aware of synchronization methods such as mutexes. i use boost::thread mutexes.
    but i'm keeping my code clean of object sharing because mutexes with mysql can lead to really bad performance (tested before)
    You never mentioned anything about race conditions, reentrancy, etc. Just because you used synchronization objects doesn't mean you used them correctly.

    Proper usage of synchronization objects, and the usage of the correct synchronization objects is not a trivial topic. When you write the code, you have to identify before even running the program where the potential problems may exist.

    Just to let you know, in the industry, if a company uses multithreading heavily, they will not hire a programmer if they have little to no experience in multithreaded applications, regardless of how much C++ they've done, 2 years or 20 years. A 24-hour wait time before a crash is generated would be considered unacceptable -- you need to identify in the code all of the issues that could occur, and then write the proper code to overcome these issues. If you missed anything, then you go through the code again and attempt to see what parts of the code are using the corrupted variable(s) and identify what could cause the corruption.

    Regards,

    Paul McKenzie
    Last edited by Paul McKenzie; September 24th, 2011 at 04:38 PM.
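
    One kind of race worth looking for is "check one flag, then check another" while a second thread changes both in between. A hedged sketch, assuming C++11 and hypothetical names (SocketStatus, GuardedSocketStatus): both flags are updated and read under one mutex, so a reader always sees a consistent pair.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical pair of flags that only make sense when read together.
struct SocketStatus
{
    bool connected;
    bool hasError;
};

class GuardedSocketStatus
{
public:
    GuardedSocketStatus() : m_Status{false, false} {}

    void MarkConnected()
    {
        std::lock_guard<std::mutex> lock(m_Mutex);
        m_Status.connected = true;
        m_Status.hasError = false;
    }

    void MarkFailed()
    {
        std::lock_guard<std::mutex> lock(m_Mutex);
        m_Status.connected = false;
        m_Status.hasError = true;
    }

    // Returns a copy taken while holding the lock, so the two flags
    // always belong to the same update.
    SocketStatus Snapshot() const
    {
        std::lock_guard<std::mutex> lock(m_Mutex);
        return m_Status;
    }

    bool Usable() const
    {
        const SocketStatus status = Snapshot();
        assert(!(status.connected && status.hasError)); // invariant kept by the setters
        return status.connected && !status.hasError;
    }

private:
    mutable std::mutex m_Mutex;
    SocketStatus m_Status;
};
```

    The lock is held only for the copy, which keeps contention low. Whether that is cheap enough next to the database work mentioned above would have to be measured.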

  5. #5
    Join Date
    Apr 1999
    Posts
    27,449

    Re: Segmentation Fault ??

    Quote Originally Posted by JacobNax View Post
    so instead i just use volatile booleans to check the state of something which worked just fine in most of my applications.
    But now it doesn't work. You get what you pay for -- the usage of volatile is IMO a poor substitute for real multithreaded synchronization.

    Just this line alone is suspect:
    Code:
    if(client->m_Deleting) // before we delete our instance of Clients* we set this flag. the instance is deleted 240 seconds later.
    Reading that comment, it seems you're relying on specific timings to do certain things. That is not the way to write a multithreaded program. What is 240 seconds on one machine may be 2 seconds on another.

    You cannot predict anything in terms of timing -- you have to write code that works, regardless of whether it's 240 seconds or 0.2 seconds later (or before). That requires you to look at your code, and determine if there is any potential for a race condition or re-entrancy issues.

    Regards,

    Paul McKenzie
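
    A sketch of one way to stop depending on a delay, assuming C++11 smart pointers are an option (boost has equivalents). The names Client, Server and ChatRoom are hypothetical and heavily cut down. The server list owns each client through std::shared_ptr. Other containers hold std::weak_ptr and must call lock() before use, so a destroyed client shows up as an empty pointer instead of a dangling one.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical, cut-down client.
struct Client
{
    explicit Client(int id) : m_Id(id) {}
    int m_Id;
};

// The server list owns the clients.
struct Server
{
    std::vector<std::shared_ptr<Client>> m_Clients;
};

// A chat room only observes clients; it never keeps them alive.
struct ChatRoom
{
    std::vector<std::weak_ptr<Client>> m_Users;

    // Drops users whose client has been destroyed and returns how many remain.
    std::size_t PruneAndCount()
    {
        std::vector<std::weak_ptr<Client>> alive;
        for (const std::weak_ptr<Client>& user : m_Users)
        {
            // lock() yields an empty shared_ptr once the client is gone.
            if (std::shared_ptr<Client> locked = user.lock())
            {
                assert(locked->m_Id > 0);
                alive.push_back(user);
            }
        }
        m_Users.swap(alive);
        return m_Users.size();
    }
};
```

    With this shape no timer is needed: the client is destroyed when the last owner lets go, and observers find out through lock() rather than by reading a flag inside the object.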

  6. #6
    Join Date
    Sep 2011
    Posts
    13

    Re: Segmentation Fault ??

    As you can see in the function, the instance of that class is never used, referred to, or dereferenced if the flag DeleteMe is set, because every time we want to see or change anything on that instance we call the CBoolExpr( ) function, which checks the flag.

    2 seconds / 240 seconds / 1200 seconds, it doesn't matter. It's just a precautionary way of safely freeing the memory from one thread out of several.
    I already solved the problem by making the function non-virtual and volatile.

    thanks for the feedback however, it helped me solve some other problems i had.
    knowledge once again is infinite, we keep on learning. best regards.

  7. #7
    John E
    Join Date
    Apr 2001
    Location
    Manchester, England
    Posts
    4,835

    Re: Segmentation Fault ??

    Quote Originally Posted by JacobNax View Post
    the application fails at some point when i compile it in gcc (linux) but works just fine with the VC compiler
    That's quite interesting. If anything I'd expect it to be the other way around. I use VC++ as well as gcc and one thing I've noticed about VC++ is that it will almost never allow you to access memory you've deleted. This is especially true for Debug builds. gcc unfortunately is a different kettle of fish. In my experience, it seems perfectly happy to let you carry on using deleted pointers until the memory gets re-used for something else - at which point you get a sudden and apparently inexplicable crash.

    As Paul described, corrupted pointers give you exactly this kind of problem. Your app works perfectly most of the time but crashes at random for no obvious reason. Or the app works fine on your machine but not on somebody else's.

    If you're confident that this isn't a synchronisation issue, the other most likely cause is that Clients.m_socket is getting overwritten by garbage. Is there a buffer in front of it that's too small? Maybe the buffer is 10 bytes but somewhere, you're writing 11 bytes of data into it. Because of structure packing, this is the kind of problem that might show up in one compiler but not the other.
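
    A small sketch of why layout matters here, using a hypothetical struct (Record is not from the poster's code). It only measures where the member after the buffer starts. Writing an 11th byte into a 10-byte buffer is undefined behaviour either way, but with padding after the buffer the stray byte may land in unused space, while with tighter packing it lands in the first byte of the next member.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical layout: a 10-byte buffer followed by a pointer member.
struct Record
{
    char  m_Name[10];
    void* m_Socket;
};

// Number of padding bytes the compiler placed between the end of m_Name
// and the start of m_Socket. The value depends on compiler and ABI.
inline std::size_t PaddingAfterName()
{
    assert(offsetof(Record, m_Socket) >= sizeof(Record::m_Name));
    return offsetof(Record, m_Socket) - sizeof(Record::m_Name);
}
```

    Printing PaddingAfterName() under each compiler and set of packing options shows whether a one-byte overrun would be absorbed by padding or would hit the pointer directly.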

    Try building the gcc version with the -mms-bitfields flag. This forces your gcc app to use the same structure packing as VC++. It's not a fix but if that causes the problem to go away, it might indicate a data member getting overwritten somehow.

    Of course if you had access to a decent debugger you could track this problem down in a few minutes. Sadly, you're stuck with gdb.
    "A problem well stated is a problem half solved.” - Charles F. Kettering

  8. #8
    Join Date
    Jul 2005
    Location
    Netherlands
    Posts
    2,042

    Re: Segmentation Fault ??

    Quote Originally Posted by JacobNax View Post
    Like you can see in the function, the instance of that class is never ever used/referred/dereferred if the flag DeleteMe is set because every time we wanna see or change data or anything on that instance we call the CBoolExpr( ) function that checks the flag.
    And what happens if that function is called after the object has been deleted? What do you think the value of m_Deleting will be?

    Regardless of when you actually call delete, at the moment you decide that an object should be destroyed, you should consider the object to be invalid. That means no part of your code should access that object. You cannot check this inside the object, because it is invalid. At best, you could use the m_Deleting flag as a debugging tool, i.e. you assert that it is false. However, the only way to assure that assertion would work properly is to prevent any CPDC from being deleted (for debugging purposes only).
    Cheers, D Drmmr
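
    A sketch of keeping the "is it still valid?" question outside the object, assuming C++11. ClientRegistry and the integer handles are hypothetical, not part of the posted code. Callers keep an id rather than a raw pointer and ask the registry for the object each time. A removed client simply cannot be found.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <unordered_map>

// Hypothetical, cut-down client.
struct Client
{
    explicit Client(int id) : m_Id(id) {}
    int m_Id;
};

class ClientRegistry
{
public:
    ClientRegistry() : m_NextId(1) {}

    // Creates a client and returns the handle that refers to it.
    int Add()
    {
        std::lock_guard<std::mutex> lock(m_Mutex);
        const int id = m_NextId++;
        m_Clients[id] = std::make_shared<Client>(id);
        return id;
    }

    // After this call the handle no longer resolves to anything.
    void Remove(int id)
    {
        std::lock_guard<std::mutex> lock(m_Mutex);
        m_Clients.erase(id);
    }

    // Returns an empty pointer for unknown or removed handles.
    std::shared_ptr<Client> Find(int id) const
    {
        std::lock_guard<std::mutex> lock(m_Mutex);
        std::unordered_map<int, std::shared_ptr<Client>>::const_iterator it = m_Clients.find(id);
        if (it == m_Clients.end())
            return std::shared_ptr<Client>();
        assert(it->second->m_Id == id);
        return it->second;
    }

private:
    mutable std::mutex m_Mutex;
    std::unordered_map<int, std::shared_ptr<Client>> m_Clients;
    int m_NextId;
};
```

    The validity check reads the registry, which is always alive, instead of a member of an object that may already be gone.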

    Please put [code][/code] tags around your code to preserve indentation and make it more readable.

    As long as man ascribes to himself what is merely a possibility, he will not work for the attainment of it. - P. D. Ouspensky

  9. #9
    Join Date
    Apr 1999
    Posts
    27,449

    Re: Segmentation Fault ??

    Quote Originally Posted by JacobNax View Post
    Like you can see in the function, the instance of that class is never ever used/referred/dereferred if the flag DeleteMe is set because every time we wanna see or change data or anything on that instance we call the CBoolExpr( ) function that checks the flag.
    So are JohnE and D_Drmmr correct in that you're deleting the instance, and then checking a member of this deleted instance for a value? If that's the case, then that is undefined behaviour.

    When you delete an instance of an object, you cannot use that object any longer.
    Code:
    #include <iostream>
    
    class foo
    {
        public:
              bool m_bDeleting;
    };
    
    int main()
    {
        foo *pFoo = new foo;
        pFoo->m_bDeleting = true;
        delete pFoo;
        std::cout << pFoo->m_bDeleting;  // behaviour is undefined
    }
    That last line of code, where the m_bDeleting flag is being checked, has undefined behaviour. You cannot point to a deleted object and assume you can use its remains for anything.

    Regards,

    Paul McKenzie

  10. #10
    Join Date
    Sep 2011
    Posts
    13

    Re: Segmentation Fault ??

    i think you didn't pay attention at all at the code.

    I do not use the object instance without checking, first, that the object instance is not null:

    Code:
    if( !foo )
        return false;
    which works 100%

    Second: if it is indeed not null, that the object's flag for collection is not set:

    Code:
    if( foo->DeleteMe )
        return false;
    The only way the object could be deleted right after the if( !foo ) line is if the function that called it took about 240.00001 seconds to check whether the object instance is null.
    If that happens, then there's some serious performance issue with my application. There's no way, even for a primitive CPU with primitive RAM, to take 240 seconds to find a memory location.


    Again, i solved the problem.
    object->GetConnected( ) was a virtual function so i made it non virtual and it works great now.
    been running the server a day and no problems so far.

    i think that there's some sort of conflict on multi-threaded applications and virtual functions... since these functions are resolved on runtime... Can't really tell but something similar is happening.

  11. #11
    Join Date
    Apr 1999
    Posts
    27,449

    Re: Segmentation Fault ??

    Quote Originally Posted by JacobNax View Post
    i think you didn't pay attention at all at the code.
    The problem is that you didn't post enough code, so we have to guess what you're doing. All we see is the code you posted. There is no context whatsoever in the posted code as to what you have done or what you are about to do.
    i do not use the object instance without checking if First : object instance not null
    And what makes this object NULL? Do you set it to NULL? I don't see that anywhere.

    Regards,

    Paul McKenzie
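
    A sketch of why the question matters. Setting one pointer to NULL changes only that one pointer variable. Every other copy of the address keeps its old value, so a null check on a copy proves nothing. The names here are hypothetical, and the object is deliberately not deleted in the sketch so that inspecting the pointers stays well defined.

```cpp
#include <cassert>

// Hypothetical, cut-down client.
struct Client
{
    int m_Id;
};

// Two places that each stored the same address.
struct PointerCopies
{
    Client* listEntry;  // the copy that gets cleared
    Client* otherCopy;  // a copy held somewhere else
};

// Stores the address twice, then clears only one of the copies.
inline PointerCopies ClearOneCopy(Client* client)
{
    PointerCopies copies = { client, client };
    copies.listEntry = nullptr;
    assert(copies.otherCopy == client); // the second copy is untouched
    return copies;
}
```

    If the object had been deleted before listEntry was cleared, otherCopy would still pass an if( !ptr ) test while pointing at freed memory.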

  12. #12
    Join Date
    Apr 1999
    Posts
    27,449

    Re: Segmentation Fault ??

    Quote Originally Posted by JacobNax View Post
    if( !foo ) line is that the function that called it took about 240.00001 seconds to check if the object instance is null.
    and if that happens then there's some serious performance issue with my application.
    If it takes 240 seconds to check if a pointer is NULL, it means you have memory corruption bugs in your application, not a performance issue.
    Again, i solved the problem.
    object->GetConnected( ) was a virtual function so i made it non virtual and it works great now.
    This indicates to me that you've messed up the virtual mechanism by corrupting the v-table. There should not be an issue whether the function is virtual or not.
    been running the server a day and no problems so far.
    I'll be honest with you -- with all you've described, your application is still faulty, and all you did was to temporarily mask the faults by changing from virtual to non-virtual.
    i think that there's some sort of conflict on multi-threaded applications and virtual functions... since these functions are resolved on runtime... Can't really tell but something similar is happening.
    Virtual functions have nothing to do with threads. As I stated, it sounds as if your application has either corrupted the v-table or is using an invalid v-table.

    Regards,

    Paul McKenzie
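
    A sketch of why switching to non-virtual can hide the fault, based on how mainstream compilers typically implement virtual functions (the C++ standard does not mandate this). PlainSocket and VirtualSocket are hypothetical stand-ins. An object with virtual functions usually carries a hidden pointer to its v-table, and a virtual call reads that pointer first. If the object's memory has been freed and reused, that hidden pointer is garbage and the call itself crashes. A non-virtual getter only reads one byte, so it tends to return garbage quietly instead.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical socket without virtual functions.
struct PlainSocket
{
    PlainSocket() : m_Connected(false) {}
    bool GetConnected() const { return m_Connected; }
    bool m_Connected;
};

// The same data, but polymorphic.
struct VirtualSocket
{
    VirtualSocket() : m_Connected(false) {}
    virtual ~VirtualSocket() {}
    virtual bool GetConnected() const { return m_Connected; }
    bool m_Connected;
};

// Extra bytes per object in the polymorphic version. On common ABIs this
// is the hidden v-table pointer plus any padding it forces.
inline std::size_t HiddenBytesPerObject()
{
    assert(sizeof(VirtualSocket) >= sizeof(PlainSocket));
    return sizeof(VirtualSocket) - sizeof(PlainSocket);
}
```

    Either way the access to a dead object is undefined behaviour. The non-virtual version just fails less visibly.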

  13. #13
    John E
    Join Date
    Apr 2001
    Location
    Manchester, England
    Posts
    4,835

    Re: Segmentation Fault ??

    Quote Originally Posted by Paul McKenzie View Post
    Quote Originally Posted by JacobNax View Post
    i think that there's some sort of conflict on multi-threaded applications and virtual functions... since these functions are resolved on runtime... Can't really tell but something similar is happening.
    Virtual functions have nothing to do with threads. As I stated, it sounds as if your application has either corrupted the v-table or using an invalid v-table.
    To be frank, there's very little about gcc that would surprise me but this certainly would. Really Jacob, you need to listen to Paul. gcc has been around a long time. If it had a fundamental conflict between virtual functions and threads, someone would have noticed it by now.

    You said in your original post that the segmentation faults were very rare occurrences and yet you claim you've fixed the problem on the basis of just one test! This really is the worst kind of programming. There's only one way to fix bugs in your program and that's to analyse and understand them - not to make arbitrary changes which seem to make the problem go away. You'd have probably got the same result by compiling with -mms-bitfields as I suggested a few posts back but that wouldn't have been a solution either. It would just have been an indicator of what might be wrong.

    Don't take this criticism too harshly. Most of us go through a phase like this when we're inexperienced programmers. It feels a bit like alchemy - trying to understand the workings of nature when you don't have sufficient knowledge. But the answer is to gain the knowledge, not to carry on believing in alchemy.
    "A problem well stated is a problem half solved.” - Charles F. Kettering

  14. #14
    Join Date
    Apr 1999
    Posts
    27,449

    Re: Segmentation Fault ??

    Quote Originally Posted by John E View Post
    You said in your original post that the segmentation faults were very rare occurrences and yet you claim you've fixed the problem on the basis of just one test.! This really is the worst kind of programming. There's only one way to fix bugs in your program and that's to analyse and understand them - not to make arbitrary changes which seem to make the problem go away.
    To Jacob:

    What JohnE has described is the classic case in C++ of moving a bug around to another part of your code by making arbitrary changes that have nothing to do with the cause of the problem.

    In your case, you changed from virtual to non-virtual. In other cases, a simple adding or removing of a member variable to a class is enough to mask a bug and seemingly make a program "work". The problem is that you didn't fix the bug -- you just moved it to another part of the application. As soon as you change code again or run it on another computer, that bug is ready to crop up again.

    These problems are due to memory corruption. That corruption can be caused by

    1) Accessing arrays beyond the boundaries of the array.

    2) Mismanaging pointers and dynamically allocated memory.

    3) Illegal (but syntactically correct) C++ coding, i.e. using malloc() or memcpy() on non-POD types, returning pointers or references to local variables, etc.

    4) In rare cases, not compiling your entire application (you have object code that contains two different definitions of the same variables), causing a conflict when you run your program.

    5) Corruption due to thread issues.

    We have no idea whether the rest of your code does any of the above. The one that would be of real concern is 3), since an inexperienced C++ programmer would believe he/she is doing things correctly when they are actually wrong. The compiler gives no warnings or errors, so the programmer never realizes the error to begin with.

    Regards,

    Paul McKenzie
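
    For item 3), a sketch of turning one silent mistake into a compile error, assuming a C++11 standard library with <type_traits>. CopyBytes and Header are hypothetical names. The wrapper refuses to memcpy any type that is not trivially copyable, such as a class holding a std::string.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <type_traits>

// memcpy wrapper that only compiles for types where a byte copy is valid.
template <typename T>
void CopyBytes(T& destination, const T& source)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only valid for trivially copyable types");
    std::memcpy(&destination, &source, sizeof(T));
}

// Hypothetical plain-data header: safe to copy byte by byte.
struct Header
{
    int  m_Length;
    char m_Tag[4];
};

// std::string manages its own memory, so it does not qualify.
static_assert(!std::is_trivially_copyable<std::string>::value,
              "std::string must not be copied with memcpy");

// Usage: copies a Header and reports whether the copy matches.
inline bool CopyHeaderExample()
{
    Header source = { 42, { 'P', 'D', 'C', '\0' } };
    Header copy = { 0, { 0, 0, 0, 0 } };
    CopyBytes(copy, source);
    assert(copy.m_Length == source.m_Length);
    return std::strcmp(copy.m_Tag, source.m_Tag) == 0;
}
```

    Calling CopyBytes on two std::string objects would fail to compile, which is the warning the raw memcpy call never gives.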

  15. #15
    Join Date
    Sep 2011
    Posts
    13

    Re: Segmentation Fault ??

    Here are some clues about the life cycle of the Clients* object:

    Code:
    			CTCPSocket *NewSocket = pdcc->m_CTCPS->Accept( (fd_set *)&fd );
    
    			if( NewSocket )
    			{
    				NewSocket->NoDelay(true);
    				Clients *client = new Clients(NewSocket, pdcc);
    				client->m_Identified = false;
    				client->m_Chat = pdcc->m_DefaultChat;
    				CONSOLE_Print("[SERVER] New client has connected to the server [ " + NewSocket->GetIPString( ) + "].");
    				m_Clients.push_back(client);
    			}
    client connects to the server...

    Code:
    	for(vector<Clients*> :: iterator i = m_Clients.begin( ); i != m_Clients.end( );)
    	{
    		if( (*i)->m_Socket && (*i)->m_Socket->GetConnected( ) && !(*i)->m_Socket->HasError( ) )
    		{
    			(*i)->m_Socket->SetFD(&fd, &send_fd, &nfds);
    			nfds++;
    			i++;
    		}
    		else
    		{
    			cout << "A client connection was lost." << endl;
    			CleanUpUser((*i));
    			i = m_Clients.erase( i );
    		}
    	}
    client connection lost

    Code:
    void CPDC :: CleanUpUser(Clients *client)
    {
    	client->m_DeleteTime = GetTime();
    	client->m_Deleting = true;
    	m_ClientsDeleting.push_back(client);
    
    }
    deleting the instance safely later

    Code:
    	for(vector<Clients*> :: iterator i = m_ClientsDeleting.begin( ); i != m_ClientsDeleting.end( ); )
    	{
    		if( (*i) )
    		{
    			if( GetTime() - (*i)->m_DeleteTime >= 240 )
    			{
    				delete *i;
    				(*i) = NULL;
    				i = m_ClientsDeleting.erase( i );
    			}
    			else
    				i++;
    		}
    		else
    			i = m_ClientsDeleting.erase( i );
    	}
    The seg fault happens in the CBoolExpr( ) function that validates the client.
    I use that whenever a client sends a 'special' packet and when the client disconnects (we have chat channels that remove Clients if their delete flag is on).

    Code:
        for(vector<CChat*> :: iterator i = m_ChatRooms.begin( ); i != m_ChatRooms.end( );)
    	{
    		if(!(*i))
    			i = m_ChatRooms.erase( i );
    		else
    		{
    			for(vector<Clients*> :: iterator j = (*i)->m_Users.begin( ); j != (*i)->m_Users.end( );)
    			{
    				if( (*j) && !CBoolExpr( (*j) ) )
    				{
    					Event_UserLeft2( (*j), (*i) );
    					j = (*i)->m_Users.erase( j );
    				}
    				else
    					j++;
    			}
    			
    			i++;
    				
    		}
    	}

    the socket wrapper

    Code:
    class CSocket
    {
    protected:
    	SOCKET m_Socket;
    	struct sockaddr_in m_SIN;
    	bool m_HasError;
    	int m_Error;
    
    public:
    	CSocket( );
    	CSocket( SOCKET nSocket, struct sockaddr_in nSIN );
    	~CSocket( );
    
    	virtual BYTEARRAY GetPort( );
    	virtual BYTEARRAY GetIP( );
    	virtual string GetIPString( );
    	bool HasError( );
    	virtual int GetError( )							{ return m_Error; }
    	virtual string GetErrorString( );
    	virtual void SetFD( fd_set *fd, fd_set *send_fd, int *nfds );
    	virtual void Allocate( int type );
    	virtual void Reset( );
    };
    
    
    class CTCPSocket : public CSocket
    {
    protected:
    	bool m_Connected;
    
    private:
    	string m_RecvBuffer;
    	string m_SendBuffer;
    	uint32_t m_LastRecv;
    	uint32_t m_LastSend;
    
    public:
    	CTCPSocket( );
    	CTCPSocket( SOCKET nSocket, struct sockaddr_in nSIN );
    	virtual ~CTCPSocket( );
    
    	virtual void Reset( );
    	bool GetConnected( );
    	virtual string *GetBuffer( )					{ return &m_RecvBuffer; }
    	virtual void PutBytes( string bytes );
    	virtual void PutBytes( BYTEARRAY bytes );
    	virtual void ClearRecvBuffer( )				{ m_RecvBuffer.clear( ); }
    	virtual void ClearSendBuffer( )				{ m_SendBuffer.clear( ); }
    	virtual uint32_t GetLastRecv( )				{ return m_LastRecv; }
    	virtual uint32_t GetLastSend( )				{ return m_LastSend; }
    	virtual void DoRecv( fd_set *fd );
    	virtual void DoSend( fd_set *send_fd );
    	virtual void Disconnect( );
    	virtual void NoDelay( bool noDelay );
    };
    I don't use memcpy( ) or malloc( ),
    and I also don't use arrays because arrays are evil.

    Just memset:

    Code:
    CSocket :: CSocket( )
    {
    	m_Socket = INVALID_SOCKET;
    	memset( &m_SIN, 0, sizeof( m_SIN ) );
    	m_HasError = false;
    	m_Error = 0;
    }
    Last edited by JacobNax; September 26th, 2011 at 04:46 AM.
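
    A sketch of the deferred-deletion sweep above with explicit ownership, assuming C++11. Client here is a hypothetical cut-down type, and SweepExpired is not from the posted code. The list owns its clients through std::unique_ptr, so erasing an entry is what destroys the object, and there is no separate delete or NULL assignment to get out of step.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical, cut-down client: only the time it was marked for deletion.
struct Client
{
    explicit Client(uint32_t deleteTime) : m_DeleteTime(deleteTime) {}
    uint32_t m_DeleteTime;
};

typedef std::vector<std::unique_ptr<Client>> ClientList;

// Destroys every client whose grace period has expired and returns how
// many entries were removed. Assumes the list holds no empty pointers.
inline std::size_t SweepExpired(ClientList& deleting, uint32_t now, uint32_t graceSeconds)
{
    ClientList::iterator firstExpired = std::remove_if(
        deleting.begin(), deleting.end(),
        [now, graceSeconds](const std::unique_ptr<Client>& client)
        {
            assert(client);
            return now - client->m_DeleteTime >= graceSeconds;
        });

    const std::size_t removed = static_cast<std::size_t>(deleting.end() - firstExpired);
    deleting.erase(firstExpired, deleting.end()); // owned objects are destroyed here at the latest
    return removed;
}
```

    This only tidies the sweep itself. Any other container that still stores a raw Clients* to a swept object, such as a chat room's user list, would be left with a dangling pointer, so those would need to observe the object some other way.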
