Click to See Complete Forum and Search --> : Threads BLOCKED by sun.security.pkcs11.wrapper.PKCS11.C_CloseSession


abru
October 4th, 2010, 06:45 AM
Hi,

We are facing a specific issue where the most of the threads are being BLOCKED waiting for a lock to get released.

==================================================================================
INFO | jvm 1 | 2010/09/14 04:51:32 | "http-0.0.0.0-443-31" daemon prio=10 tid=0x030c3c00 nid=0x4062 waiting for monitor entry [0x0e91f000..0x0e91faf0]
INFO | jvm 1 | 2010/09/14 04:51:32 | java.lang.Thread.State: BLOCKED (on object monitor)
INFO | jvm 1 | 2010/09/14 04:51:32 | at java.security.SecureRandom.nextBytes(Unknown Source)
INFO | jvm 1 | 2010/09/14 04:51:32 | - waiting to lock <0x219d12b8> (a java.security.SecureRandom)
INFO | jvm 1 | 2010/09/14 04:51:32 | at com.sun.net.ssl.internal.ssl.RandomCookie.<init>(Unknown Source)
INFO | jvm 1 | 2010/09/14 04:51:32 | at com.sun.net.ssl.internal.ssl.ServerHandshaker.clientHello(Unknown Source)
INFO | jvm 1 | 2010/09/14 04:51:32 | at com.sun.net.ssl.internal.ssl.ServerHandshaker.processMessage(Unknown Source)
INFO | jvm 1 | 2010/09/14 04:51:32 | at com.sun.net.ssl.internal.ssl.Handshaker.processLoop(Unknown Source)
INFO | jvm 1 | 2010/09/14 04:51:32 | at com.sun.net.ssl.internal.ssl.Handshaker.process_record(Unknown Source)
INFO | jvm 1 | 2010/09/14 04:51:32 | at com.sun.net.ssl.internal.ssl.SSLSocketImpl.readRecord(Unknown Source)
INFO | jvm 1 | 2010/09/14 04:51:32 | - locked <0xc38fb9b0> (a com.sun.net.ssl.internal.ssl.SSLSocketImpl)
INFO | jvm 1 | 2010/09/14 04:51:32 | - locked <0xc38fbb18> (a java.lang.Object)
INFO | jvm 1 | 2010/09/14 04:51:32 | at com.sun.net.ssl.internal.ssl.SSLSocketImpl.performInitialHandshake(Unknown Source)
INFO | jvm 1 | 2010/09/14 04:51:32 | - locked <0xc38fbb08> (a java.lang.Object)
INFO | jvm 1 | 2010/09/14 04:51:32 | at com.sun.net.ssl.internal.ssl.SSLSocketImpl.startHandshake(Unknown Source)
INFO | jvm 1 | 2010/09/14 04:51:32 | at com.sun.net.ssl.internal.ssl.SSLSocketImpl.startHandshake(Unknown Source)
INFO | jvm 1 | 2010/09/14 04:51:32 | at org.apache.tomcat.util.net.jsse.JSSESocketFactory.handshake(JSSESocketFactory.java:120)
INFO | jvm 1 | 2010/09/14 04:51:32 | at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:521)
INFO | jvm 1 | 2010/09/14 04:51:32 | at org.apache.tomcat.util.net.MasterSlaveWorkerThread.run(MasterSlaveWorkerThread.java:113)
INFO | jvm 1 | 2010/09/14 04:51:32 | at java.lang.Thread.run(Unknown Source)

==================================================================================
It is possible that the lock is obtained by the below thread and it is also not coming out from C_CloseSession call


INFO | jvm 1 | 2010/09/14 04:55:33 | "http-0.0.0.0-443-28" daemon prio=10 tid=0x020b6c00 nid=0x3fc1 runnable [0x0ff1f000..0x0ff1fa70]
INFO | jvm 1 | 2010/09/14 04:55:33 | java.lang.Thread.State: RUNNABLE
INFO | jvm 1 | 2010/09/14 04:55:33 | at sun.security.pkcs11.wrapper.PKCS11.C_CloseSession(Native Method)
INFO | jvm 1 | 2010/09/14 04:55:33 | at sun.security.pkcs11.SessionManager.closeSession(SessionManager.java:208)
INFO | jvm 1 | 2010/09/14 04:55:33 | at sun.security.pkcs11.SessionManager.access$000(SessionManager.java:48)
INFO | jvm 1 | 2010/09/14 04:55:33 | at sun.security.pkcs11.SessionManager$Pool.release(SessionManager.java:261)
INFO | jvm 1 | 2010/09/14 04:55:33 | at sun.security.pkcs11.SessionManager.releaseSession(SessionManager.java:163)
INFO | jvm 1 | 2010/09/14 04:55:33 | - locked <0x2163c1e0> (a sun.security.pkcs11.SessionManager)
INFO | jvm 1 | 2010/09/14 04:55:33 | at sun.security.pkcs11.Token.releaseSession(Token.java:245)
INFO | jvm 1 | 2010/09/14 04:55:33 | at sun.security.pkcs11.P11SecureRandom.engineNextBytes(P11SecureRandom.java:98)
INFO | jvm 1 | 2010/09/14 04:55:33 | at java.security.SecureRandom.nextBytes(Unknown Source)
INFO | jvm 1 | 2010/09/14 04:55:33 | - locked <0x219d12b8> (a java.security.SecureRandom)
INFO | jvm 1 | 2010/09/14 04:55:33 | at com.sun.net.ssl.internal.ssl.RandomCookie.<init>(Unknown Source)
INFO | jvm 1 | 2010/09/14 04:55:33 | at com.sun.net.ssl.internal.ssl.ServerHandshaker.clientHello(Unknown Source)
INFO | jvm 1 | 2010/09/14 04:55:33 | at com.sun.net.ssl.internal.ssl.ServerHandshaker.processMessage(Unknown Source)
INFO | jvm 1 | 2010/09/14 04:55:33 | at com.sun.net.ssl.internal.ssl.Handshaker.processLoop(Unknown Source)
INFO | jvm 1 | 2010/09/14 04:55:33 | at com.sun.net.ssl.internal.ssl.Handshaker.process_record(Unknown Source)
INFO | jvm 1 | 2010/09/14 04:55:33 | at com.sun.net.ssl.internal.ssl.SSLSocketImpl.readRecord(Unknown Source)
INFO | jvm 1 | 2010/09/14 04:55:33 | - locked <0x2f6eee40> (a com.sun.net.ssl.internal.ssl.SSLSocketImpl)
INFO | jvm 1 | 2010/09/14 04:55:33 | - locked <0x2f6eefb0> (a java.lang.Object)
INFO | jvm 1 | 2010/09/14 04:55:33 | at com.sun.net.ssl.internal.ssl.SSLSocketImpl.performInitialHandshake(Unknown Source)
INFO | jvm 1 | 2010/09/14 04:55:33 | - locked <0x2f6eefa0> (a java.lang.Object)
INFO | jvm 1 | 2010/09/14 04:55:33 | at com.sun.net.ssl.internal.ssl.SSLSocketImpl.startHandshake(Unknown Source)
INFO | jvm 1 | 2010/09/14 04:55:33 | at com.sun.net.ssl.internal.ssl.SSLSocketImpl.startHandshake(Unknown Source)
INFO | jvm 1 | 2010/09/14 04:55:33 | at org.apache.tomcat.util.net.jsse.JSSESocketFactory.handshake(JSSESocketFactory.java:120)
INFO | jvm 1 | 2010/09/14 04:55:33 | at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:521)
INFO | jvm 1 | 2010/09/14 04:55:33 | at org.apache.tomcat.util.net.MasterSlaveWorkerThread.run(MasterSlaveWorkerThread.java:113)
INFO | jvm 1 | 2010/09/14 04:55:33 | at java.lang.Thread.run(Unknown Source)
==================================================================================

I am running my application on Solaris 10 using Java 1.6. Both tcp_keepalive_interval and tcp_time_wait_interval are set to 6000.

Observed a set of connection (netstat) in CLOSE_WAIT state (30 to 40) during this timeframe.

Any pointers would help

Arjay
October 4th, 2010, 12:53 PM
Often times, when something runs for a while, but the seems to get all 'bound up', it's because connections and resources aren't being closed properly.

If you aren't already doing so, please explicitly close any connections when you are finished using them (and don't wait for garbage collection to do it for you).

The reason is that in some cases, you can run out of resources (i.e. connections) before the gc has a chance to run. When this occurs, it can result in the behavior you are seeing.

Arjay
October 4th, 2010, 12:55 PM
Another thing to check for is to make sure the synchronization objects are released in all code paths. RAII is helpful when applied to synchronization objects to help ensure a release always occurs.