Tuesday, February 16, 2010

Thread dumps using the command line

The generic approach is to run kill -3 against the Java process on any Unix/Linux box; the output goes to the server's std_out file. But what if you need to take 4 or 5 thread dumps, and the std_out file is so huge that you need your Unix skills to extract the thread dump output from the log?
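As a rough sketch of the kill -3 approach, the two helper functions below take several dumps in a row and then pull only the dump sections out of a large log. The function names, the default count and interval, and the start/end markers are my own assumptions: the markers match HotSpot-style output ("Full thread dump" ... "JNI global references") and may differ on other JVMs.

```shell
#!/bin/sh
# Sketch only: names, defaults and markers are assumptions, not from the post.

# Send SIGQUIT (kill -3) to a JVM several times, pausing between dumps.
# usage: take_thread_dumps <pid> [count] [interval_seconds]
take_thread_dumps() {
    pid=$1
    count=${2:-5}
    interval=${3:-10}
    i=1
    while [ "$i" -le "$count" ]; do
        kill -3 "$pid" || return 1
        [ "$i" -lt "$count" ] && sleep "$interval"
        i=$((i + 1))
    done
}

# Print only the thread dump sections of a log read from stdin.
# Assumes each dump starts at a line beginning "Full thread dump" and
# ends at a line beginning "JNI global references".
extract_thread_dumps() {
    awk '
        /^Full thread dump/      { in_dump = 1; n++; print "=== thread dump " n " ===" }
        in_dump                  { print }
        /^JNI global references/ { in_dump = 0 }
    '
}

# Example (hypothetical pid and paths):
#   take_thread_dumps 12345 5 10
#   extract_thread_dumps < std_out > /tmp/threaddumps.out
```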

You can use the following methods to take thread dumps from the command line.

#############
JBoss
#############

/jboss/bin/twiddle.sh invoke "jboss.system:type=ServerInfo" listThreadDump > /tmp/threadump.out


#############
Weblogic
#############

1. Source the domain environment script.

Example: source /opt/bea/10.0/user_projects/domains/supportapps/bin/setDomainEnv.sh

2. Invoke WLST with the following command, then connect to the admin server (WLST starts in offline mode; connect() switches it to online mode)

$ java weblogic.WLST

connect ('Admin_Console_user_id','Password','t3://console_url')


Example:
connect ('weblogic','weblogic','t3://supportapps.blogspot.com:7001')

3. Take Thread Dump

threadDump('true', 'file name', 'server_name')

Example:
threadDump('true', 'supportapps.out', 'server1')

#############
WebSphere
#############
Generating a thread dump (javacore) using the wsadmin prompt. You can generate a thread dump or javacore manually by running the following at the wsadmin prompt:

wsadmin >

set objectName [$AdminControl queryNames WebSphere:type=JVM,process=server1,node=appsrv01_node,*]

$AdminControl invoke $objectName generateHeapDump

$AdminControl invoke $objectName dumpThreads



Once the thread dump is generated, you can find its location in the native_stderr.log file. This is a sample of the messages from my native_stderr.log:


JVMDUMP007I JVM Requesting Java Dump using '/opt/AppServer/profiles/AppSrv01/javacore.20090705.212408.5480.0001.txt'
JVMDUMP010I Java Dump written to /opt/AppServer/profiles/AppSrv01/javacore.20090705.212408.5480.0001.txt
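
If you take several javacores, a small filter can list where each one was written. This is only a sketch: the function name is made up, and it assumes the JVMDUMP010I message looks exactly like the sample above.

```shell
#!/bin/sh
# Sketch only: javacore_paths is a hypothetical helper name.

# Print the path of every javacore reported as written (JVMDUMP010I lines).
# Reads native_stderr.log content from stdin.
javacore_paths() {
    sed -n 's/^JVMDUMP010I Java Dump written to //p'
}

# Example:
#   javacore_paths < native_stderr.log
```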



100% CPU Usage on Solaris caused by WebSphere process

Problem


How to determine which thread(s) are consuming the CPU cycles when a Java process has high CPU usage for an extended period of time.


This technote uses the data gathered from technote # 1115625, titled: "MustGather: 100% CPU Usage on Solaris".


Cause

If the Application Server causes a spike in CPU usage for several minutes, this technote can help you identify the Java code that is causing the problem.


Solution

Using the files attached to this technote, the following 3 steps demonstrate how to find the problem thread and the corresponding Java code.


1. Analyze the prstat information to determine the LWP (lightweight process) consuming CPU.

The following prstat output was generated with the command "prstat -mvL 1 1". Different prstat parameters will give different output, but it can be used in a similar fashion as described below.

  PID USRNAME  USR  SYS  TRP  TFL  DFL  LCK  SLP  LAT  VCX  ICX  SCL  SIG PROC/LWPID
16365 root      36   53  0.0  0.0  0.0  0.0   18  0.0    0  178  13K    0 prstat/1
16310 root      46  0.0  0.1  0.0  0.0  4.3   45  5.0    0   58    8    0 java/30
16310 root     5.8  0.2  0.0  0.0  0.0  0.0   94  0.0    0   11   56    0 java/11
16310 root     2.6  0.1  0.0  0.0  0.0  0.0   97  0.0    0    6   18    0 java/5
 5158 root     0.5  0.1  0.0  0.0  0.0  0.0   99  0.0   49   30   1K   19 .netscape.bi/1
16310 root     0.3  0.0  0.0  0.0  0.0  0.0  100  0.0    2    1   11    0 java/23
16310 root     0.3  0.0  0.0  0.0  0.0  0.0  100  0.0    5    1    7    0 java/13
16310 root     0.3  0.0  0.0  0.0  0.0  0.0  100  0.0    1    0    2    0 java/3
16310 root     0.3  0.0  0.0  0.0  0.0  0.0  100  0.0    0    0    2    0 java/17
16310 root     0.2  0.0  0.0  0.0  0.0  0.0  100  0.0    4    2    6    0 java/16
16310 root     0.2  0.0  0.0  0.0  0.0  0.0  100  0.0    0    1    1    0 java/18
16310 root     0.1  0.1  0.0  0.0  0.0  0.0   99  0.5   99    0   87    0 java/1
16310 root     0.2  0.0  0.0  0.0  0.0  0.0  100  0.0    3    0   12    0 java/28
16310 root     0.2  0.0  0.0  0.0  0.0  0.0  100  0.0    0    0    1    0 java/19
  290 root     0.1  0.1  0.0  0.0  0.0  0.0  100  0.0   21    0  132    0 Xsun/1
Total: 90 processes, 233 lwps, load averages: 1.00, 0.84, 0.66

In the above prstat output, the third and fourth columns (USR and SYS) show the percentage of time each LWP has spent in user and system mode. Ignoring the prstat process itself, the most CPU time is being consumed by LWPID 30 of the java process (PID 16310).
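
Picking the busiest LWP by eye gets tedious on a process with hundreds of threads. The sketch below does it with awk: it adds USR and SYS for each line belonging to the given PID and prints the LWPID with the highest total. The function name is hypothetical, and it assumes the column layout shown above (USR in column 3, SYS in column 4, PROC/LWPID last).

```shell
#!/bin/sh
# Sketch only: busiest_lwp is a hypothetical helper; it assumes the
# "prstat -mvL" column order shown in the post.

# Print the LWPID with the highest USR+SYS for one process.
# usage: prstat -mvL 1 1 | busiest_lwp <pid>
busiest_lwp() {
    awk -v pid="$1" '
        $1 == pid && $NF ~ /\// {
            cpu = $3 + $4                 # USR + SYS percentages
            if (cpu > max) {
                max = cpu
                split($NF, parts, "/")    # PROC/LWPID, e.g. java/30
                lwp = parts[2]
            }
        }
        END { if (lwp != "") print lwp }
    '
}
```

If every thread of the process shows 0 for both columns, nothing is printed.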

2. Use the LWPID of 30 to find the thread in the pstack output. The following is a snippet from the pstack output:

----------------- lwp# 30/ thread# 50 --------------------
fe753aa8 ???????? (0, 5265c00, fa, 43a85d79, fe74e170, e04803fc)
fb1025b4 ???????? (f60748, e04819b8, fe74e170, f60748, fe763ca0, 16)
.....

fe505288 _start (fe74e170, e1ef5d10, 0, 5, 1, fe401000) + 20
ff36b6e0 _thread_start(f25078, 0, 0, 0, 0, 0) + 40
---------------------------------------------------------

Look for "_thread_start" at the bottom of the thread stack. The first number inside the ( ) is
the "tid". In this case the tid = f25078.


3. Finally, search for f25078 in the thread dump output. For WebSphere 5.0.x & 5.1 this is the file native_stdout.log. In the following snippet of the thread dump you will see the tid of f25078.

"Servlet.Engine.Transports : 0" daemon prio=5 tid=0xf25078 nid=0x32 runnable [0xe0480000..0xe04819d8]
at java.lang.System.currentTimeMillis(Native Method)
at java.util.Date.<init>(Date.java:161)
at org.apache.jsp._wtime._jspService(_wtime.java:96)
at com.ibm.ws.webcontainer.jsp.runtime.HttpJspBase.service(HttpJspBase.java:89)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:853)
at com.ibm.ws.webcontainer.jsp.servlet.JspServlet$JspServletWrapper.service....
....

Alternatively, you may use the thread# 50 from the pstack output to correlate the prstat and the thread dump. This is the decimal representation of the thread's "nid" in the thread dump. Using the example above, since decimal 50 is 0x32, thread# 50 can be used to find the thread by locating "nid=0x32" in the thread dumps. This is the only method that works if the application is using the alternate threading libraries. To determine which library is being used, please refer to section "Determining which library is currently in use" at link: http://www-1.ibm.com/support/docview.wss?uid=swg21107291
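
The decimal-to-hex step is easy to script. The sketch below converts a pstack thread# into the "nid=0x.." string and prints the matching entry from a thread dump. Both function names are hypothetical, and it assumes a dump layout where each thread entry starts with a quoted name and entries are separated by blank lines.

```shell
#!/bin/sh
# Sketch only: helper names and the dump layout are assumptions.

# Convert a decimal pstack thread# into the nid string used in thread dumps.
thread_to_nid() {
    printf 'nid=0x%x\n' "$1"
}

# Print the thread dump entry (header line plus stack) for a given thread#.
# usage: find_thread_by_number <thread#> < native_stdout.log
find_thread_by_number() {
    nid=$(thread_to_nid "$1")
    awk -v nid="$nid" '
        /^"/ { show = (index($0, nid " ") > 0) }  # header line of a thread entry
        /^$/ { show = 0 }                         # blank line ends the entry
        show { print }
    '
}
```

The trailing space in the match keeps nid=0x32 from also matching nid=0x321.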



Analyzing the thread stack for this tid will help lead you to the code causing the high CPU usage. In this case the issue most likely comes from the JSP:
_wtime._jspService(_wtime.java:96)


JVM Memory profiling (jmap & jhat)

The newer versions of the Sun JDK (1.5, 1.6) have better memory profiling tools available. One such combination is jmap and jhat. Both tools are part of the JDK 1.6 distribution (located under the javahome/bin folder).
To get to the probable source of a memory leak (or to figure out which objects are consuming most of the heap), these built-in JDK tools come to the rescue. Essentially, jmap takes a heap dump, which jhat can then read.
Here is how it looks (I am profiling my WebLogic 11g process, whose process ID is 23707):

To get current heap usage stats :
===================================

# ./jmap -heap 23707 (23707 being java process ID )



To get a histogram of all objects on the heap (both live and unreachable):
This is my WebLogic11g process.

# ./jmap -histo 23707 (23707 being java process ID )
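
The histogram can run to thousands of lines, so a small filter helps to see just the biggest consumers. This is a sketch with a made-up function name; it assumes the usual histogram layout (rank ending in a colon, then #instances, #bytes and class name) and that jmap already lists classes largest first.

```shell
#!/bin/sh
# Sketch only: histo_top is a hypothetical helper; the column layout of
# "jmap -histo" is assumed, not taken from the post.

# Print "bytes class" for the first n histogram rows (default 10).
# usage: ./jmap -histo <pid> | histo_top [n]
histo_top() {
    awk '$1 ~ /^[0-9]+:$/ { print $3, $4 }' | head -n "${1:-10}"
}
```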



To get a heap dump :
(I have specified file name as myheap.bin)
# ./jmap -dump:format=b,file=myheap.bin 23707
Dumping heap to /weblogic11g/jdk160_11/bin/myheap.bin ...
Heap dump file created

To read the heap dump that was generated, jhat can be used as follows:
# ./jhat myheap.bin
Reading from myheap.bin...
Dump file created Mon Aug 24 16:20:26 EDT 2009
Snapshot read, resolving...
Resolving 546388 objects...
.......................
Snapshot resolved.
Started HTTP server on port 7000
Server is ready.

jhat starts a web server where all object allocations can be viewed and queried. It can be accessed through a browser at http://myIP:7000

The default page shows basic allocation information and the different options for querying the objects. Object allocations can also be listed by size.



If you prefer a GUI-based view, you can check MAT (from Eclipse) or VisualGC.

JBoss Twiddle

Today I was wondering where exactly Twiddle plays its role. One can do a lot using the GUI, but sometimes it is simplest to use the command line. That's where twiddle comes into the picture.

To query JVM information from an application server, we can use any of these approaches: generic JMX, WebLogic T3, JBoss RMI, Tomcat RMI, or WebSphere SOAP.

You can connect to JBoss using either RMI or the generic JMX method.

JBoss provides a very simple command-line application, called twiddle, that lets you query MBeans, get and set attribute values, and even invoke operations. If you need to automate access to JBoss, twiddle is the easiest and best tool to use.

The twiddle script sits in the bin directory, next to the startup and shutdown scripts. You can run it from any terminal window, and it's very easy to use. By default, twiddle connects to localhost on port 1099. If you're using any other port, pass it with -s localhost:differentport


The get command lets you query an MBean by name. Pass in the name of the MBean and a list of attributes to retrieve:

[bin]$ ./twiddle.sh get jboss.system:type=ServerInfo FreeMemory ActiveThreadCount
FreeMemory=90167064
ActiveThreadCount=46


If you don’t specify any attributes, you’ll get all of them:
[bin]$ ./twiddle.sh get jboss.system:type=ServerInfo
HostAddress=192.168.0.101
AvailableProcessors=1
OSArch=ppc
OSVersion=10.3.9
HostName=toki.local
JavaVendor=Apple Computer, Inc.
JavaVMName=Java HotSpot(TM) Client V
FreeMemory=90898472
ActiveThreadGroupCount=6
TotalMemory=132775936
JavaVMVersion=1.4.2-38
ActiveThreadCount=45
JavaVMVendor="Apple Computer, Inc."
OSName=Mac OS X
JavaVersion=1.4.2_05
MaxMemory=218103808


The set command sets an attribute value on an MBean. This command sets the connection pool size for DefaultDS to 25:

[bin]$ ./twiddle.sh set jboss.jca:name=DefaultDS,service=ManagedConnectionPool \
MaxSize 25
MaxSize=25

You can invoke MBean operations using the invoke command. The output will be the return value, if any, of the method. This command asks for garbage collection to be run:

[bin]$ ./twiddle.sh invoke jboss.system:type=Server runGarbageCollector
If you check the console log, you will see the results of running garbage
collection:
18:28:57,779 INFO [Server] Total/free memory: 132775936/91869984
18:28:59,429 INFO [Server] Hinted to the JVM to run garbage collection
18:28:59,431 INFO [Server] Total/free memory: 132775936/9236628


If you are on a remote machine, add the -s option to specify the host you are trying to talk to. For example, you can run this command from any remote machine to shut down your JBoss instance:

[bin]$ ./twiddle.sh -s hostname invoke jboss.system:type=Server shutdown


Below are a few examples of connecting to a remote server with authentication enabled.

############
To list all mbeans
############
./twiddle.sh -s jnp://IPAddress:1099 -u admin -p xxxx serverinfo -l
or
./twiddle.sh -s IPAddress -u admin -p xxxx serverinfo -l


$ ./twiddle.sh -H serverinfo
Get information about the MBean server

usage: serverinfo [options]

options:
-d, --domain Get the default domain
-c, --count Get the MBean count
-l, --list List the MBeans
-- Stop processing options

$ ./twiddle.sh --server=IPAddress:1099 serverinfo --count
460

$ ./twiddle.sh --server=IPAddress:1099 serverinfo --domain
jboss

We can query only the required MBeans using a pattern such as 'jboss.jca:*'

./twiddle.sh -s IPAddress -u jbadmin -p XXXX query 'jboss.jca:*'

############
To query the MBeanInfo for an MBean, use the info command
############

./twiddle.sh -s IPAddress -u admin -p XXXX info 'jboss.jca:service=ManagedConnectionPool,name=com.client.class.XYZDataSource'

Description: Management Bean.
+++ Attributes:
Name: BlockingTimeoutMillis
Type: int
Access: rw
Name: BackGroundValidationMinutes
Type: long
Access: rw
Name: PreFill
Type: boolean
Access: rw
Name: State
Type: int
Access: r-
Name: AvailableConnectionCount
Type: long
Access: r-
Name: BackGroundValidation
Type: boolean
Access: rw
Name: ManagedConnectionFactoryName
Type: javax.management.ObjectName
Access: rw
Name: UseFastFail
Type: boolean
Access: rw
Name: ConnectionCount


Use the get command to see the current values of the attributes (in this example we ask for AvailableConnectionCount):

./twiddle.sh -s IPAddress -u admin -p XXXX get jboss.jca:service=ManagedConnectionPool,name=com.client.class.XYZDataSource AvailableConnectionCount

Output:
AvailableConnectionCount=50

Summary of Twiddle information you may require

# JVM Heap Usage
twiddle get "jboss.system:type=ServerInfo" FreeMemory
twiddle get "jboss.system:type=ServerInfo" TotalMemory
twiddle get "jboss.system:type=ServerInfo" MaxMemory

#AP Server Thread Usage & Configuration
twiddle get "jboss.system:type=ServerInfo" ActiveThreadGroupCount
twiddle get "jboss.system:type=ServerInfo" ActiveThreadCount
twiddle get "jboss.jca:service=WorkManagerThreadPool" QueueSize
twiddle get "jboss.jca:service=WorkManagerThreadPool" MaximumQueueSize
twiddle get "jboss.jca:service=WorkManagerThreadPool" MinimumPoolSize
twiddle get "jboss.jca:service=WorkManagerThreadPool" MaximumPoolSize

#Connection Pool Utilization
twiddle get "jboss.jca:name=HPS,service=ManagedConnectionPool" ConnectionCount
twiddle get "jboss.jca:name=HPS,service=ManagedConnectionPool" AvailableConnectionCount
twiddle get "jboss.jca:name=HPS,service=ManagedConnectionPool" MinSize
twiddle get "jboss.jca:name=HPS,service=ManagedConnectionPool" MaxSize
twiddle get "jboss.jca:name=HPS,service=ManagedConnectionPool" InUseConnectionCount

#Get Server Thread Dump
twiddle invoke "jboss.system:type=ServerInfo" listThreadDump

# Force GC
twiddle invoke "jboss.system:type=Server" runGarbageCollector
11:27:14,208 INFO [Server] Total/free memory: 133365760/94608856
11:27:14,676 INFO [Server] Hinted to the JVM to run garbage collection
11:27:14,676 INFO [Server] Total/free memory: 133365760/95472528
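
The heap numbers above are easier to read as a percentage. Here is a small sketch that strips the "Name=" prefix from twiddle's get output and works out how much of the allocated heap is in use. The function names are my own, and it assumes twiddle prints a single Name=value line per attribute, as in the examples above.

```shell
#!/bin/sh
# Sketch only: twiddle_value and heap_used_pct are hypothetical helpers.

# Strip the "Name=" prefix from twiddle get output, leaving the value.
twiddle_value() {
    sed -n 's/^[A-Za-z]*=//p'
}

# Integer percentage of the allocated heap currently in use.
# usage: heap_used_pct <TotalMemory> <FreeMemory>
heap_used_pct() {
    echo $(( ($1 - $2) * 100 / $1 ))
}

# Example:
#   total=$(twiddle get "jboss.system:type=ServerInfo" TotalMemory | twiddle_value)
#   free=$(twiddle get "jboss.system:type=ServerInfo" FreeMemory | twiddle_value)
#   heap_used_pct "$total" "$free"
```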

That's the bit of knowledge I wanted to share about twiddling for now. I hope to extend it and share more in the near future.

Monday, February 15, 2010

Weird Network Interface Issue

Today I had a strange issue while working. One of the WebSEAL servers was not able to connect to the LB (load balancer).

Even a simple telnet was failing. The host had been rebooted recently, and I tried many other things, but in vain. I thought the issue could be either the load balancer or the routing table on the WebSEAL server.

telnet 10.39.X.X 80
Trying 10.39.X.X...
telnet: connect to address 10.39.X.X: No route to host


On further checking, the systems team confirmed the routing table was fine, and the network team confirmed the LB was fine. But telnet was still failing :)

I looked for system messages, but no luck. Then, while still working on the same issue, I noticed something strange: the network interface holding the IP was the problem. It looked weird to me: it was showing TX bytes as zero.


ifconfig eth0
eth0 Link encap:Ethernet HWaddr x:x:x:x:x:x
inet addr:10.39.X.X Bcast:10.39.X.X Mask:255.255.255.0
inet6 addr: x::x:x:x:c6aa/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:157 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:6610 (6.4 KiB) TX bytes:0 (0.0 b)

Running tcpdump is a pain for an app admin, so I simply restarted the interface and the issue got resolved :)

ifconfig -a

Displays information on all network interfaces on the server, active or inactive.

ifconfig eth0 down

If eth0 exists, this takes it down, so that it cannot send or receive any information.

ifconfig eth0 up

If eth0 exists and is in the down state, this returns it to the up state, allowing it to send and receive information.

Afterwards I could see TX bytes:65316 (63.7 KiB), so data was being sent again :)

The telnet started working fine and the client's site was up and running again.
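
To spot the same symptom quickly next time, the check can be scripted. This is only a sketch with hypothetical function names; it assumes the older ifconfig output format shown above ("RX bytes:N ... TX bytes:N"), which newer versions of ifconfig no longer print.

```shell
#!/bin/sh
# Sketch only: tx_bytes and tx_stalled are hypothetical helpers and assume
# the "RX bytes:N ... TX bytes:N" output format.

# Print the TX byte counter from ifconfig output read on stdin.
tx_bytes() {
    sed -n 's/.*TX bytes:\([0-9][0-9]*\).*/\1/p'
}

# Succeeds (exit 0) when the interface has received data but sent none.
# usage: ifconfig eth0 | tx_stalled && echo "eth0 is not transmitting"
tx_stalled() {
    out=$(cat)
    rx=$(printf '%s\n' "$out" | sed -n 's/.*RX bytes:\([0-9][0-9]*\).*/\1/p')
    tx=$(printf '%s\n' "$out" | tx_bytes)
    [ -n "$rx" ] && [ -n "$tx" ] && [ "$rx" -gt 0 ] && [ "$tx" -eq 0 ]
}
```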

Thursday, February 4, 2010

Middleware Technologies

I have been working on middleware technologies for the last 8 years, and would like to share my experience and the tools I have come across over those years. With the rapid growth of middleware, keeping up to date with current technologies becomes crucial. That thought made me create this blog, so that I can share knowledge, problems and solutions for the middleware technologies listed below.

Middleware Technologies : Websphere , Weblogic, Tomcat, JBoss, Siteminder , LDAP, Apache , Sun One Webserver , IIS , TAM etc

Disclaimer : Not all the documents posted are completely authored by me. I have collected some interesting technology-related documents from the internet.