How to troubleshoot WBEM subscriptions?

From Wiki-UX.info

What is WBEM?

WBEM is a set of systems management technologies developed to unify the management of distributed computing environments.

Key features of WBEM technology

  • remote management of applications
  • management of several instances of an application as a single unit
  • standard interface for remote application management across different applications
  • decoupling of application management from the client
  • "publishing" of key information about an application to other applications


How does WBEM "work"?

A client uses the HTTP (or HTTPS) protocol to pass the request, encoded in CIM-XML, to the WBEM server. The WBEM server decodes the incoming request, performs the necessary authentication and authorization checks, and then consults the previously created model of the device being managed to see how the request should be handled. For most operations, the WBEM server determines from the model that it needs to communicate with the actual hardware or software. This is handled by so-called "providers": small pieces of code which interface between the WBEM server (using a standardised interface known as CMPI) and the real hardware or software.
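
As an illustration of what such a request looks like on the wire, the following sketch prints a minimal CIM-XML EnumerateInstanceNames request. The helper name cimxml_enum_names, the class CIM_OperatingSystem and the default namespace root/cimv2 are assumptions chosen for the example, not values taken from HP documentation.

```shell
#!/bin/sh
# Sketch only: print a minimal CIM-XML EnumerateInstanceNames request.
# cimxml_enum_names is a hypothetical helper, not part of WBEM Services.
#   $1 = CIM class name (CIM_OperatingSystem is just an example choice)
#   $2 = namespace, default root/cimv2
cimxml_enum_names() {
    class=$1
    namespace=${2:-root/cimv2}
    printf '%s\n' \
        '<?xml version="1.0" encoding="utf-8"?>' \
        '<CIM CIMVERSION="2.0" DTDVERSION="2.0">' \
        ' <MESSAGE ID="1" PROTOCOLVERSION="1.0">' \
        '  <SIMPLEREQ>' \
        '   <IMETHODCALL NAME="EnumerateInstanceNames">' \
        '    <LOCALNAMESPACEPATH>'
    # Each path component of the namespace becomes its own NAMESPACE element.
    old_ifs=$IFS
    IFS=/
    for part in $namespace; do
        printf '     <NAMESPACE NAME="%s"/>\n' "$part"
    done
    IFS=$old_ifs
    printf '%s\n' \
        '    </LOCALNAMESPACEPATH>' \
        '    <IPARAMVALUE NAME="ClassName">'
    printf '     <CLASSNAME NAME="%s"/>\n' "$class"
    printf '%s\n' \
        '    </IPARAMVALUE>' \
        '   </IMETHODCALL>' \
        '  </SIMPLEREQ>' \
        ' </MESSAGE>' \
        '</CIM>'
}
```

For example, "cimxml_enum_names CIM_OperatingSystem > /tmp/req.xml" writes the request to a file. The wbemexec utility listed under /opt/wbem/bin submits XML-encoded requests of this kind to a CIM Server; check its man page for the options available on your release.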

Note: SMI-S (Storage Management Initiative - Specification) is based on WBEM and is used for SAN devices!

WBEM Ports

Secure port      5989
Standard port    5988
Events port      50004
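
A quick way to see which of these ports are open is to filter the netstat listing. The following is a minimal sketch; wbem_ports_listening is a hypothetical helper name. Which host is expected to listen on the events port depends on the setup, and the standard port is closed when enableHttpConnection is false, so a "not listening" line is not necessarily a fault.

```shell
#!/bin/sh
# Sketch only: report whether the WBEM ports show up in LISTEN state.
# wbem_ports_listening is a hypothetical helper. It reads `netstat -an`
# output on stdin; the ports default to the three listed above.
wbem_ports_listening() {
    [ $# -gt 0 ] || set -- 5988 5989 50004
    input=$(cat)
    for port in "$@"; do
        # Accept "*.5989" (HP-UX style) and "0.0.0.0:5989" (Linux style).
        if printf '%s\n' "$input" | grep -E "[.:]${port}[[:space:]]+.*LISTEN" >/dev/null; then
            echo "$port listening"
        else
            echo "$port not listening"
        fi
    done
}
# On a live system: netstat -an | wbem_ports_listening
```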

WBEM Processes

/opt/wbem/lbin/cimserver 
/opt/wbem/lbin/cimservera 
/opt/wbem/lbin/cimserverd   (restarts cimserver process if it terminated abnormally) 
/opt/wbem/lbin/cimprovagt  (several cimprovagt will probably be running)

cimservera (HP-UX only) - cimservera is a standalone process that provides the cimserver with PAM authentication services. It is controlled solely by the cimserver and therefore has no user interface.

cimserverd (HP-UX only) - HP WBEM Services' way of automatically restarting the CIM Server after a failure. cimserverd is not intended to be used by operators. Users can, however, set the interval for cimserverd; see the cimserverd man page for details. Users start (and halt) the CIM Server with the cimserver command. If the CIM Server was halted by an operator with the cimserver command, cimserverd cannot automatically restart it.
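
To check quickly that the three daemons are present, the ps listing can be compared against the expected names. This is a minimal sketch; missing_wbem_procs is a hypothetical helper name, and cimprovagt is deliberately left out because the number of provider agents running varies.

```shell
#!/bin/sh
# Sketch only: list expected WBEM daemons that do not appear in ps output.
# missing_wbem_procs is a hypothetical helper; it reads `ps -ef` output on stdin.
missing_wbem_procs() {
    input=$(cat)
    for name in cimserver cimservera cimserverd; do
        # Compare the basename of every field, so that "cimserver" does not
        # also match "cimserverd" or "cimservera".
        printf '%s\n' "$input" | awk -v n="$name" '
            { for (i = 1; i <= NF; i++) { f = $i; sub(/.*\//, "", f); if (f == n) found = 1 } }
            END { exit found ? 0 : 1 }' || echo "$name"
    done
}
# On a live system: ps -ef | missing_wbem_procs
```

No output means all three processes were found.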


HP-UX Directory Structure

  • /opt/wbem/bin --> WBEM Services utilities: cimmof, wbemexec, cimprovider, osinfo
  • /opt/wbem/sbin --> Executables for WBEM Services configuration commands: cimconfig, cimauth, init_repository, cimtrust, and ssltrustmgr
  • /opt/wbem/lbin --> Internal executables: cimserver, cimprovagt, cimservera, cimserverd, ranseed, and repupgrade
  • /opt/wbem/lib --> Shared library files for HP WBEM Services
  • /opt/wbem/providers/lib --> Links to Shared library files for CIM Providers
  • /opt/wbem/share/man --> Man pages for HP WBEM Services
  • /var/opt/wbem --> Sample/default configuration files. Location of trace files.
  • /var/opt/wbem/localauth --> Temporary directory used during Local Authentication
  • /var/opt/wbem/repository --> Directory containing the CIM Repository
  • /opt/wbem/mof --> Directory containing the initial Schema definition (A.02.05.X or higher)

1. Collect the basic data about the system and the repository files, and run some simple tests to check the health of the system. All of this can be done with the WbemInfo.sh script:

To run WbemInfo.sh, first download the tool to the /opt/wbem/bin directory:

# tar -xvf wbeminfo.tar
# cd wbeminfo
# ./WbemInfo.sh data_repos

Ask for the /tmp/WbemFiles_B.11.XX.tar.gz file.


2. Next, capture a cimserver.trc trace while 'Subscribe to WBEM Events' is run from the Windows HP SIM CMS. This should capture the CIM-XML communication between the CMS and the target node.

# rm /var/opt/wbem/trace/cim*
# cimconfig -s traceComponents=All -c
# cimconfig -s traceLevel=4 -c

(Now run 'Subscribe to WBEM Events' from the HP SIM CMS, then turn tracing off again.)

# cimconfig -u traceComponents -c
# cimconfig -u traceLevel -c

Ask for the contents of the /var/opt/wbem/trace directory.

3. Finally, as Certificate Based Authentication is [allegedly] being used, we should check that the SSL certificates have valid hostnames:

# cimtrust -l
# ssltrustmgr -l -f /etc/opt/hp/sslshare/cert.pem


4. Other things that you might want to try, as they have had success in the past, are:

  • delete and re-add the affected node in HP SIM
  • re-install WBEM Services on the target node

On the latter, the following error is an indication that the WBEM Services installation has a problem:


Troubleshooting

What is the action plan?

1)	cimsub -ls prints all subscriptions. If any SIM or WEBES subscriptions were not removed properly, remove them:

# cimsub -ra -n root/cimv2 -F <SIM filter name> -H <SIM handler name> 
# cimsub -ra -n root/cimv2 -F <WEBES filter name> -H <WEBES handler name> 

2)	Remove entries from CMS.

CMS> mxwbemsub -r -n archadm.arj.archchemicals.com

3)	Make sure there are no enabled entries shown for:

# cimsub -lf | grep -i webes
# cimsub -lh | grep -i webes
# cimsub -ls | grep -i webes
# cimsub -lf | grep -i sim
# cimsub -lh | grep -i sim
# cimsub -ls | grep -i sim
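
The six checks above can be run in one pass. The following is a minimal sketch; list_leftovers and the CIMSUB override are hypothetical names, and the script assumes only that cimsub accepts the -lf, -lh and -ls options used above.

```shell
#!/bin/sh
# Sketch only: show every filter, handler and subscription line that
# mentions SIM or WEBES, tagged with the cimsub option that produced it.
# list_leftovers is a hypothetical helper; set CIMSUB to use another binary.
list_leftovers() {
    for opt in -lf -lh -ls; do
        "${CIMSUB:-cimsub}" "$opt" 2>/dev/null |
            grep -i -E 'sim|webes' |
            sed "s/^/[$opt] /"
    done
}
# On a live system: list_leftovers
```

No output means no matching entries remain.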


4)	If any entries show up, use "cimsub -n root/cimv2 -rf <filter name>" to remove the filter.

Use "cimsub -n root/cimv2 -rh <handler name>" to remove the handler.
 
5)	Once everything is cleaned up completely, bounce the servers.

 CMS> desta stop
 archadm# cimserver -s
 archadm# cimserver
 CMS> desta start
 
6)	Now subscribe again:
CMS:
•	Click [Option]
•	Click [Event]
•	Click [Subscribe event to WBEM]
•	Select archadm
•	Click [Next]
•	Click [Run now]
•	It shows  Subscribe to WBEM Events



WBEM Troubleshooting

Also check the WTEC WBEM_SERT DIY (Do It Yourself) Triage Page

  • Check whether the cimserver is running

Restart the cimserver

# cimserver -s
# cimserver
 

Run a local osinfo

# osinfo
 

Run a remote osinfo

# osinfo -h <hostname>

Check that enableHttpConnection is set to true:

# /opt/wbem/sbin/cimconfig -lp
sslClientVerificationMode=required
enableSubscriptionsForNonprivilegedUsers=false
shutdownTimeout=30
authorizedUserGroups=
enableRemotePrivilegedUserAccess=true
enableHttpsConnection=true
enableNamespaceAuthorization=false
enableHttpConnection=false

Here enableHttpConnection is false, so change the planned value and restart the cimserver:

# cimconfig -s enableHttpConnection=true -p
# cimserver -s
# cimserver

Info: enableHttpConnection is not a dynamic value. Therefore only the planned configuration can be changed, not the current configuration.
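
When checking several nodes it can be convenient to pull a single property out of the cimconfig -lp listing. The following is a minimal sketch, assuming the name=value output format shown above; planned_value is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: print the value of one property from `cimconfig -lp` output.
# planned_value is a hypothetical helper. It reads name=value lines on stdin
# and returns non-zero when the property is not listed.
planned_value() {
    awk -F= -v name="$1" '
        $1 == name { print substr($0, length(name) + 2); found = 1 }
        END { exit found ? 0 : 1 }'
}
# On a live system: cimconfig -lp | planned_value enableHttpConnection
```

The property name is matched exactly, so asking for enableHttpConnection does not pick up enableHttpsConnection.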

  • Check that the Provider Module is loaded and OK:

# cimprovider -ls
 

To load a Provider Module, check that the provider is installed using swlist, then run swverify or reinstall the provider.


  • Using cimserver Tracing

Ensure cimserver is running

# ps -ef |grep cimserver

Move or remove any existing trace file

# mv /var/opt/wbem/cimserver.trc /var/opt/wbem/cimserver.trc.prev

Turn on tracing

Note: Tracing at level 4 for all components will generate lots of output in the trace file. It is recommended that tracing be turned on only for the minimum time required to reproduce the problem being investigated or to exercise the function being analyzed.

# cimconfig -s traceLevel=4 -c
# cimconfig -s traceComponents=ALL -c

Send a test indication

Steps required to send a test indication depend on the specific indication provider and indication subscription involved; e.g., using sfmconfig -t -p. Analysis of the trace file output will be simplest if a single test indication is sent.

Turn off tracing

# cimconfig -u traceComponents -c
# cimconfig -u traceLevel -c

Alternatively, the cimserver could be stopped and restarted to turn off tracing

# cimserver -s
# cimserver
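
Because level 4 tracing should stay on only as long as needed, it can help to keep the on and off commands together in one helper. The sketch below does this with the cimconfig commands shown above; wbem_trace, run and the DRY_RUN variable are hypothetical names introduced for the example.

```shell
#!/bin/sh
# Sketch only: keep the trace on/off commands together in one helper.
# wbem_trace and run are hypothetical names. With DRY_RUN=1 the commands
# are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

wbem_trace() {
    case $1 in
        on)
            run cimconfig -s traceLevel=4 -c
            run cimconfig -s traceComponents=ALL -c
            ;;
        off)
            run cimconfig -u traceComponents -c
            run cimconfig -u traceLevel -c
            ;;
        *)
            echo "usage: wbem_trace on|off" >&2
            return 1
            ;;
    esac
}
# Typical sequence: wbem_trace on; <send one test indication>; wbem_trace off
```

Running it once with DRY_RUN=1 is a safe way to review the commands before using the helper on a production node.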
  • There is a script that gathers the most important information and logs for a WBEM case. It is packaged as wbeminfo.tar.

Download wbeminfo.tar from the intranet

# tar xvf wbeminfo.tar
# cd wbeminfo
# ./WbemInfo.sh data_only

Collect the file /tmp/WbemFiles_*.tar.gz
