Why fc60controller_reset.ksh is no longer delivered

From Wiki-UX.info

Abstract

Issue Description

The fc60controller_reset.ksh file is a script that resets the fc60 controller. It was created in response to SR 8606414693 (bug report). The issue was originally reported as a problem in the fc60mon monitor, but HP laboratories ultimately determined that the problem was in the hardware itself, specifically in the controller reset process.

The file /etc/opt/resmon/lbin/fc60controller_reset.ksh has been removed from the OnlineDiag software bundle. It is no longer available in version B.11.11.20.03 (A.59.00) for HP-UX 11.11, and it is likely also absent from earlier versions back to A.04.20.11 (A.57.00).

File History

The script was originally delivered in patch PHSS_32492 (December 2004), later superseded by PHSS_33673, to fix the following issues in the fc60 monitors:

1. DTS JAGaf52955 SR 8606392874

The fc60mon monitor was not reporting disk failure.

2. DTS JAGaf52161 SR 8606392029

EMS event #6 was generated against the fc60 even though the fc60 was working fine.

3. DTS JAGaf54126 SR 8606394086

The fc60mon monitor needed to generate an event prior to a controller reset.

4. DTS JAGaf35652 SR 8606375349

The fc60mon monitor is incorrectly reporting event #4.

The script was updated in patch PHSS_34835 (September 2005):

1. DTS JAGaf74551 SR 8606414693

The vxAbsTicks timer/counter is used in the FC60 array controller firmware to test for timeout conditions on both SCSI and Fibre Channel commands. When the array is reset, this counter is reset to zero. If the array is not reset, the counter rolls over (wraps around) to zero 828.5 days after the FC60 array came up. When this happens, any SCSI or Fibre Channel command in progress is timed out. If the command is a disk write, the write fails and fc60mon reports that the disk has failed. As a result, failure notifications for multiple drives on the array are sent to the user even though the drives have not actually failed.
  • Note: No additional modifications to the script have been reported since then.
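The 828.5-day figure quoted above is consistent with a 32-bit unsigned counter incremented 60 times per second. The source does not state the counter width or the tick rate, so both values are assumptions in the sketch below, which only illustrates the arithmetic:

```python
# Assumptions (NOT stated in the source): vxAbsTicks is a 32-bit unsigned
# counter and the controller firmware ticks 60 times per second.
COUNTER_BITS = 32
TICKS_PER_SECOND = 60
SECONDS_PER_DAY = 86_400


def days_until_wrap(bits=COUNTER_BITS, ticks_per_second=TICKS_PER_SECOND):
    """Return the days a free-running tick counter takes to wrap to zero."""
    return (2 ** bits) / ticks_per_second / SECONDS_PER_DAY


print(f"{days_until_wrap():.1f} days")  # prints "828.5 days"
```

Under these assumptions, 2^32 ticks at 60 ticks per second is about 71.6 million seconds, which matches the 828.5 days given in the bug description.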

Previous Workarounds

Some workarounds were available to handle the issue. Some of them may still be necessary, depending on the customer configuration:

Preferred Methods

It is recommended that the controllers be reset at least 10 days before the vxAbsTicks timer wraps around, which happens 828.5 days after the last reset.
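As an illustration of this recommendation, the following minimal sketch computes the latest recommended reset date from the date of the last controller reset. The function name and the example date are hypothetical; only the 828.5-day wrap interval and the 10-day margin come from the text.

```python
from datetime import datetime, timedelta

WRAP_DAYS = 828.5        # wrap interval quoted in the text
SAFETY_MARGIN_DAYS = 10  # minimum lead time recommended in the text


def latest_reset_date(last_reset):
    """Return the latest recommended date for the next controller reset."""
    return last_reset + timedelta(days=WRAP_DAYS - SAFETY_MARGIN_DAYS)


# Hypothetical example: controllers last reset on 1 January 2005 at 00:00.
last_reset = datetime(2005, 1, 1)
print(latest_reset_date(last_reset))
```

The resulting window is 818.5 days, so a reset performed once every two years (about 730 days) stays well inside it.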

There are several ways to reset the controllers. The preferred method is to power cycle the FC60 controller once every two years. The next best option is to press the reset button on each of the FC60 controller boards installed in the FC60 controller once every two years. The least preferred method applies only to HP-UX systems. It consists of running a script that resets the FC60 array controllers, together with a modified fc60mon monitor that generates a controller reset event (#38) when the user requests it through a configuration file created specifically for this event.

Alternate Methods for HP-UX Systems

Reset of the controllers using a script process

This option is ONLY supported on HP-UX systems and is the least preferred method.

It involves running a script that resets the FC60 array controllers, together with a modified fc60mon monitor that generates a controller reset event (#38) when the user requests it through a configuration file created specifically for this event.

Reference

Authors