Hi Jeff. I started a backup on a VAX server from my PC, and my LAT session dropped, so the process running that job aborted while backing up to tape. When I got back into the VAX, the process was still there but frozen in the RWAST state. I tried to stop the process, but that didn't work; it's still there, and the tape device is in a "Mounted/Dismount" state. Further attempts to stop or clear out the process give the message "No such process", even though the SHOW SYSTEM command still shows the process in the RWAST state. What do I do? Yves
Yves ... Concerning your problem with the tape
drive and its process in the RWAST state: I've seen it too
many times, and the only solution is to reboot the system.
When this happens, I call it "You R WASTed". It comes
about because of an unusual relationship between the physical
operation of a tape drive and the magnetic tape
driver.

Because of the history of tape media,
Digital purposely chose to write the tape driver in such a
way as to allow users to continue to operate while certain
tape operations are still going on. In other words, they
wrote the driver so that if a user queued an
I/O operation to the tape drive with a wait-until-completed
option, the user's process did not have to wait until
the physical operation actually completed, as with a rewind
command, which can take several minutes. So even if a user performs a QIO/Wait on
a tape drive, the driver starts the operation, and resumes
the process as if the I/O has completed when, in fact, it
has not. When this happens the driver does alert the process,
through the I/O status block, that the I/O has not completed. It
then queues up an AST (Asynchronous System Trap) to signal
the process when the operation actually does complete. While
one of these ASTs is pending, the tape driver will not
execute any more I/O commands until the previous operation
completes.

Because of this design, it is possible
for things to get messed up if the hardware has a problem
while one of these ASTs is pending, for example, if the tape
drive goes offline. When something unexpected happens, your
process will go into the RWAST state, i.e., a resource wait
(RW) for an AST that has not been delivered yet. Now, while the
example given below may not be the exact reason your process
went into the RWAST state, it is the most common reason and a
good example.

When a tape drive is issued a rewind
command, the actual I/O operation does not complete until
the hardware detects the BOT (beginning of tape) marker. So
in the case of the rewind operation, the I/O returns with an
incomplete status, even though the operation was queued with
a WAIT option. In this case the tape driver returns control
to the user process, and sets up an AST to signal actual
completion. This prevents future I/O
operations from being executed until the rewind operation
has physically completed. The problem lies in
the fact that device drivers do not execute in the context
of a process, while ASTs do.

Now, imagine the following course of
events.

A. You are running BACKUP, and the end of
volume is reached. BACKUP issues a rewind command, followed
by an unload command, preparing to continue processing on
another volume. The tape begins to rewind and the AST is
scheduled to allow more I/Os once the tape reaches BOT. So
the unload I/O operation is queued, but not executed, pending
completion of the rewind AST.

B. You see the tape rewinding, and press
the online/offline button to take the tape offline. This
prevents the BOT signal from ever getting to the driver. If
you are lucky, when you mount the next tape and press the
online button it will detect the BOT, and continue on with
no problem. However, sometimes the
new BOT condition does not satisfy the rewind completion
because it did not occur due to a rewind command, but rather
from your loading the new tape.

C. So, your process is still pending the
AST, which seems never to get satisfied. Everything is hung
up, and you can't do anything. So now you try to stop the
process. OpenVMS begins to delete the process by doing what
is called a process rundown, which essentially means closing
all files and links to devices. It attempts to terminate the
connection to the tape drive, but the driver is not
accepting any more I/O operations, pending the AST
completion.

D. The process has been removed from the
active process table, but still appears in the system master
process table in a rundown state, still in RWAST (a resource
wait for an AST).

If you have enough buttons on the tape
drive to manually advance the tape forward while
offline, then manually begin a rewind operation while
offline, and press the online/offline button to bring the
tape back online before it reaches BOT, then you might be
able to satisfy the AST condition, and the process will
finish running down. But since most tape devices these days
do not have this offline capability, you are SOL (Software
Operationally Locked), or as I like to say, "You R
WASTed".

If "You R WASTed", then the only thing
that I know you can do to get your tape drive back is to
reboot the system. Now I'm not saying the above situation is
exactly what happened to you, but evidently something
happened to the tape drive, and/or the process was stopped
before the rewind operation completed
normally.

The moral of this story is: when a
magtape unit is rewinding due to a software rewind I/O
operation, always let the rewind complete before you mess
with any magtape buttons or controls. Also, trying to stop a
process in the RWAST state only makes things worse. Try
shutting down the power to the tape drive and resetting it. But
if that doesn't help, you must reboot.

Sorry for the bad news.

Jeff
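For readers who find steps A through D easier to follow as code, here is a minimal sketch of the behaviour described above. It is a toy model in Python, not the actual OpenVMS magtape driver: the class `ToyTapeDriver`, its methods, and the operation names are all hypothetical and exist only for illustration. It assumes, as the explanation above does, that a rewind in flight holds back every later I/O until a BOT that the driver attributes to that rewind is seen.

```python
from collections import deque


class ToyTapeDriver:
    """Toy model only (hypothetical names): a rewind in flight blocks
    all later I/O until a BOT caused by that rewind is detected."""

    def __init__(self):
        self.rewind_pending = False   # stands in for the pending completion AST
        self.queued = deque()         # I/O held back behind the rewind
        self.completed = []           # I/O that really finished

    def submit(self, op):
        if self.rewind_pending:
            self.queued.append(op)    # step A: unload waits behind the rewind
            return "queued"
        if op == "rewind":
            self.rewind_pending = True
            return "started"          # caller resumes while the tape still moves
        self.completed.append(op)
        return "done"

    def bot_detected(self, caused_by_rewind=True):
        # Step B: a BOT seen because a new tape was loaded does not count.
        if not self.rewind_pending or not caused_by_rewind:
            return
        self.rewind_pending = False
        self.completed.append("rewind")
        # Release whatever was held back, stopping if another rewind starts.
        while self.queued and not self.rewind_pending:
            self.submit(self.queued.popleft())


def run_scenario(operator_interferes):
    drv = ToyTapeDriver()
    drv.submit("rewind")              # end of volume reached
    drv.submit("unload")
    drv.bot_detected(caused_by_rewind=not operator_interferes)
    drv.submit("rundown")             # step C: stopping the process
    return drv


if __name__ == "__main__":
    for interferes in (False, True):
        drv = run_scenario(interferes)
        print(interferes, drv.rewind_pending, drv.completed, list(drv.queued))
```

In the undisturbed run the rewind, the unload, and the rundown all complete. In the run where the operator interferes, nothing completes and the rundown request sits in the queue behind the rewind, which is the toy equivalent of the process that will not go away.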
All hope is not lost.
Visit Phil
Ottewell's Web Page on the RWAST
state.
You may yet redeem yourself and elevate yourself to the level
of "The Man Behind The Curtain" Wizard of
OpenVMS!
BUT WAIT!
What would you pay to get out of
that pesky RWAST state
without the cost of rebooting? Before you
resign yourself to death at the hands of your users or,
worse, your boss because you have to reboot your
production system just to regain access to a tape
drive, take a look at the page mentioned above.