Commit Graph

99 Commits (90f944a38c4eb8432b6381fd0b9789f1f4600786)

Author SHA1 Message Date
Swen Schillig ea460a8191 [SCSI] zfcp: Changed D_ID left port disabled
If the destination ID (D_ID) of a remote storage port changed, e.g.
re-plugged cable on the switch in a different switch port, the port
was never (re-)attached within Linux. This patch fixes the broken
mapping between the WWPN and the D_ID.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-05-23 15:44:16 -05:00
Christof Schmitt dceab655d9 [SCSI] zfcp: Add comments to switch/case fallthroughs
Add comments where there is a deliberate fall through in switch/case
statements. This makes some code checkers happy and makes it clear
that there is no missing break statement.

Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-05-23 15:44:15 -05:00
Swen Schillig 94ab4b38b2 [SCSI] zfcp: avoid false ERP complete due to sema race
The ERP thread is performing a task before it is executing the
corresponding down on the semaphore. The response handler of the
just started exchange config should wait for the completion by
performing a down on this semaphore. Since this semaphore is still
positive from the ERP enqueue the handler won't wait and therefore
the exchange config will always fail leaving the adapter in error.
The problem can be solved by performing the down on the semaphore
before starting an ERP task. This is the logically correct order.
Only walk the ERP loop if there is a task to perform.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-04-27 10:07:28 -05:00
Swen Schillig 828bc1212a [SCSI] zfcp: Set WKA-port to offline on adapter deactivation
The nameserver port might be in state online when the adapter is
offlined. On adapter reactivation the nameserver port is not
re-opened due to the PORT_ONLINE status. This results in an
unsuccessful recovery. In forcing the nameserver port status
to offline on all adapter offline events this issue is prevented.

Waiting for the reference count to drop to zero in
zfcp_wka_port_offline is not required, so remove it.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-04-27 10:07:27 -05:00
Swen Schillig 92d5193b46 [SCSI] zfcp: Dont block zfcp_wq with scan
When running the scsi_scan from the zfcp workqueue and the target
device does not respond, the zfcp workqueue can block until the
scsi_scan hits a timeout. Move the work to the scsi host workqueue,
since this one is also used for the scan from the SCSI midlayer.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-04-27 10:07:26 -05:00
Swen Schillig 947a9aca86 [SCSI] zfcp: fix queue, scheduled work processing.
Ensure the refcounting is correct even if we were not able to
schedule a work. In addition we have to make sure no scheduled
work is pending while we're dequeing the adapter from the
systems environment.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:58:22 -05:00
Christof Schmitt a2fa0aede0 [SCSI] zfcp: Block FC transport rports early on errors
Use the I/O blocking mechanism in the FC transport class to allow
faster failovers for multipathing:
- Call fc_remote_port_delete early to set the rport to BLOCKED.
- Check the rport status in queuecommand with fc_remote_portchkready
  to no longer accept new I/O for this port and fail the I/O with the
  appropriate scsi_cmnd result.
- Implement the terminate_rport_io handler to abort all pending I/O
  requests
- Return SCSI commands with DID_TRANSPORT_DISRUPTED while erp is
  running.
- When updating the remote port status, check for late changes and
  update the remote ports status accordingly.

Acked-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:58:21 -05:00
Swen Schillig 5ffd51a5e4 [SCSI] zfcp: replace current ERP logging with a more convenient version
The current number based id ERP logging is replaced by a string
based tag version. The benefit is an easier location of the code in
question and the removal of the lengthy array referencing the
individual messages.
The string (7 bytes) based version does not use more space since those
bytes were "used" anyway due to the alignment of the structure.
The encoding of the 7 byte string is as follows
        [0-1] = filename
        [2-5] = task/function
        [6]   = section
Due to the character of this string (fixed length) a string
termination is not required here.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:58:20 -05:00
Swen Schillig cf13c08223 [SCSI] zfcp: prevent adapter close on initial adapter open
An adapter close was always performed whether it was required,
(e.g. in an error scenario) or not (e.g. initial open).
This patch is changing the process in only doing an
adapter close when it is required.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:58:20 -05:00
Christof Schmitt 86f8a1b4b4 [SCSI] zfcp: Remove UNIT_REGISTERED status flag
Use the device pointer in zfcp_unit for tracking if we have a
registered SCSI device. With this approach, the flag
ZFCP_STATUS_UNIT_REGISTERED is only redundant and can be removed.

Acked-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:58:18 -05:00
Christof Schmitt a5b11dda12 [SCSI] zfcp: Remove some port flags
PORT_PHYS_CLOSING is only set and cleared, but not actually used
for status checking.

PORT_INVALID_WWPN is set when the GID_PN request does not return
a d_id for a remote port, e.g. when a remote port has been
unplugged. For this case, the d_id is zero. In the erp we can
check the d_id and use the normal escalation procedure that gives
up after three retries and remove the special case.

PORT_NO_WWPN is unused: Each port in the remote port list has a
valid wwpn. The WKA ports are now tracked outside the port
list. Remove the PORT_NO_WWPN flag, since this is no longer set
for any port.

Acked-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:58:18 -05:00
Christof Schmitt b98478d71b [SCSI] zfcp: remove DID_DID flag
The port flag DID_DID indicates whether we know the current id of the
port. This is always set in parallel. Since the id 0 is invalid
(because the port id 0 is invalid) we can remove the DID_DID flag:
d_id of 0 indicates an invalid d_id != 0 is a valid one.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Acked-by: Felix Beck <felix@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-29 11:38:28 -06:00
Heiko Carstens 06499fac65 [SCSI] zfcp: fix compile warning
Get rid of this one:

drivers/s390/scsi/zfcp_erp.c: In function 'zfcp_erp_thread':
drivers/s390/scsi/zfcp_erp.c:1400: warning: ignoring return value of
'down_interruptible', declared with attribute warn_unused_result

zfcp_erp_thread is a kernel thread which can't receive any signals.
So introduce a dummy variable and get rid of the warning.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-29 11:24:36 -06:00
Christof Schmitt ecf39d4212 [S390] convert zfcp printks to pr_xxx macros.
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:27 +01:00
Christof Schmitt bd43a42b7e [S390] zfcp: Report microcode level through service level interface
Register zfcp with the new /proc/service_level interface to report the
FCP microcode level. When the adapter goes offline or a channel path
disappears, zfcp unregisters, since the microcode version might change
and zfcp does not know about it.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:01 +01:00
Swen Schillig fca55b6fb5 [SCSI] zfcp: fix deadlock between wq triggered port scan and ERP
Waiting for the ERP to be finished in a task running in the global
kernel work-queue is a bad idea, especially if the ERP needs to run
another job in this work-queue before it can finish. -> deadlock.

This patch removes the necessity to wait for a finished ERP from the
scan task and moves the job scheduling to the end of the ERP.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-01 10:18:04 -06:00
Swen Schillig 26871c97d5 [SCSI] zfcp: verify for correct rport state before scanning for SCSI devs
Prevent a SCSI target scan for a rport which have turned invalid
in the meantime.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-01 10:17:34 -06:00
Martin Petermann 7ea633ffad [SCSI] zfcp: fix erp timeout cleanup for port open requests
If an open port fsf request times out (in erp) the
corresponding erp_action member of the fsf
request need to set to NULL. If the port structure
will be removed later-on there will be still a
reference in the fsf request to the non existing
erp_action otherwise.

Signed-off-by: Martin Petermann <martin.petermann@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:47:40 -05:00
Cornelia Huck b9d3aed7e1 [S390] more bus_id -> dev_name conversions
Some further bus_id -> dev_name() conversions in s390 code.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 21:33:50 +02:00
Swen Schillig 41bfcf9010 [SCSI] zfcp: fix double dbf id usage
Trace ids 107 and 3 are used twice, fix this to have unique ids for
the erp triggers.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-03 12:11:56 -05:00
Swen Schillig 091694a556 [SCSI] zfcp: wait on SCSI work to be finished before proceeding with init dev
Due to the character of a scheduled work we cannot guarantee the
LUN register to be finished before an initial device tries to use it.
Therefor we have to wait for PENDING_SCSI_WORK flag to be cleared
before proceeding.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-03 12:11:56 -05:00
Swen Schillig 9fb3cd86e4 [SCSI] zfcp: fix erp list usage without using locks
The zfcp_erp_thread was using the nolock version of the dbf function.
This resulted in a list access while other tasks could modifying the
list. The symptom was an erp thread running at 100% CPU and never
returning from the dbf function.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-03 12:11:56 -05:00
Swen Schillig e4e9ba5d93 [SCSI] zfcp: prevent fc_remote_port_delete calls for unregistered rport
In case of an adapter reopen all rports have to be deleted from the
environment. This should only happen for already registered rports
otherwise fc_remote_port_delete is called with a NULL pointer.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-03 12:11:55 -05:00
Swen Schillig b7f15f3c94 [SCSI] zfcp: fix deadlock caused by shared work queue tasks
Each adapter reopen trigger automatically a scan_port task which
is waiting for the ERP to be finished before further processing.
Since the initial device setup enqueues adapter, port and LUN which
are individual ERP actions, this process would start after
everything is done. Unfortunately the port_reopen requires another
scheduled work to be finished which is queued after the automatic
scan_port -> deadlock !

This fix creates an own work queue for ERP based nameserver requests.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-03 12:11:55 -05:00
Christof Schmitt 0406289ed5 [SCSI] zfcp: Simplify zfcp data structures
Reduce the size of zfcp data structures by removing unused and
redundant members. scsi_lun is only the mangled version of the
fcp_lun. So, remove the redundant field and use the fcp_lun instead.

Since the queue lock and the pci_batch indicator are only used in the
request queue, move them from the common queue struct to the adapter
struct.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-03 12:11:54 -05:00
Swen Schillig 7ba58c9cc1 [SCSI] zfcp: remove all typedefs and replace them with standards
Remove typedefs from zfcp, use already existing types instead.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-03 12:11:53 -05:00
Swen Schillig 5ab944f97e [SCSI] zfcp: attach and release SAN nameserver port on demand
Changing the zfcp behaviour from always having the nameserver port
open to an on-demand strategy.  This strategy reduces the use of
limited resources like port connections. The patch provides a common
infrastructure which could be used for all WKA ports in future.

Also reduce the number of nameserver lookups by changing the zfcp
behaviour of always querying the nameserver for the corresponding
destination ID of the remote port.  If the destination ID has changed
during the reopen process we will be informed and then trigger a
nameserver query on demand.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-03 12:11:53 -05:00
Swen Schillig 44cc76f2d1 [SCSI] zfcp: remove unused references, declarations and flags
- Remove unused references and declarations, including one instance
   of the FC ls_adisc struct that has been defined twice.
 - Also remove the flags COMMON_OPENING, COMMON_CLOSING,
   ADAPTER_REGISTERED and XPORT_OK that are only set and cleared, but
   not checked anywhere.
 - Remove the zfcp specific atomic_test_mask makro. Simply use
   atomic_read directly instead.
 - Remove the zfcp internal sg helper functions and switch the places
   where it is still used to call sg_virt directly.
 - With the update of the QDIO code, the QDIO data structures no
   longer use the volatile type qualifier. Now we can also remove the
   volatile qualifiers from the zfcp code.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-03 12:11:52 -05:00
Christof Schmitt ff3b24fa53 [SCSI] zfcp: Update message with input from review
Update the kernel messages in zfcp with input from the message review
and remove some messages that have been identified as redundant.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-03 12:11:52 -05:00
Christof Schmitt 287ac01acf [SCSI] zfcp: Cleanup code in zfcp_erp.c
Cleanup the code in zfcp_erp.c, move erp internal definititions to
this file and move FSF timeout handling to the FSF layer.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-07-12 08:22:36 -05:00
Swen Schillig 317e6b6519 [SCSI] zfcp: Cleanup of code in zfcp_aux.c
Overall cleanup of zfcp_aux.c to simplify code and follow kernel
coding style.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-07-12 08:22:35 -05:00
Swen Schillig cc8c282963 [SCSI] zfcp: Automatically attach remote ports
Automatically attach the remote ports in zfcp when the adapter is set
online. This is done by querying all available ports from the FC
namesever. The scan for remote ports is also triggered by RSCNs and
can be triggered manually with the sysfs attribute 'port_rescan'.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-07-12 08:22:26 -05:00
Christof Schmitt 553448f6c4 [SCSI] zfcp: Message cleanup
Cleanup the messages used in the zfcp driver: Remove unnecessary debug
and trace message and convert the remaining messages to standard
kernel macros. Remove the zfcp message macros and while updating the
whole flie also update the copyright headers.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-07-12 08:22:26 -05:00
Swen Schillig 00bab91066 [SCSI] zfcp: Cleanup qdio code
Cleanup the interface code from zfcp to qdio. Also move code that
belongs to the qdio interface from the erp to the qdio file.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-07-12 08:22:25 -05:00
Christof Schmitt 24073b475d [SCSI] zfcp: Move FC code to new file
Move all Fibre Channel related code to new file and cleanup the code
while doing so.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-07-12 08:22:25 -05:00
Christof Schmitt aa0fec6239 [SCSI] zfcp: Fix sparse warning by providing new entry in dbf
drivers/s390/scsi/zfcp_dbf.c:692:2: warning: context imbalance in
'zfcp_rec_dbf_event_thread' - different lock contexts for basic block

Replace the parameter indicating if the lock is held with a new entry
function that only acquires the lock. This makes the lock handling
more visible and removes the sparse warning.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Martin Peschke <mp3@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-06-05 09:27:15 -05:00
Swen Schillig d26ab06ede [SCSI] zfcp: receiving an unsolicted status can lead to I/O stall
Processing of an unsolicted status request can lead to a locking race
of the request_queue's queue_lock during the recreation of the
used up status read request while still in interrupt context
of the response handler.

Detaching the 'refill' of the long running status read requests from
the handler to a scheduled work is solving this issue.

In addition, each refill-run is trying to re-establish the full amount
of status read requests, which might have failed in earlier runs.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-06-05 09:27:13 -05:00
Martin Peschke 1f6f7129eb [SCSI] zfcp: fix 31 bit compile warnings
drivers/s390/scsi/zfcp_aux.c: In function ‘zfcp_fsf_incoming_els_rscn’:
drivers/s390/scsi/zfcp_aux.c:1379: warning: cast from pointer to integer of
different size
drivers/s390/scsi/zfcp_aux.c: In function ‘zfcp_fsf_incoming_els_plogi’:
drivers/s390/scsi/zfcp_aux.c:1432: warning: cast from pointer to integer of
different size
drivers/s390/scsi/zfcp_aux.c: In function ‘zfcp_fsf_incoming_els_logo’:
drivers/s390/scsi/zfcp_aux.c:1457: warning: cast from pointer to integer of
different size
..

Just passing pointers rids us of these warnings and improves readability.

Signed-off-by: Martin Peschke <mp3@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-04-18 11:50:17 -05:00
Martin Peschke 507e49693a [SCSI] zfcp: Remove obsolete erp_dbf trace
This patch removes the now obsolete erp_dbf trace.

Signed-off-by: Martin Peschke <mp3@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-04-07 12:19:05 -05:00
Martin Peschke 6f4f365e9c [SCSI] zfcp: Add trace records for recovery actions.
This patch writes trace records for various phases of a recovery action:
action being created, action being processed, action continueing
asynchronously, action gone, action timed out, action dismissed etc.

Signed-off-by: Martin Peschke <mp3@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-04-07 12:19:05 -05:00
Martin Peschke 9467a9b3ef [SCSI] zfcp: Trace all triggers of error recovery activity
This patch allows any recovery event to be traced back to an exact
cause, e.g. a particular request identified by an id (address).

Signed-off-by: Martin Peschke <mp3@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-04-07 12:19:04 -05:00
Martin Peschke 698ec01635 [SCSI] zfcp: Add traces for state changes.
This patch writes a trace record which provides information about state
changes for adapters, ports and units, e.g. target failure, targets becoming
online, targets being temporarily blocked due to pending recovery, targets
which have been recovered successfully etc.

Signed-off-by: Martin Peschke <mp3@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-04-07 12:19:04 -05:00
Martin Peschke 348447e857 [SCSI] zfcp: Add trace records for recovery thread and its queues
This patch writes trace records which provide information about the
operation of the zfcp error recovery thread and the queues it works
on.

Signed-off-by: Martin Peschke <mp3@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-04-07 12:19:04 -05:00
Joe Perches 5d67d164e6 [S390] drivers/s390/: Spelling fixes
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-01-26 14:11:26 +01:00
Christof Schmitt 3f0ca62add [SCSI] zfcp: Hold queue lock when checking port handle for ELS command
We need to hold the queue-lock when checking whether we still have a valid port
handle for the ELS command, i.e whether we can issue this request for this
port. If the error recovery is about to close this port, then it competes for
the queue-lock. If the close request issued by the error recovery wins, then it
is guaranteed that this port has been blocked for other requests.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Martin Peschke <mp3@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-11 18:29:05 -06:00
Heiko Carstens d1ad09db2f [SCSI] zfcp: fix use after free bug.
zfcp_erp_strategy_check_fsfreq() checks if it is safe to access the
fsf_req associated with the erp_action that gets passed. To test if
it is safe it accesses the fsf_req in order to get its index into
the hash list. This is broken since the fsf_req might be freed already
and the read index has no meaning. It could lead to memory corruption.
Fix this by introducing a new zfcp_reqlist_find_safe() method which
just checks if addresses are equal. This is slower, but only gets
called in case of error recovery.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Martin Peschke <mp3@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-11 18:29:00 -06:00
Christof Schmitt 1de1b43b5f [SCSI] zfcp: Fix deadlock when adding invalid LUN
When adding an invalid LUN, there is a deadlock between the add
via scsi_scan_target and the slave_destroy handler: The handler
waits for the scan to complete, but for an invalid unit,
scsi_scan_target directly calls the slave_destroy handler.

Fix the deadlock by removing the wait in the slave_destroy
handler, it was not necessary anyway.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-11 18:22:40 -06:00
Christof Schmitt 18edcdbdb2 [SCSI] zfcp: Specify waiting times in ERP in seconds
It is not necessary to use jiffies or milliseconds to specify
waiting times that last a couple of seconds.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-11 18:22:40 -06:00
Martin Peschke 86e8dfc560 [SCSI] zfcp: fix cleanup of dismissed error recovery actions
Calling zfcp_erp_strategy_check_action() after zfcp_erp_action_to_running()
in zfcp_erp_strategy() might cause an unbalanced up() for erp_ready_sem,
which makes the zfcp recovery fail somewhere along the way:

erp thread processing erp_action:
|
|	someone waking up erp thread for erp_action
|	|
|	|		someone else dismissing erp_action:
|	|		|
V	V		V

	write_lock_irqsave(&adapter->erp_lock, flags);
	...
	if (zfcp_erp_action_exists(erp_action) == ZFCP_ERP_ACTION_RUNNING) {
		zfcp_erp_action_to_ready(erp_action);
		up(&adapter->erp_ready_sem);	/* first up() for erp_action */
	}
	write_unlock_irqrestore(&adapter->erp_lock, flags);

write_lock_irqsave(&adapter->erp_lock, flags);
...
zfcp_erp_action_to_running(erp_action);
write_unlock_restore(&adapter->erp_lock, flags);
/* processing erp_action */

			write_lock_irqsave(&adapter->erp_lock, flags);
			...
			erp_action->status |= ZFCP_STATUS_ERP_DISMISSED;
			if (zfcp_erp_action_exists(erp_action) ==
						ZFCP_ERP_ACTION_RUNNING) {
				zfcp_erp_action_to_ready(erp_action);
				up(&adapter->erp_ready_sem);
				/* second, unbalanced up() for erp_action */
			}
			...
			write_unlock_restore(&adapter->erp_lock, flags);

write_lock_irqsave(&adapter->erp_lock, flags);
if (erp_action->status & ZFCP_STATUS_ERP_DISMISSED) {
	zfcp_erp_action_dequeue(erp_action);
	retval = ZFCP_ERP_DISMISSED;
}
...
write_unlock_restore(&adapter->erp_lock, flags);
down(&adapter->erp_ready_sem);
/* this down() is meant to balance the first up() */

The erp thread must not dismiss an erp_action after moving that action to
erp_running_head. Instead it should just go through the down() operation,
which balances the first up(), and run through zfcp_erp_strategy one more
time for the second up(), which eventually cleans up erp_action. Which
is similar to the normal processing of an event for erp_action doing
something asynchronously (e.g. waiting for the completion of an fsf_req).

This only works if we make sure that a dismissed erp_action is passed to
zfcp_erp_strategy() prior to the other action, which caused actions to be
dismissed. Therefore the patch implements this rule: running actions go to
the head of the ready list; new actions go to the tail of the ready list;
the erp thread picks actions to be processed from the ready list's head.

Signed-off-by: Martin Peschke <mp3@de.ibm.com>
Acked-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2007-11-16 13:03:21 -06:00
Martin Peschke d0076f7754 [SCSI] zfcp: fix dismissal of error recovery actions
zfcp_erp_action_dismiss() used to ignore any actions in the ready list. This
is a bug. Any action superseded by a stronger action needs to be dismissed.
This patch changes zfcp_erp_action_dismiss() so that it dismisses actions
regardless of their list affiliation. The ERP thread is able to handle this.
It is important to kick the erp thread only for actions in the running list,
though, as an imbalance of wakeup signals would confuse the erp thread
otherwise.

Signed-off-by: Martin Peschke <mp3@de.ibm.com>
Acked-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2007-11-16 13:02:57 -06:00